Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH 1/1] tipc: fix double-free in tipc_buf_append()
From: Lee Jones @ 2026-04-20 14:33 UTC (permalink / raw)
  To: Tung Quang Nguyen
  Cc: Jon Maloy, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev@vger.kernel.org,
	tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
In-Reply-To: <GV1P189MB19886E03BDD0F407B757E5ABC62F2@GV1P189MB1988.EURP189.PROD.OUTLOOK.COM>

On Mon, 20 Apr 2026, Tung Quang Nguyen wrote:

> Subject: [PATCH 1/1] tipc: fix double-free in tipc_buf_append()
> >
> >The tipc_msg_validate() function can potentially reallocate the skb it is
> >validating, freeing the old one.  In tipc_buf_append(), it was being called with a
> >pointer to a local variable which was a copy of the caller's skb pointer.
> >
> >If the skb was reallocated and validation subsequently failed, the error
> >handling path would free the original skb pointer, which had already been
> >freed, leading to double-free.
> >
> >Fix this by passing the caller's skb pointer-pointer directly to
> >tipc_msg_validate(), ensuring any modification is reflected correctly.
> >The local skb pointer is then updated from the (possibly modified) caller's
> >pointer.
> >
> >Fixes: d618d09a68e4 ("tipc: enforce valid ratio between skb truesize and
> >contents")
> >Assisted-by: Gemini:gemini-3.1-pro-preview
> >Signed-off-by: Lee Jones <lee@kernel.org>
> >---
> > net/tipc/msg.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> >diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 76284fc538eb..9f4f612ee027
> >100644
> >--- a/net/tipc/msg.c
> >+++ b/net/tipc/msg.c
> >@@ -177,8 +177,9 @@ int tipc_buf_append(struct sk_buff **headbuf, struct
> >sk_buff **buf)
> >
> > 	if (fragid == LAST_FRAGMENT) {
> > 		TIPC_SKB_CB(head)->validated = 0;
> >-		if (unlikely(!tipc_msg_validate(&head)))
> >+		if (unlikely(!tipc_msg_validate(headbuf)))
> > 			goto err;
> >+		head = *headbuf;
> This is a known issue and was reported via https://patchwork.kernel.org/project/netdevbpf/patch/20260330205313.2433372-1-nicholas@carlini.com/
> The author did not respond to my comment.
> Can you improve the fix by applying my patch?

I'd be happy to make any required changes.

However, is this approach superior to simply passing a reference?

v1 appears to be simpler, easier to read and avoids the explanation.

> diff --git a/net/tipc/msg.c b/net/tipc/msg.c
> index 76284fc538eb..01a693559589 100644
> --- a/net/tipc/msg.c
> +++ b/net/tipc/msg.c
> @@ -177,8 +177,19 @@ int tipc_buf_append(struct sk_buff **headbuf, struct sk_buff **buf)
>  
>         if (fragid == LAST_FRAGMENT) {
>                 TIPC_SKB_CB(head)->validated = 0;
> -               if (unlikely(!tipc_msg_validate(&head)))
> +               if (unlikely(!tipc_msg_validate(&head))) {
> +                       /* reassembled skb has been freed in
> +                        * tipc_msg_validate() because of invalid truesize.
> +                        * head now points to newly-allocated reassembled skb
> +                        * while *headbuf points to freed reassembled skb.
> +                        * So, correct *headbuf for freeing newly-allocated
> +                        * reassembled skb later.
> +                        */
> +                       if (head != *headbuf)
> +                               *headbuf = head;
> +
>                         goto err;
> +               }
>                 *buf = head;
>                 TIPC_SKB_CB(head)->tail = NULL;
>                 *headbuf = NULL;
> > 		*buf = head;
> > 		TIPC_SKB_CB(head)->tail = NULL;
> > 		*headbuf = NULL;
> >--
> >2.54.0.rc1.513.gad8abe7a5a-goog
> >
> 

-- 
Lee Jones [李琼斯]

^ permalink raw reply

* Re: [RFC Patch net-next v1 8/9] r8169: move struct ethtool_ops
From: Andrew Lunn @ 2026-04-20 14:33 UTC (permalink / raw)
  To: javen
  Cc: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
	pabeni, horms, netdev, linux-kernel
In-Reply-To: <20260420021957.1756-9-javen_xu@realsil.com.cn>

On Mon, Apr 20, 2026 at 10:19:56AM +0800, javen wrote:
> From: Javen Xu <javen_xu@realsil.com.cn>
> 
> This patch move struct ethtool_ops, no changes. Prepare for next patch.

Commit messages are all about "Why?". You normally move code because a
local function is after where it will be called from. I assume your
next patch is going to add a function which will be listed in the
ethtool ops, and it calls a function which is currently after the
ethtool ops. Please name names.

    Andrew

---
pw-bot: cr

^ permalink raw reply

* Re: [PATCH v4 net] ax25: fix OOB read after address header strip in ax25_rcv()
From: Simon Horman @ 2026-04-20 14:30 UTC (permalink / raw)
  To: ashutoshdesai993
  Cc: 'Simon Horman', netdev, linux-hams, jreuter, davem,
	edumazet, kuba, pabeni, stable, linux-kernel, david.laight.linux
In-Reply-To: <20260417065407.206499-1-ashutoshdesai993@gmail.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ax25: fix OOB read after address header strip in ax25_rcv()

A crafted AX.25 frame with a valid address header but no control or PID
bytes causes skb->len to drop to zero after skb_pull() strips the
address header. The subsequent reads of skb->data[0] and skb->data[1]
are then out of bounds.

> diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
> index d75b3e9ed93de..d14ccebf9cdd5 100644
> --- a/net/ax25/ax25_in.c
> +++ b/net/ax25/ax25_in.c
> @@ -190,6 +190,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
>  	ax25_cb *ax25;
>  	ax25_dev *ax25_dev;
>  
> +	if (skb_linearize(skb))
> +		goto free;

Could calling skb_linearize() before verifying if the device is configured for
AX.25 introduce a regression?

Since ax25_packet_type is registered globally, ax25_rcv() receives all
ETH_P_AX25 packets from any network interface. skb_linearize() performs
memory allocations to coalesce fragmented packets. By performing this before
checking if ax25_dev is NULL, could an attacker send fragmented ETH_P_AX25
packets to an unconfigured interface to force unnecessary memory allocation
and CPU cycles?

Should the skb_linearize() call be moved after the device configuration check?

> +
>  	/*
>  	 *	Process the AX.25/LAPB frame.
>  	 */
[ ... ]
> @@ -217,6 +220,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
>  	 */
>  	skb_pull(skb, ax25_addr_size(&dp));
>  
> +	if (skb->len < 2)
> +		goto free;

Does this unconditionally drop all AX.25 LAPB control frames?

Standard AX.25 LAPB control frames (such as SABM, DISC, UA, DM, and RR)
consist solely of a 1-byte Control field and no PID byte. For these connection
management frames, skb->len is exactly 1 after the address header is pulled.
By discarding all frames shorter than 2 bytes, could this break the ability
to establish, acknowledge, or disconnect AX.25 connections?

Should the initial check only verify skb->len >= 1, and any requirement for
skb->len >= 2 be strictly scoped to the code paths that actually parse a PID
byte?

Also, if this skb->len < 2 check is relaxed, there is a pre-existing
out-of-bounds read in ax25_decode() for extended modulus frames.

If a malformed 1-byte frame is received on an extended modulus (AX25_EMODULUS)
connection, ax25_decode() unconditionally reads skb->data[1] for I-frames and
S-frames. It then calls skb_pull(skb, 2) but ignores the return value. Should
ax25_decode() validate that skb->len >= 2 before parsing the second control
byte for extended modulus frames?

Finally, there appears to be a pre-existing use-after-free and list corruption
in the ax25_rcv() error path when allocating memory for ax25->digipeat.

When handling an incoming connection request for a listener socket, the code
physically links the skb into the listener socket's receive queue via
skb_queue_head(&sk->sk_receive_queue, skb). Shortly after, it attempts to
allocate memory. If this allocation fails, the error path explicitly calls
kfree_skb(skb).

However, it fails to unlink the skb from sk->sk_receive_queue. (The subsequent
call to ax25_destroy_socket() flushes the newly created socket's queue, not
the listener socket's queue). This leaves a freed skb in the listener socket's
queue. When a user process later calls accept(), will it dequeue and
dereference the freed skb, resulting in a use-after-free?

^ permalink raw reply

* RE: [PATCH 1/1] tipc: fix double-free in tipc_buf_append()
From: Tung Quang Nguyen @ 2026-04-20 14:49 UTC (permalink / raw)
  To: Lee Jones
  Cc: Jon Maloy, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev@vger.kernel.org,
	tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
In-Reply-To: <20260420143309.GD3202366@google.com>

>> Subject: [PATCH 1/1] tipc: fix double-free in tipc_buf_append()
>> >
>> >The tipc_msg_validate() function can potentially reallocate the skb
>> >it is validating, freeing the old one.  In tipc_buf_append(), it was
>> >being called with a pointer to a local variable which was a copy of the
>caller's skb pointer.
>> >
>> >If the skb was reallocated and validation subsequently failed, the
>> >error handling path would free the original skb pointer, which had
>> >already been freed, leading to double-free.
>> >
>> >Fix this by passing the caller's skb pointer-pointer directly to
>> >tipc_msg_validate(), ensuring any modification is reflected correctly.
>> >The local skb pointer is then updated from the (possibly modified)
>> >caller's pointer.
>> >
>> >Fixes: d618d09a68e4 ("tipc: enforce valid ratio between skb truesize
>> >and
>> >contents")
>> >Assisted-by: Gemini:gemini-3.1-pro-preview
>> >Signed-off-by: Lee Jones <lee@kernel.org>
>> >---
>> > net/tipc/msg.c | 3 ++-
>> > 1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/net/tipc/msg.c b/net/tipc/msg.c index
>> >76284fc538eb..9f4f612ee027
>> >100644
>> >--- a/net/tipc/msg.c
>> >+++ b/net/tipc/msg.c
>> >@@ -177,8 +177,9 @@ int tipc_buf_append(struct sk_buff **headbuf,
>> >struct sk_buff **buf)
>> >
>> > 	if (fragid == LAST_FRAGMENT) {
>> > 		TIPC_SKB_CB(head)->validated = 0;
>> >-		if (unlikely(!tipc_msg_validate(&head)))
>> >+		if (unlikely(!tipc_msg_validate(headbuf)))
>> > 			goto err;
>> >+		head = *headbuf;
>> This is a known issue and was reported via
>> https://patchwork.kernel.org/project/netdevbpf/patch/20260330205313.24
>> 33372-1-nicholas@carlini.com/ The author did not respond to my
>> comment.
>> Can you improve the fix by applying my patch?
>
>I'd be happy to make any required changes.
>
>However, is this approach superior to simply passing a reference?
>
>v1 appears to be simpler, easier to read and avoids the explanation.
>
As I explained, your fix adds extra overhead to normal path while the error path is corner case and it rarely happens.
Whatever approach is applied, we need to add explanation to understand more easily the logic and hidden trick in tipc_msg_validate().
>> diff --git a/net/tipc/msg.c b/net/tipc/msg.c index
>> 76284fc538eb..01a693559589 100644
>> --- a/net/tipc/msg.c
>> +++ b/net/tipc/msg.c
>> @@ -177,8 +177,19 @@ int tipc_buf_append(struct sk_buff **headbuf,
>> struct sk_buff **buf)
>>
>>         if (fragid == LAST_FRAGMENT) {
>>                 TIPC_SKB_CB(head)->validated = 0;
>> -               if (unlikely(!tipc_msg_validate(&head)))
>> +               if (unlikely(!tipc_msg_validate(&head))) {
>> +                       /* reassembled skb has been freed in
>> +                        * tipc_msg_validate() because of invalid truesize.
>> +                        * head now points to newly-allocated reassembled skb
>> +                        * while *headbuf points to freed reassembled skb.
>> +                        * So, correct *headbuf for freeing newly-allocated
>> +                        * reassembled skb later.
>> +                        */
>> +                       if (head != *headbuf)
>> +                               *headbuf = head;
>> +
>>                         goto err;
>> +               }
>>                 *buf = head;
>>                 TIPC_SKB_CB(head)->tail = NULL;
>>                 *headbuf = NULL;
>> > 		*buf = head;
>> > 		TIPC_SKB_CB(head)->tail = NULL;
>> > 		*headbuf = NULL;
>> >--
>> >2.54.0.rc1.513.gad8abe7a5a-goog
>> >
>>
>
>--
>Lee Jones [李琼斯]

^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH v5 net-next 0/8] dpll/ice: Add TXC DPLL type and full TX reference clock control for E825
From: Kubalewski, Arkadiusz @ 2026-04-20 14:52 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Vecera, Ivan, vadim.fedorenko@linux.dev, edumazet@google.com,
	netdev@vger.kernel.org, richardcochran@gmail.com,
	donald.hunter@gmail.com, linux-kernel@vger.kernel.org,
	davem@davemloft.net, Prathosh.Satish@microchip.com,
	andrew+netdev@lunn.ch, intel-wired-lan@lists.osuosl.org,
	horms@kernel.org, Kitszel, Przemyslaw, Nguyen, Anthony L,
	pabeni@redhat.com, jiri@resnulli.us
In-Reply-To: <20260418122603.06d12715@kernel.org>

>From: Jakub Kicinski <kuba@kernel.org>
>Sent: Saturday, April 18, 2026 9:26 PM
>
>On Fri, 17 Apr 2026 12:22:05 +0000 Kubalewski, Arkadiusz wrote:
>> >> I was thinking that this is more like a purpose specific DPLL device,
>> >> if
>> >> someone would want something similar we would have to review it,
>> >> right?
>> >
>> >We would if it was a Ethernet MAC PLL, but if someone wanted to expose
>> >whether some random PLL in their ASIC locks - are we adding a new type
>> >for each one of those?
>>
>> Yes, that was the implicit intention within those patches, if other
>> purpose
>> specific PLL would have to be present for whatever HW design and user
>> control over it would be required, then that would be the easiest to
>> maintain in the long term? Multiple types and each have own
>> function/purpose.
>>
>> It would be good as long as there is one PLL for a function per board,
>> once
>> there could be multiple ones for single function, we would have to add
>> some
>> enumeration (labels, etc.)
>
>Defer on adding identifiers. User knows which driver and bus device
>spawned the pll and more importantly what the pin topology is.
>Naming in the kernel is rarely a good idea.

Sure.

>
>> >> It depends, TX clock has one of external pins connected to external
>> >> DPLL,
>> >> but second is a board-level pin with ability to provide some external
>> >> clock signal, the user would have to determine that purpose just
>> >> based
>> >> on the topology of one of the pins, which seems a bit problematic?
>> >> I.e. if at some point there would be HW with only external non-DPLL
>> >> connected pins?
>> >
>> >Not sure I follow, TBH. To me the function of the "MAC PLL" is fairly
>> >obvious from the fact that it has a pin exposed via rtnetlink. So it's
>> >obviously a DPLL which can drive the Tx clock?
>>
>> I am lost a bit now too. You mean clock recovery pin? And EEC type dpll?
>> In this solution the 'MAC'/EEC is external and it doesn't drive TX
>> clocks
>> directly.
>
>MAC == "tspll" == TXC in this series. On Grzegorz's diagram the new PLL
>was in the MAC, which makes sense since it's a pll in the same ASIC as
>the MAC.
>

We wanted the TSPLL from the picture to be PPS type as it drives the PHC
timer within the MAC.

>I'm saying that the function of that pll is obvious since its pin will
>plug into the netdev / rtnetlink.
>

Yeah I got it, just saying it will work for now :)

>> >It's the function / relation / linking to the EEC DPLL that may not
>> >be obvious. But user can see how the pins connect they can get some
>> >LLM to draw a diagram of a live system.. et voila :)
>>
>> Yes, correct it would work for this particular HW, but adding a variant
>> without a external EEC-connected pin in the picture would be problematic
>> to understand 'generic' dpll purpose, pointing to the labels later.
>
>The function of the "MAC/tspll" is still obvious. The clarity of the
>external PLL is not helped by naming the "MAC/tspll".
>
>> Just to make it clear. I believe that generic type dpll could be used in
>> any HW and for any purpose, so after all each such usage could possibly
>> introduce entropy and confusion on the user side.
>>
>> But if you are fine with that, then sure, we can live with generic
>> purpose dpll.
>
>Considering all the imperfect options - generic / unnamed type would be
>my preference.

Ok, sounds good.

Thank you!
Arkadiusz

^ permalink raw reply

* [PATCH net-next] net: hns: use u32 for register offset in RCB TX coalescing
From: Agalakov Daniil @ 2026-04-20 14:40 UTC (permalink / raw)
  To: Jian Shen
  Cc: Agalakov Daniil, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel, lvc-project,
	Roman Razov

The local variable reg in hns_rcb_get_tx_coalesced_frames() and
hns_rcb_set_tx_coalesced_frames() holds a register offset passed to
dsaf_read_dev()/dsaf_write_dev(). Register offsets on this hardware
are 32-bit values; using u64 was misleading.

Change the type to u32, consistent with the hardware.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Agalakov Daniil <ade@amicon.ru>
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
index 635b3a95dd82..3c4e4e8ca140 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
@@ -563,7 +563,7 @@ u32 hns_rcb_get_rx_coalesced_frames(
 u32 hns_rcb_get_tx_coalesced_frames(
 	struct rcb_common_cb *rcb_common, u32 port_idx)
 {
-	u64 reg;
+	u32 reg;
 
 	reg = RCB_CFG_PKTLINE_REG + (port_idx + HNS_RCB_TX_PKTLINE_OFFSET) * 4;
 	return dsaf_read_dev(rcb_common, reg);
@@ -634,7 +634,7 @@ int hns_rcb_set_tx_coalesced_frames(
 {
 	u32 old_waterline =
 		hns_rcb_get_tx_coalesced_frames(rcb_common, port_idx);
-	u64 reg;
+	u32 reg;
 
 	if (coalesced_frames == old_waterline)
 		return 0;
-- 
2.51.0


^ permalink raw reply related

* [PATCH net 0/4] rxrpc: Miscellaneous fixes
From: David Howells @ 2026-04-20 14:58 UTC (permalink / raw)
  To: netdev
  Cc: David Howells, Marc Dionne, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel

Here are some fixes for rxrpc, as found by Sashiko[1]:

 (1) Fix leaks in rxkad_verify_response().

 (2) Fix lack of unsharing of RESPONSE packets.

 (3) Fix integer overflow in RxGK ticket length check.

 (4) Fix handling of rxkad-encrypted packets with crypto-misaligned
     lengths.

David

The patches can be found here also:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes

Link: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com [1]

David Howells (4):
  rxrpc: Fix memory leaks in rxkad_verify_response()
  rxrpc: Fix conn-level packet handling to unshare RESPONSE packets
  rxgk: Fix potential integer overflow in length check
  rxrpc: Fix rxkad crypto unalignment handling

 include/linux/key.h          |   2 +
 include/trace/events/rxrpc.h |   1 +
 net/rxrpc/ar-internal.h      |   2 +-
 net/rxrpc/conn_event.c       |  12 ++-
 net/rxrpc/io_thread.c        |  15 +---
 net/rxrpc/rxgk_app.c         |   2 +-
 net/rxrpc/rxgk_common.h      |   1 +
 net/rxrpc/rxkad.c            | 142 +++++++++++++----------------------
 net/rxrpc/skbuff.c           |  26 ++++++-
 9 files changed, 94 insertions(+), 109 deletions(-)


^ permalink raw reply

* [PATCH net 1/4] rxrpc: Fix memory leaks in rxkad_verify_response()
From: David Howells @ 2026-04-20 14:58 UTC (permalink / raw)
  To: netdev
  Cc: David Howells, Marc Dionne, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
	Jeffrey Altman, stable
In-Reply-To: <20260420145900.1223732-1-dhowells@redhat.com>

Fix rxkad_verify_response() to free ticket by using a __free() construct
rather than explicitly freeing it.

Also fix rxkad_verify_response() to free the server key by using a __free()
construct.

Fixes: 57af281e5389 ("rxrpc: Tidy up abort generation infrastructure")
Fixes: ec832bd06d6f ("rxrpc: Don't retain the server key in the connection")
Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
cc: stable@kernel.org
---
 include/linux/key.h |   2 +
 net/rxrpc/rxkad.c   | 133 +++++++++++++++-----------------------------
 2 files changed, 48 insertions(+), 87 deletions(-)

diff --git a/include/linux/key.h b/include/linux/key.h
index 81b8f05c6898..1cafbc3827c2 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -304,6 +304,8 @@ extern void key_put(struct key *key);
 extern bool key_put_tag(struct key_tag *tag);
 extern void key_remove_domain(struct key_tag *domain_tag);
 
+DEFINE_FREE(key_put, struct key *, if (!IS_ERR(_T)) key_put(_T))
+
 static inline struct key *__key_get(struct key *key)
 {
 	refcount_inc(&key->usage);
diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index eb7f2769d2b1..0acdc46f42c2 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -1131,21 +1131,20 @@ static int rxkad_decrypt_response(struct rxrpc_connection *conn,
 static int rxkad_verify_response(struct rxrpc_connection *conn,
 				 struct sk_buff *skb)
 {
-	struct rxkad_response *response;
 	struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
 	struct rxrpc_crypt session_key;
-	struct key *server_key;
 	time64_t expiry;
-	void *ticket;
 	u32 version, kvno, ticket_len, level;
 	__be32 csum;
 	int ret, i;
 
 	_enter("{%d}", conn->debug_id);
 
-	server_key = rxrpc_look_up_server_security(conn, skb, 0, 0);
+	struct key *server_key __free(key_put) =
+		rxrpc_look_up_server_security(conn, skb, 0, 0);
 	if (IS_ERR(server_key)) {
 		ret = PTR_ERR(server_key);
+		server_key = NULL;
 		switch (ret) {
 		case -ENOKEY:
 			return rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, ret,
@@ -1160,16 +1159,15 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
 	}
 
 	ret = -ENOMEM;
-	response = kzalloc_obj(struct rxkad_response, GFP_NOFS);
+	struct rxkad_response *response __free(kfree) =
+		kzalloc_obj(struct rxkad_response, GFP_NOFS);
 	if (!response)
 		goto temporary_error;
 
 	if (skb_copy_bits(skb, sizeof(struct rxrpc_wire_header),
-			  response, sizeof(*response)) < 0) {
-		rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
-				 rxkad_abort_resp_short);
-		goto protocol_error;
-	}
+			  response, sizeof(*response)) < 0)
+		return rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
+					rxkad_abort_resp_short);
 
 	version = ntohl(response->version);
 	ticket_len = ntohl(response->ticket_len);
@@ -1177,103 +1175,79 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
 
 	trace_rxrpc_rx_response(conn, sp->hdr.serial, version, kvno, ticket_len);
 
-	if (version != RXKAD_VERSION) {
-		rxrpc_abort_conn(conn, skb, RXKADINCONSISTENCY, -EPROTO,
-				 rxkad_abort_resp_version);
-		goto protocol_error;
-	}
+	if (version != RXKAD_VERSION)
+		return rxrpc_abort_conn(conn, skb, RXKADINCONSISTENCY, -EPROTO,
+					rxkad_abort_resp_version);
 
-	if (ticket_len < 4 || ticket_len > MAXKRB5TICKETLEN) {
-		rxrpc_abort_conn(conn, skb, RXKADTICKETLEN, -EPROTO,
-				 rxkad_abort_resp_tkt_len);
-		goto protocol_error;
-	}
+	if (ticket_len < 4 || ticket_len > MAXKRB5TICKETLEN)
+		return rxrpc_abort_conn(conn, skb, RXKADTICKETLEN, -EPROTO,
+					rxkad_abort_resp_tkt_len);
 
-	if (kvno >= RXKAD_TKT_TYPE_KERBEROS_V5) {
-		rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, -EPROTO,
-				 rxkad_abort_resp_unknown_tkt);
-		goto protocol_error;
-	}
+	if (kvno >= RXKAD_TKT_TYPE_KERBEROS_V5)
+		return rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, -EPROTO,
+					rxkad_abort_resp_unknown_tkt);
 
 	/* extract the kerberos ticket and decrypt and decode it */
 	ret = -ENOMEM;
-	ticket = kmalloc(ticket_len, GFP_NOFS);
+	void *ticket __free(kfree) = kmalloc(ticket_len, GFP_NOFS);
 	if (!ticket)
-		goto temporary_error_free_resp;
+		goto temporary_error;
 
 	if (skb_copy_bits(skb, sizeof(struct rxrpc_wire_header) + sizeof(*response),
-			  ticket, ticket_len) < 0) {
-		rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
-				 rxkad_abort_resp_short_tkt);
-		goto protocol_error;
-	}
+			  ticket, ticket_len) < 0)
+		return rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
+					rxkad_abort_resp_short_tkt);
 
 	ret = rxkad_decrypt_ticket(conn, server_key, skb, ticket, ticket_len,
 				   &session_key, &expiry);
 	if (ret < 0)
-		goto temporary_error_free_ticket;
+		goto temporary_error;
 
 	/* use the session key from inside the ticket to decrypt the
 	 * response */
 	ret = rxkad_decrypt_response(conn, response, &session_key);
 	if (ret < 0)
-		goto temporary_error_free_ticket;
+		goto temporary_error;
 
 	if (ntohl(response->encrypted.epoch) != conn->proto.epoch ||
 	    ntohl(response->encrypted.cid) != conn->proto.cid ||
-	    ntohl(response->encrypted.securityIndex) != conn->security_ix) {
-		rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
-				 rxkad_abort_resp_bad_param);
-		goto protocol_error_free;
-	}
+	    ntohl(response->encrypted.securityIndex) != conn->security_ix)
+		return rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
+					rxkad_abort_resp_bad_param);
 
 	csum = response->encrypted.checksum;
 	response->encrypted.checksum = 0;
 	rxkad_calc_response_checksum(response);
-	if (response->encrypted.checksum != csum) {
-		rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
-				 rxkad_abort_resp_bad_checksum);
-		goto protocol_error_free;
-	}
+	if (response->encrypted.checksum != csum)
+		return rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
+					rxkad_abort_resp_bad_checksum);
 
 	for (i = 0; i < RXRPC_MAXCALLS; i++) {
 		u32 call_id = ntohl(response->encrypted.call_id[i]);
 		u32 counter = READ_ONCE(conn->channels[i].call_counter);
 
-		if (call_id > INT_MAX) {
-			rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
-					 rxkad_abort_resp_bad_callid);
-			goto protocol_error_free;
-		}
-
-		if (call_id < counter) {
-			rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
-					 rxkad_abort_resp_call_ctr);
-			goto protocol_error_free;
-		}
-
+		if (call_id > INT_MAX)
+			return rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
+						rxkad_abort_resp_bad_callid);
+		if (call_id < counter)
+			return rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
+						rxkad_abort_resp_call_ctr);
 		if (call_id > counter) {
-			if (conn->channels[i].call) {
-				rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
-						 rxkad_abort_resp_call_state);
-				goto protocol_error_free;
-			}
+			if (conn->channels[i].call)
+				return rxrpc_abort_conn(conn, skb, RXKADSEALEDINCON, -EPROTO,
+							rxkad_abort_resp_call_state);
 			conn->channels[i].call_counter = call_id;
 		}
 	}
 
-	if (ntohl(response->encrypted.inc_nonce) != conn->rxkad.nonce + 1) {
-		rxrpc_abort_conn(conn, skb, RXKADOUTOFSEQUENCE, -EPROTO,
-				 rxkad_abort_resp_ooseq);
-		goto protocol_error_free;
-	}
+	if (ntohl(response->encrypted.inc_nonce) != conn->rxkad.nonce + 1)
+		return rxrpc_abort_conn(conn, skb, RXKADOUTOFSEQUENCE, -EPROTO,
+					rxkad_abort_resp_ooseq);
 
 	level = ntohl(response->encrypted.level);
-	if (level > RXRPC_SECURITY_ENCRYPT) {
-		rxrpc_abort_conn(conn, skb, RXKADLEVELFAIL, -EPROTO,
-				 rxkad_abort_resp_level);
-		goto protocol_error_free;
-	}
+	if (level > RXRPC_SECURITY_ENCRYPT)
+		return rxrpc_abort_conn(conn, skb, RXKADLEVELFAIL, -EPROTO,
+					rxkad_abort_resp_level);
 	conn->security_level = level;
 
 	/* create a key to hold the security data and expiration time - after
@@ -1281,30 +1255,15 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
 	 * as for a client connection */
 	ret = rxrpc_get_server_data_key(conn, &session_key, expiry, kvno);
 	if (ret < 0)
-		goto temporary_error_free_ticket;
-
-	kfree(ticket);
-	kfree(response);
+		goto temporary_error;
 	_leave(" = 0");
 	return 0;
 
-protocol_error_free:
-	kfree(ticket);
-protocol_error:
-	kfree(response);
-	key_put(server_key);
-	return -EPROTO;
-
-temporary_error_free_ticket:
-	kfree(ticket);
-temporary_error_free_resp:
-	kfree(response);
 temporary_error:
 	/* Ignore the response packet if we got a temporary error such as
 	 * ENOMEM.  We just want to send the challenge again.  Note that we
 	 * also come out this way if the ticket decryption fails.
 	 */
-	key_put(server_key);
 	return ret;
 }
 


^ permalink raw reply related

* [PATCH net 2/4] rxrpc: Fix conn-level packet handling to unshare RESPONSE packets
From: David Howells @ 2026-04-20 14:58 UTC (permalink / raw)
  To: netdev
  Cc: David Howells, Marc Dionne, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
	Jeffrey Altman, stable
In-Reply-To: <20260420145900.1223732-1-dhowells@redhat.com>

The security operations that verify the RESPONSE packets decrypt bits of it
in place - however, the sk_buff may be shared with a packet sniffer, which
would lead to the sniffer seeing an apparently corrupt packet (actually
decrypted).

Fix this by unsharing the skbuff before handing it off to the specific
security handler.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
cc: stable@kernel.org
---
 net/rxrpc/ar-internal.h |  2 +-
 net/rxrpc/conn_event.c  | 12 ++++++++++--
 net/rxrpc/io_thread.c   | 15 +++------------
 net/rxrpc/skbuff.c      | 26 ++++++++++++++++++++++----
 4 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 96ecb83c9071..fb04d2ffdb27 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1486,7 +1486,7 @@ int rxrpc_server_keyring(struct rxrpc_sock *, sockptr_t, int);
 void rxrpc_kernel_data_consumed(struct rxrpc_call *, struct sk_buff *);
 void rxrpc_new_skb(struct sk_buff *, enum rxrpc_skb_trace);
 void rxrpc_see_skb(struct sk_buff *, enum rxrpc_skb_trace);
-void rxrpc_eaten_skb(struct sk_buff *, enum rxrpc_skb_trace);
+struct sk_buff *rxrpc_unshare_skb(struct sk_buff **_old, gfp_t gfp);
 void rxrpc_get_skb(struct sk_buff *, enum rxrpc_skb_trace);
 void rxrpc_free_skb(struct sk_buff *, enum rxrpc_skb_trace);
 void rxrpc_purge_queue(struct sk_buff_head *);
diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c
index 9a41ec708aeb..3d56a5d23369 100644
--- a/net/rxrpc/conn_event.c
+++ b/net/rxrpc/conn_event.c
@@ -244,8 +244,9 @@ static void rxrpc_call_is_secure(struct rxrpc_call *call)
  * connection-level Rx packet processor
  */
 static int rxrpc_process_event(struct rxrpc_connection *conn,
-			       struct sk_buff *skb)
+			       struct sk_buff **_skb)
 {
+	struct sk_buff *skb = *_skb;
 	struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
 	bool secured = false;
 	int ret;
@@ -270,6 +271,13 @@ static int rxrpc_process_event(struct rxrpc_connection *conn,
 		}
 		spin_unlock_irq(&conn->state_lock);
 
+		skb = rxrpc_unshare_skb(_skb, GFP_NOFS);
+		if (!skb)
+			return -ENOMEM;
+
+		/* If unshared, skb will have changed. */
+		sp = rxrpc_skb(skb);
+
 		ret = conn->security->verify_response(conn, skb);
 		if (ret < 0)
 			return ret;
@@ -371,7 +379,7 @@ static void rxrpc_do_process_connection(struct rxrpc_connection *conn)
 	 * connection that each one has when we've finished with it */
 	while ((skb = skb_dequeue(&conn->rx_queue))) {
 		rxrpc_see_skb(skb, rxrpc_skb_see_conn_work);
-		ret = rxrpc_process_event(conn, skb);
+		ret = rxrpc_process_event(conn, &skb);
 		switch (ret) {
 		case -ENOMEM:
 		case -EAGAIN:
diff --git a/net/rxrpc/io_thread.c b/net/rxrpc/io_thread.c
index 697956931925..0592ce644fc3 100644
--- a/net/rxrpc/io_thread.c
+++ b/net/rxrpc/io_thread.c
@@ -249,19 +249,10 @@ static bool rxrpc_input_packet(struct rxrpc_local *local, struct sk_buff **_skb)
 		 * decryption.
 		 */
 		if (sp->hdr.securityIndex != 0) {
-			skb = skb_unshare(skb, GFP_ATOMIC);
-			if (!skb) {
-				rxrpc_eaten_skb(*_skb, rxrpc_skb_eaten_by_unshare_nomem);
-				*_skb = NULL;
+			skb = rxrpc_unshare_skb(_skb, GFP_ATOMIC);
+			if (!skb)
 				return just_discard;
-			}
-
-			if (skb != *_skb) {
-				rxrpc_eaten_skb(*_skb, rxrpc_skb_eaten_by_unshare);
-				*_skb = skb;
-				rxrpc_new_skb(skb, rxrpc_skb_new_unshared);
-				sp = rxrpc_skb(skb);
-			}
+			sp = rxrpc_skb(skb);
 		}
 		break;
 
diff --git a/net/rxrpc/skbuff.c b/net/rxrpc/skbuff.c
index 3bcd6ee80396..0dca9ca163f1 100644
--- a/net/rxrpc/skbuff.c
+++ b/net/rxrpc/skbuff.c
@@ -47,12 +47,30 @@ void rxrpc_get_skb(struct sk_buff *skb, enum rxrpc_skb_trace why)
 }
 
 /*
- * Note the dropping of a ref on a socket buffer by the core.
+ * Do the unsharing of a socket buffer, noting the event in the traces.
  */
-void rxrpc_eaten_skb(struct sk_buff *skb, enum rxrpc_skb_trace why)
+struct sk_buff *rxrpc_unshare_skb(struct sk_buff **_old, gfp_t gfp)
 {
-	int n = atomic_inc_return(&rxrpc_n_rx_skbs);
-	trace_rxrpc_skb(skb, 0, n, why);
+	struct sk_buff *skb, *old = *_old;
+	int n, r = refcount_read(&old->users);
+
+	skb = skb_unshare(old, gfp);
+	if (!skb) {
+		n = atomic_dec_return(&rxrpc_n_rx_skbs);
+		trace_rxrpc_skb(old, r, n, rxrpc_skb_eaten_by_unshare_nomem);
+		*_old = NULL;
+		return skb;
+	}
+
+	if (skb != old) {
+		n = atomic_read(&rxrpc_n_rx_skbs);
+		trace_rxrpc_skb(old, r, n, rxrpc_skb_eaten_by_unshare);
+		trace_rxrpc_skb(skb, refcount_read(&skb->users), n,
+				rxrpc_skb_new_unshared);
+		*_old = skb;
+	}
+
+	return skb;
 }
 
 /*


^ permalink raw reply related

* [PATCH net 3/4] rxgk: Fix potential integer overflow in length check
From: David Howells @ 2026-04-20 14:58 UTC (permalink / raw)
  To: netdev
  Cc: David Howells, Marc Dionne, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
	Jeffrey Altman, stable
In-Reply-To: <20260420145900.1223732-1-dhowells@redhat.com>

Fix potential integer overflow in rxgk_extract_token() when checking the
length of the ticket.  Rather than rounding up the value to be tested
(which might overflow), round down the size of the available data.

Fixes: 2429a1976481 ("rxrpc: Fix untrusted unsigned subtract")
Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
cc: stable@kernel.org
---
 net/rxrpc/rxgk_app.c    | 2 +-
 net/rxrpc/rxgk_common.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rxrpc/rxgk_app.c b/net/rxrpc/rxgk_app.c
index 30275cb5ba3e..5587639d60c5 100644
--- a/net/rxrpc/rxgk_app.c
+++ b/net/rxrpc/rxgk_app.c
@@ -214,7 +214,7 @@ int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
 	ticket_len	= ntohl(container.token_len);
 	ticket_offset	= token_offset + sizeof(container);
 
-	if (xdr_round_up(ticket_len) > token_len - sizeof(container))
+	if (ticket_len > xdr_round_down(token_len - sizeof(container)))
 		goto short_packet;
 
 	_debug("KVNO %u", kvno);
diff --git a/net/rxrpc/rxgk_common.h b/net/rxrpc/rxgk_common.h
index 80164d89e19c..1e257d7ab8ec 100644
--- a/net/rxrpc/rxgk_common.h
+++ b/net/rxrpc/rxgk_common.h
@@ -34,6 +34,7 @@ struct rxgk_context {
 };
 
 #define xdr_round_up(x) (round_up((x), sizeof(__be32)))
+#define xdr_round_down(x) (round_down((x), sizeof(__be32)))
 #define xdr_object_len(x) (4 + xdr_round_up(x))
 
 /*


^ permalink raw reply related

* [PATCH net 4/4] rxrpc: Fix rxkad crypto unalignment handling
From: David Howells @ 2026-04-20 14:58 UTC (permalink / raw)
  To: netdev
  Cc: David Howells, Marc Dionne, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
	Jeffrey Altman, stable
In-Reply-To: <20260420145900.1223732-1-dhowells@redhat.com>

Fix handling of a packet with a misaligned crypto length.  Also handle
non-ENOMEM errors from decryption by aborting.  Further, remove the
WARN_ON_ONCE() so that it can't be remotely triggered (a trace line can
still be emitted).

Fixes: f93af41b9f5f ("rxrpc: Fix missing error checks for rxkad encryption/decryption failure")
Closes: https://sashiko.dev/#/patchset/20260408121252.2249051-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
cc: stable@kernel.org
---
 include/trace/events/rxrpc.h | 1 +
 net/rxrpc/rxkad.c            | 9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
index 578b8038b211..5820d7e41ea0 100644
--- a/include/trace/events/rxrpc.h
+++ b/include/trace/events/rxrpc.h
@@ -37,6 +37,7 @@
 	EM(rxkad_abort_1_short_encdata,		"rxkad1-short-encdata")	\
 	EM(rxkad_abort_1_short_header,		"rxkad1-short-hdr")	\
 	EM(rxkad_abort_2_short_check,		"rxkad2-short-check")	\
+	EM(rxkad_abort_2_crypto_unaligned,	"rxkad2-crypto-unaligned") \
 	EM(rxkad_abort_2_short_data,		"rxkad2-short-data")	\
 	EM(rxkad_abort_2_short_header,		"rxkad2-short-hdr")	\
 	EM(rxkad_abort_2_short_len,		"rxkad2-short-len")	\
diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index 0acdc46f42c2..2c4697063ab2 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -510,6 +510,9 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
 		return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
 					  rxkad_abort_2_short_header);
 
+	/* Don't let the crypto algo see a misaligned length. */
+	sp->len = round_down(sp->len, 8);
+
 	/* Decrypt the skbuff in-place.  TODO: We really want to decrypt
 	 * directly into the target buffer.
 	 */
@@ -543,8 +546,10 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
 	if (sg != _sg)
 		kfree(sg);
 	if (ret < 0) {
-		WARN_ON_ONCE(ret != -ENOMEM);
-		return ret;
+		if (ret == -ENOMEM)
+			return ret;
+		return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
+					  rxkad_abort_2_crypto_unaligned);
 	}
 
 	/* Extract the decrypted packet length */


^ permalink raw reply related

* Re: [PATCH 1/1] tipc: fix double-free in tipc_buf_append()
From: Lee Jones @ 2026-04-20 15:10 UTC (permalink / raw)
  To: Tung Quang Nguyen
  Cc: Jon Maloy, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev@vger.kernel.org,
	tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
In-Reply-To: <GV1P189MB198853F316580DDA547F4925C62F2@GV1P189MB1988.EURP189.PROD.OUTLOOK.COM>

On Mon, 20 Apr 2026, Tung Quang Nguyen wrote:

> >> Subject: [PATCH 1/1] tipc: fix double-free in tipc_buf_append()
> >> >
> >> >The tipc_msg_validate() function can potentially reallocate the skb
> >> >it is validating, freeing the old one.  In tipc_buf_append(), it was
> >> >being called with a pointer to a local variable which was a copy of the
> >caller's skb pointer.
> >> >
> >> >If the skb was reallocated and validation subsequently failed, the
> >> >error handling path would free the original skb pointer, which had
> >> >already been freed, leading to double-free.
> >> >
> >> >Fix this by passing the caller's skb pointer-pointer directly to
> >> >tipc_msg_validate(), ensuring any modification is reflected correctly.
> >> >The local skb pointer is then updated from the (possibly modified)
> >> >caller's pointer.
> >> >
> >> >Fixes: d618d09a68e4 ("tipc: enforce valid ratio between skb truesize
> >> >and
> >> >contents")
> >> >Assisted-by: Gemini:gemini-3.1-pro-preview
> >> >Signed-off-by: Lee Jones <lee@kernel.org>
> >> >---
> >> > net/tipc/msg.c | 3 ++-
> >> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >> >
> >> >diff --git a/net/tipc/msg.c b/net/tipc/msg.c index
> >> >76284fc538eb..9f4f612ee027
> >> >100644
> >> >--- a/net/tipc/msg.c
> >> >+++ b/net/tipc/msg.c
> >> >@@ -177,8 +177,9 @@ int tipc_buf_append(struct sk_buff **headbuf,
> >> >struct sk_buff **buf)
> >> >
> >> > 	if (fragid == LAST_FRAGMENT) {
> >> > 		TIPC_SKB_CB(head)->validated = 0;
> >> >-		if (unlikely(!tipc_msg_validate(&head)))
> >> >+		if (unlikely(!tipc_msg_validate(headbuf)))
> >> > 			goto err;
> >> >+		head = *headbuf;
> >> This is a known issue and was reported via
> >> https://patchwork.kernel.org/project/netdevbpf/patch/20260330205313.24
> >> 33372-1-nicholas@carlini.com/ The author did not respond to my
> >> comment.
> >> Can you improve the fix by applying my patch?
> >
> >I'd be happy to make any required changes.
> >
> >However, is this approach superior to simply passing a reference?
> >
> >v1 appears to be simpler, easier to read and avoids the explanation.
> >
> As I explained, your fix adds extra overhead to normal path while the error path is corner case and it rarely happens.
> Whatever approach is applied, we need to add explanation to understand more easily the logic and hidden trick in tipc_msg_validate().

Very well.  I have made the recommended changes.

The patch is currently in my build-test environment.

I will post v2, when everything has been satisfied.

-- 
Lee Jones [李琼斯]

^ permalink raw reply

* Re: [PATCH net 1/4] xsk: avoid skb leak in XDP_TX_METADATA case
From: Stanislav Fomichev @ 2026-04-20 15:22 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
	maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
	john.fastabend, horms, andrew+netdev, bpf, netdev, Jason Xing
In-Reply-To: <20260418045644.28612-2-kerneljasonxing@gmail.com>

On 04/18, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> Fix it by explicitly adding kfree_skb() before returning back to its
> caller.
> 
> How to reproduce it in virtio_net:
> 1. the current skb is the first one (which means no frag and xs->skb is
>    NULL) and users enable metadata feature.
> 2. xsk_skb_metadata() returns a error code.
> 3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'.
> 4. there is no chance to free this skb anymore.
> 
> Closes: https://lore.kernel.org/all/20260415085204.3F87AC19424@smtp.kernel.org/
> Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
> Signed-off-by: Jason Xing <kernelxing@tencent.com>

Acked-by: Stanislav Fomichev <sdf@fomichev.me>

^ permalink raw reply

* Re: [PATCH net v2] selftests: netfilter: conntrack_sctp_collision.sh: Introduce SCTP INIT collision test
From: Jakub Kicinski @ 2026-04-20 15:23 UTC (permalink / raw)
  To: Yi Chen
  Cc: Pablo Neira Ayuso, Florian Westphal, Phil Sutter, Long Xin,
	David S . Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Shuah Khan, coreteam, netfilter-devel, linux-kselftest,
	linux-kernel, netdev
In-Reply-To: <20260418195843.303946-1-yiche.cy@gmail.com>

On Sun, 19 Apr 2026 03:58:43 +0800 Yi Chen wrote:
> The existing test covered a scenario where a delayed INIT_ACK chunk
> updates the vtag in conntrack after the association has already been
> established.

AI says:

The conntrack_sctp_collision.sh selftest is now failing in the NIPA CI on
both the normal and debug kernel builds:

  not ok 1 1 selftests: net/netfilter: conntrack_sctp_collision.sh # exit=1

  # Test for SCTP INIT_ACK Collision in nf_conntrack:
  # Invalid netns name ""
  # Invalid netns name ""

The root cause is a shell variable scoping bug introduced by this patch.
The new test structure wraps `topo_setup` in a subshell:

  (topo_setup && conf_delay $SERVER_NS link0 2) || exit $?
  if ! do_test; then
      ...
  fi

`topo_setup` calls `setup_ns CLIENT_NS SERVER_NS ROUTER_NS`, which sets
those variables inside the subshell. Those assignments do not propagate
back to the parent shell, so when `do_test` is called afterwards, both
`$SERVER_NS` and `$CLIENT_NS` expand to empty strings. The `ip net exec ""`
calls then fail with "Invalid netns name """.

The second test case (SCTP INIT Collision) would have the same problem.

The fix is to avoid the subshell or ensure the namespace variables are
visible to `do_test`. The simplest approach is to remove the subshell
wrapping and call `topo_setup`, `conf_delay`, and `do_test` in the same
shell scope:

  topo_setup && conf_delay "$SERVER_NS" link0 2 || exit $?
  if ! do_test; then
      exit $ksft_fail
  fi

  topo_setup && conf_delay "$CLIENT_NS" link3 1 || exit $?
  if ! do_test; then
      exit $ksft_fail
  fi

Please also note that `conf_delay` references `$ROUTER_NS` directly
(not via a parameter), so it too requires that those variables be set
in the same shell scope.

^ permalink raw reply

* Re: [PATCH net 0/2] tcp: symmetric challenge ACK for SEG.ACK > SND.NXT
From: Jakub Kicinski @ 2026-04-20 15:24 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
	David S. Miller, David Ahern, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260420025428.101192-1-jiayuan.chen@linux.dev>

On Mon, 20 Apr 2026 10:54:07 +0800 Jiayuan Chen wrote:
> Commit 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack
> Mitigation") quotes RFC 5961 Section 5.2 in full, which requires
> that any incoming segment whose ACK value falls outside
> [SND.UNA - MAX.SND.WND, SND.NXT] MUST be discarded and an ACK sent
> back.  Linux currently sends that challenge ACK only on the lower
> edge (SEG.ACK < SND.UNA - MAX.SND.WND); on the symmetric upper edge
> (SEG.ACK > SND.NXT) the segment is silently dropped with
> SKB_DROP_REASON_TCP_ACK_UNSENT_DATA.
> 
> Patch 1 completes the mitigation by emitting a rate-limited challenge
> ACK on that branch, reusing tcp_send_challenge_ack() and honouring
> FLAG_NO_CHALLENGE_ACK for consistency with the lower-edge case.
> 
> Patch 2 adds a packetdrill selftest under
> tools/testing/selftests/net/packetdrill/ that verifies the new
> behaviour.

AI says:

Your patch "tcp: send a challenge ACK on SEG.ACK > SND.NXT" breaks an
existing packetdrill selftest:

  selftests/net/packetdrill/tcp_ts_recent_invalid_ack.pkt

Test output:
  tcp_ts_recent_invalid_ack.pkt:25: error handling packet:
  live packet field tcp_ack_seq: expected: 1001 (0x3e9) vs actual: 1 (0x1)
  script packet:  0.200125 . 1:1(0) ack 1001 <nop,nop,TS val 200 ecr 201>
  actual packet:  0.200119 . 1:1(0) ack 1 win 65535 <nop,nop,TS val 200 ecr 200>
  not ok 1 ipv4
  not ok 2 ipv6
  not ok 3 ipv4-mapped-ipv6

Root cause:

The test `tcp_ts_recent_invalid_ack.pkt` sends a FIN+ACK with ACK=9999
(which exceeds SND.NXT=1) to verify that the kernel does not update ts_recent
from an invalid packet.  Before your patch, this packet was silently dropped.
After your patch, the kernel now emits a challenge ACK (SEQ=1, ACK=1) in
response to the ACK=9999 segment.

The test script does not expect this challenge ACK, so when it subsequently
tries to match the expected "ack 1001" response to the following data segment,
it instead sees the challenge ACK "ack 1", causing a mismatch on all three
address families (ipv4, ipv6, ipv4-mapped-ipv6).

Fix: update `tcp_ts_recent_invalid_ack.pkt` to consume the new challenge ACK
before checking the response to the subsequent data segment.  For example,
add after the bad FIN+ACK line:

  +0 > . 1:1(0) ack 1

so that the challenge ACK is explicitly expected and the rest of the script
proceeds as before.

^ permalink raw reply

* Re: Path forward for NFC in the kernel
From: Mark Greer @ 2026-04-20 15:31 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Jakub Kicinski, Michael Thalmeier,
	Raymond Hackley, Michael Walle, Bongsu Jeon, David Heidelberg
  Cc: netdev
In-Reply-To: <9c4a4acf-b4f1-4e84-93bf-cdf080cb9970@kernel.org>

[Trying this again hopefully without HTML.]

On 4/17/26 12:18 AM, Krzysztof Kozlowski wrote:
> On 16/04/2026 19:10, Jakub Kicinski wrote:
>> Hi folks! We are struggling to keep up with the number of security 
>> reports and AI generated patches in the kernel. NFC is infamous for 
>> being a huge CVE magnet. We need someone to step up as a maintainer, 
>> create an NFC tree and handle all the incoming submissions. Send us 
>> (or Linus if you prefer) periodic PRs, like WiFi, Bluetooth etc. do. 
>> If that does not happen I'm afraid we'll have to move the NFC code 
>> out of the tree, put it up on GH or some such, and let it accumulate 
>> CVEs there.. I'm planning to send a PR to Linus to shed the 
>> unmaintained code early next week. We need to have a maintainer 
>> established by then. 
> +Cc David Heidelberg recently trying to use Linux NFC stack, Just 
> "collecting" patches is not a big deal, I could do this, but actually 
> reviewing the patches with necessary due diligence is the effort I 
> could not provide in a reasonable time frame. And picking up patches 
> without proper review feels risky... NFC has a long history of issues, 
> first mostly pointed out by syzbot but now apparently by AI tools. The 
> code base is quite old, with no major improvements or testings 
> happening but not in a way "oh, it's stable and working like 'cp' 
> command" but rather "no one knows how many bugs are on top of each 
> other and if it actually still works". Syzbot and AI reported bugs 
> encourage random drive-by fixes by people not testing the code, thus 
> particular bug report might be fixed, but for example NFC stops 
> working and no one knows that. Does anyone knows if the NFC 
> stack/drivers actually works fine? Did anyone test actual devices? If 
> not, then moving to Github would be even more reasonable. Another 
> point is that AFAIU, most of real world devices, like Android-based 
> phones, don't use the Linux NFC stack but their custom HAL/user-space 
> based libraries and drivers. Some other non-Android projects use 
> libnfc userspace, which seems to be maintained only as bugfix 
> (https://github.com/nfc-tools/libnfc/commits/master/). Best regards, 
> Krzysztof 

That is my understanding too.

I would love to take this on but I've been swamped for years with my 
"day job" and I'm pretty burned out.
Taking on something like this isn't possible for me. I also didn't 
realize it was creating so many CVEs so,
yeah, sounds like it should be removed then. Thanks for all of the work 
you did on this, Krzysztof.

Mark
-- 



^ permalink raw reply

* Re: [net-next v2 2/5] dt-bindings: net: starfive,jh7110-dwmac: Add JHB100 support
From: Krzysztof Kozlowski @ 2026-04-20 15:33 UTC (permalink / raw)
  To: Minda Chen
  Cc: Alexandre Torgue, Andrew Lunn, David S . Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Emil Renner Berthing, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, netdev, linux-kernel, linux-stm32, devicetree
In-Reply-To: <20260417024523.107786-3-minda.chen@starfivetech.com>

On Fri, Apr 17, 2026 at 10:45:20AM +0800, Minda Chen wrote:
> Add StarFive JHB100 dwmac support and compatible.
> The JHB100 dwmac shares the same driver code as the JH7110 dwmac,

Please describe the hardware or programming interface, not driver code.

> which contains 2 SGMII interfaces, 1 RGMII/RMII interface and
> 1 RMII interface.
> JHB100 dwmac has only one reset signal and one main interrupt
> line.


Drop all below, not relevant.

> 
> Please refer to below:
> 
> JHB100: reset-names = "stmmaceth";
> 
> Example usage of JHB100 in the device tree:
> 
> gmac0: ethernet@11b80000 {
>         compatible = "starfive,jhb100-dwmac",
>                      "snps,dwmac-5.20";
>         interrupts = <225>;
>         interrupt-names = "macirq";
>         ...
> };
> 
> Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
> ---
>  .../devicetree/bindings/net/snps,dwmac.yaml   |  1 +
>  .../bindings/net/starfive,jh7110-dwmac.yaml   | 23 +++++++++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> index 38bc34dc4f09..85cd3252e8b1 100644
> --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> @@ -115,6 +115,7 @@ properties:
>          - sophgo,sg2044-dwmac
>          - starfive,jh7100-dwmac
>          - starfive,jh7110-dwmac
> +        - starfive,jhb100-dwmac
>          - tesla,fsd-ethqos
>          - thead,th1520-gmac
>  
> diff --git a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> index 0d1962980f57..edc246a71ce3 100644
> --- a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> @@ -18,6 +18,7 @@ select:
>          enum:
>            - starfive,jh7100-dwmac
>            - starfive,jh7110-dwmac
> +          - starfive,jhb100-dwmac
>    required:
>      - compatible
>  
> @@ -30,6 +31,9 @@ properties:
>        - items:
>            - const: starfive,jh7110-dwmac
>            - const: snps,dwmac-5.20
> +      - items:
> +          - const: starfive,jhb100-dwmac

So that's an enum in previous "items" list.... but your commit msg said
your devices are compatible, so confusing.

Best regards,
Krzysztof


^ permalink raw reply

* Re: [PATCH iwl-next v2 3/3] igc: add support for forcing link speed without autonegotiation
From: Simon Horman @ 2026-04-20 15:35 UTC (permalink / raw)
  To: Abdul Rahim, Faizal
  Cc: khai.wen.tan, anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev,
	davem, edumazet, kuba, pabeni, intel-wired-lan, netdev,
	linux-kernel, faizal.abdul.rahim, hong.aun.looi, khai.wen.tan
In-Reply-To: <3481ae84-5c36-4591-94c1-78b70fff4d7b@linux.intel.com>

On Mon, Apr 20, 2026 at 11:20:07AM +0800, Abdul Rahim, Faizal wrote:
> 
> 
> On 19/4/2026 12:48 am, Simon Horman wrote:
> > From: 'Simon Horman' <horms@kernel.org>
> > 
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> > Full review at: https://sashiko.dev
> > ---
> > igc: add support for forcing link speed without autonegotiation
> > 
> > This commit adds support to allow users to force 10/100 Mb/s link speed
> > and duplex via ethtool when autonegotiation is disabled.
> > 
> > > diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
> > > index cfcbf2fdad6ea..5bd37d1be1688 100644
> > > --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
> > > +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
> > 
> > [ ... ]
> > 
> > > @@ -2000,6 +2013,41 @@ static int igc_ethtool_get_link_ksettings(struct net_device *netdev,
> > >   	return 0;
> > >   }
> > > +/**
> > > + * igc_handle_autoneg_disabled - Configure forced speed/duplex settings
> > > + * @adapter: private driver structure
> > > + * @speed: requested speed (must be SPEED_10 or SPEED_100)
> > > + * @duplex: requested duplex
> > > + *
> > > + * Records forced speed/duplex when autoneg is disabled.
> > > + * Caller must validate speed before calling this function.
> > > + */
> > > +static void igc_handle_autoneg_disabled(struct igc_adapter *adapter, u32 speed,
> > > +					u8 duplex)
> > > +{
> > > +	struct igc_mac_info *mac = &adapter->hw.mac;
> > > +
> > > +	switch (speed) {
> > > +	case SPEED_10:
> > > +		mac->forced_speed_duplex = (duplex == DUPLEX_FULL) ?
> > > +			IGC_FORCED_10F : IGC_FORCED_10H;
> > > +		break;
> > > +	case SPEED_100:
> > > +		mac->forced_speed_duplex = (duplex == DUPLEX_FULL) ?
> > > +			IGC_FORCED_100F : IGC_FORCED_100H;
> > > +		break;
> > > +	default:
> > > +		WARN_ONCE(1, "Unsupported speed %u\n", speed);
> > > +		return;
> > > +	}
> > > +
> > > +	mac->autoneg_enabled = false;
> > > +
> > > +	/* Half-duplex cannot support flow control per IEEE 802.3 */
> > > +	if (duplex == DUPLEX_HALF)
> > > +		adapter->hw.fc.requested_mode = igc_fc_none;
> > > +}
> > 
> > If userspace issues an ethtool command without specifying duplex, the duplex
> > value can be DUPLEX_UNKNOWN. In this case, forced_speed_duplex correctly
> > defaults to half-duplex since duplex == DUPLEX_FULL evaluates to false.
> > 
> > However, since DUPLEX_UNKNOWN is not equal to DUPLEX_HALF, the check above
> > fails, and the requested flow control mode remains unchanged. Could this leave
> > flow control enabled on a forced half-duplex link, violating IEEE 802.3?
> > Should this check be based on the resolved mac->forced_speed_duplex state
> > instead?
> > 
> 
> You're right, thanks for pointing that out.
> 
> That said, it feels simpler to address it with [1]:
> if (duplex != DUPLEX_FULL)
>     adapter->hw.fc.requested_mode = igc_fc_none;
> 
> Rather than [2]:
>  if (mac->forced_speed_duplex == IGC_FORCED_10H ||
>         mac->forced_speed_duplex == IGC_FORCED_100H)
>         adapter->hw.fc.requested_mode = igc_fc_none;
> 
> Are you okay with [1] ?

Yes, [1] sounds sensible to me.

> 
> > [ ... ]
> > 
> > > @@ -2071,14 +2120,20 @@ igc_ethtool_set_link_ksettings(struct net_device *netdev,
> > >   		}
> > >   	}
> > > +	if (cmd->base.autoneg == AUTONEG_DISABLE &&
> > > +	    cmd->base.speed != SPEED_10 && cmd->base.speed != SPEED_100) {
> > > +		netdev_info(dev, "Unsupported speed for forced link\n");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > >   	while (test_and_set_bit(__IGC_RESETTING, &adapter->state))
> > >   		usleep_range(1000, 2000);
> > 
> > This isn't a bug introduced by this patch, but does this ad-hoc
> > synchronization using bit flags circumvent proper locking guarantees? The
> > networking subsystem guidelines forbid using bit flags as ad-hoc locks.
> > 
> > Using a bit flag in a polling loop bypasses lockdep analysis, fairness, and
> > strict memory ordering. Could this be replaced with a real synchronization
> > mechanism like a mutex?
> 
> It looks like a worthwhile cleanup. However, it likely doesn’t belong in
> this series, since the synchronization pattern predates these patches and is
> used throughout the igc driver (set_ringparam, set_pauseparam, set_channels,
> etc.). We could address it in different patch series and align the other
> code paths at the same time ?

Yes, agreed.

^ permalink raw reply

* Re: [PATCH net 0/2] tcp: symmetric challenge ACK for SEG.ACK > SND.NXT
From: Jiayuan Chen @ 2026-04-20 15:38 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
	David S. Miller, David Ahern, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260420082454.1b85cd94@kernel.org>


On 4/20/26 11:24 PM, Jakub Kicinski wrote:
> AI says:
>
> Your patch "tcp: send a challenge ACK on SEG.ACK > SND.NXT" breaks an
> existing packetdrill selftest:
>
>    selftests/net/packetdrill/tcp_ts_recent_invalid_ack.pkt
>
> Test output:
>    tcp_ts_recent_invalid_ack.pkt:25: error handling packet:
>    live packet field tcp_ack_seq: expected: 1001 (0x3e9) vs actual: 1 (0x1)
>    script packet:  0.200125 . 1:1(0) ack 1001 <nop,nop,TS val 200 ecr 201>
>    actual packet:  0.200119 . 1:1(0) ack 1 win 65535 <nop,nop,TS val 200 ecr 200>
>    not ok 1 ipv4
>    not ok 2 ipv6
>    not ok 3 ipv4-mapped-ipv6
>
> Root cause:
>
> The test `tcp_ts_recent_invalid_ack.pkt` sends a FIN+ACK with ACK=9999
> (which exceeds SND.NXT=1) to verify that the kernel does not update ts_recent
> from an invalid packet.  Before your patch, this packet was silently dropped.
> After your patch, the kernel now emits a challenge ACK (SEQ=1, ACK=1) in
> response to the ACK=9999 segment.
>
> The test script does not expect this challenge ACK, so when it subsequently
> tries to match the expected "ack 1001" response to the following data segment,
> it instead sees the challenge ACK "ack 1", causing a mismatch on all three
> address families (ipv4, ipv6, ipv4-mapped-ipv6).
>
> Fix: update `tcp_ts_recent_invalid_ack.pkt` to consume the new challenge ACK
> before checking the response to the subsequent data segment.  For example,
> add after the bad FIN+ACK line:
>
>    +0 > . 1:1(0) ack 1
>
> so that the challenge ACK is explicitly expected and the rest of the script
> proceeds as before.


Thanks Jakub, you're right.

Will fold the tcp_ts_recent_invalid_ack.pkt update into the kernel fix in v2

to keep bisect clean and makes the backport self-contained.


^ permalink raw reply

* Re: [PATCH iwl-net 1/4] ice: fix asymmetric pause negotiation reporting in ethtool
From: Simon Horman @ 2026-04-20 15:40 UTC (permalink / raw)
  To: Aleksandr Loktionov
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, Tomasz Lichwala
In-Reply-To: <20260417062954.1241900-2-aleksandr.loktionov@intel.com>

On Fri, Apr 17, 2026 at 08:29:51AM +0200, Aleksandr Loktionov wrote:
> From: Tomasz Lichwala <tomasz.lichwala@intel.com>
> 
> Add Asym_Pause to the supported link modes so that asymmetric pause
> negotiation is properly reported via ethtool. Without Asym_Pause in
> the supported modes, 'ethtool -a' incorrectly shows 'RX/TX negotiated: off'
> for asymmetric pause configurations, even when pause is properly
> negotiated and functional at the hardware level.
> 
> Fixes: 5a056cd7ead2 ("ice: add lp_advertising flow control support")
> Signed-off-by: Tomasz Lichwala <tomasz.lichwala@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH net 1/4] xsk: avoid skb leak in XDP_TX_METADATA case
From: Stanislav Fomichev @ 2026-04-20 15:42 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
	maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
	john.fastabend, horms, andrew+netdev, bpf, netdev, Jason Xing
In-Reply-To: <aeZEjZSsVIdpcbFF@devvm17672.vll0.facebook.com>

On 04/20, Stanislav Fomichev wrote:
> On 04/18, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> > 
> > Fix it by explicitly adding kfree_skb() before returning back to its
> > caller.
> > 
> > How to reproduce it in virtio_net:
> > 1. the current skb is the first one (which means no frag and xs->skb is
> >    NULL) and users enable metadata feature.
> > 2. xsk_skb_metadata() returns a error code.
> > 3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'.
> > 4. there is no chance to free this skb anymore.
> > 
> > Closes: https://lore.kernel.org/all/20260415085204.3F87AC19424@smtp.kernel.org/
> > Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> 
> Acked-by: Stanislav Fomichev <sdf@fomichev.me>

Actually, I take that back.. While looking at patch 2 (which always
confuses me with that -EOVERFLOW handling), looks like we set
skb->destructor in xsk_skb_init_misc? So this kfree will do
xsk_cq_submit_addr_locked? Not sure that's what we want? Should we
move xsk_skb_init_misc to happen after xsk_skb_metadata?

^ permalink raw reply

* Re: [PATCH net 2/4] xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS
From: Stanislav Fomichev @ 2026-04-20 15:44 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
	maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
	john.fastabend, horms, andrew+netdev, bpf, netdev, Jason Xing
In-Reply-To: <20260418045644.28612-3-kerneljasonxing@gmail.com>

On 04/18, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> Fix it by explicitly adding kfree_skb() before returning back to its
> caller.
> 
> How to reproduce it in virtio_net:
> 1. the current skb is the first one (which means xs->skb is NULL) and
>    hit the limit MAX_SKB_FRAGS.
> 2. xsk_build_skb_zerocopy() returns -EOVERFLOW.
> 3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'. This
>    is why bug can be triggered.
> 4. there is no chance to free this skb anymore.
> 
> Note that if in this case the xs->skb is not NULL, xsk_build_skb() will
> call xsk_drop_skb(xs->skb) to do the right thing.
> 
> Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>  net/xdp/xsk.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 8fcde34aec7b..5d3dbb118730 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -778,8 +778,11 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
>  	addr = buffer - pool->addrs;
>  
>  	for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
> -		if (unlikely(i >= MAX_SKB_FRAGS))
> +		if (unlikely(i >= MAX_SKB_FRAGS)) {
> +			if (!xs->skb)
> +				kfree_skb(skb);
>  			return ERR_PTR(-EOVERFLOW);
> +		}
>  
>  		page = pool->umem->pgs[addr >> PAGE_SHIFT];
>  		get_page(page);
> -- 
> 2.41.3
> 

As mentioned for patch 1, need to do something with that
xsk_skb_init_misc... Otherwise we hit destructor with xsk_cq_submit_addr_locked
here as well

^ permalink raw reply

* Re: [PATCH iwl-net 2/4] ice: fix autoneg disable when link partner doesn't support AN
From: Simon Horman @ 2026-04-20 15:54 UTC (permalink / raw)
  To: Aleksandr Loktionov
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, Konrad Knitter
In-Reply-To: <20260417062954.1241900-3-aleksandr.loktionov@intel.com>

On Fri, Apr 17, 2026 at 08:29:52AM +0200, Aleksandr Loktionov wrote:
> From: Konrad Knitter <konrad.knitter@intel.com>
> 
> Disabling autonegotiation was silently ignored when autoneg had not yet
> completed (ICE_AQ_AN_COMPLETED was not set), leaving the configuration
> unchanged with no error. This could prevent link from forming if the
> link partner requires non-autoneg mode.
> 
> Extend the condition to also allow disabling autoneg when the link
> partner reports no AN ability (ICE_AQ_LP_AN_ABILITY clear). Gate the
> ICE_AQ_LP_AN_ABILITY check on the link being up so that stale or
> zeroed an_info when link is down does not produce a false positive.
> Introduce the helper ice_autoneg_disable_allowed() to make the check
> explicit.
> 
> Fixes: f1a4a66d2310 ("ice: fix set pause param autoneg check")

AI generated code review [1] flags that this does not seem to be
the right Fixes tag and that the following seems more appropriate:

Fixes: 48cb27f2fd18 ("ice: Implement handlers for ethtool PHY/link operations")

[1] https://netdev-ai.bots.linux.dev/ai-local.html

> Signed-off-by: Konrad Knitter <konrad.knitter@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

The above not withstanding, this looks looks to me:

Reviewed-by: Simon Horman <horms@kernel.org>


I somehow feel obliged to acknowledge that Sashiko has provided
review of this patch. However, I don't believe any of the
issues flagged there should block progress of this patch.

You may wish to look over that review for possible follow-up activity.

...

^ permalink raw reply

* Re: [PATCH iwl-net 3/4] ice: support RDMA on 4+-port E830 devices
From: Simon Horman @ 2026-04-20 15:56 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev,
	jan.glaza
In-Reply-To: <20260417062954.1241900-4-aleksandr.loktionov@intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ice: support RDMA on 4+-port E830 devices

This patch narrows the RDMA capability check to keep RDMA enabled for E830
devices configured with more than 4 ports, as they do not share the
limitations of E810 and E82X devices.

> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -2509,7 +2509,7 @@ ice_recalc_port_limited_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps)
>  		caps->maxtc = 4;
>  		ice_debug(hw, ICE_DBG_INIT, "reducing maxtc to %d (based on #ports)\n",
>  			  caps->maxtc);
> -		if (caps->rdma) {
> +		if (caps->rdma && hw->mac_type != ICE_MAC_E830) {
>  			ice_debug(hw, ICE_DBG_INIT, "forcing RDMA off\n");
>  			caps->rdma = 0;
>  		}

Does this leave the diagnostic message further down out of sync with the
new condition?

If the device is an E830 with more than 4 ports, RDMA remains enabled, but
because the dev_info() call just below this block is evaluated for all
devices with more than 4 ports:

ice_recalc_port_limited_caps() {
	if (hw->dev_caps.num_funcs > 4) {
		...
		if (caps->rdma && hw->mac_type != ICE_MAC_E830) {
			ice_debug(hw, ICE_DBG_INIT, "forcing RDMA off\n");
			caps->rdma = 0;
		}

		/* print message only when processing device capabilities
		 * during initialization.
		 */
		if (caps == &hw->dev_caps.common_cap)
			dev_info(ice_hw_to_dev(hw), "RDMA functionality is not available with the current device configuration.\n");
	}
}

Could this cause confusion by logging a warning about RDMA being disabled
when it is actually active?

^ permalink raw reply

* Re: [PATCH iwl-net 4/4] ice: report EIPE checksum errors to the OS on E830
From: Simon Horman @ 2026-04-20 15:57 UTC (permalink / raw)
  To: Aleksandr Loktionov; +Cc: intel-wired-lan, anthony.l.nguyen, netdev, Jan Glaza
In-Reply-To: <20260417062954.1241900-5-aleksandr.loktionov@intel.com>

On Fri, Apr 17, 2026 at 08:29:54AM +0200, Aleksandr Loktionov wrote:
> From: Jan Glaza <jan.glaza@intel.com>
> 
> For E830 adapters the hardware-reported EIPE (Ethernet Inline IPsec
> Engine) error is a reliable indication that a received packet failed
> decryption and has a bad checksum. Route EIPE errors through the
> generic checksum error path on E830 so the error is visible via
> standard ethtool statistics (rx_csum_bad).
> 
> On previous devices (E810, E82X) the EIPE flag can be spuriously set
> on encapsulated packets with inner L2 padding, so those adapters only
> increment the driver-private hw_rx_eipe_error counter without routing
> through the checksum error path.
> 
> Fixes: 0ca6755f3cc2 ("ice: Add a new counter for Rx EIPE errors")
> Signed-off-by: Jan Glaza <jan.glaza@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>


Sashiko has provided review of this patch. However, I don't believe any of
the issues flagged there should block progress of this patch.

You may wish to look over that review for possible follow-up activity.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox