Netdev List


* Re: [PATCH net] net: libwx: fix VMDQ mask for 1-queue mode
From: Larysa Zaremba @ 2026-06-25 10:14 UTC (permalink / raw)
  To: Jiawen Wu
  Cc: netdev, 'Mengyuan Lou', 'Andrew Lunn',
	'David S. Miller', 'Eric Dumazet',
	'Jakub Kicinski', 'Paolo Abeni',
	'Simon Horman', 'Kees Cook'
In-Reply-To: <062401dd0487$3a3152f0$ae93f8d0$@trustnetic.com>

On Thu, Jun 25, 2026 at 05:44:33PM +0800, Jiawen Wu wrote:
> On Thu, Jun 25, 2026 5:39 PM, Larysa Zaremba wrote:
> > On Thu, Jun 25, 2026 at 05:08:51PM +0800, Jiawen Wu wrote:
> > > In wx_set_vmdq_queues(), the VMDQ mask was not set for the devices not
> > > support WX_FLAG_MULTI_64_FUNC, i.e., NGBE devices. A mask of 0 causes
> > > __ALIGN_MASK(1, ~vmdq->mask) to return 0, which incorrectly sets
> > > q_per_pool to 0 in wx_write_qde().
> > >
> > > Fix the VMDQ 1-queue mask to 0x7F then ensures that __ALIGN_MASK(1,
> > > 0x7F) correctly evaluates to 1.
> > 
> > __ALIGN_MASK(1, 0x7F) evaluates to 0x80 (128), not to 1. __ALIGN_MASK(1, 0x7E)
> > evaluates to 1. Maybe you need 0x7D for 2 queues and 0x7E for 1 queue?
> 
> Sorry, the commit log is wrong because the '~' is missing.
> I meant to say that __ALIGN_MASK(1, ~0x7F) evaluates to 1.
>

Then I have no further concerns. Provided you add the missing "~" to the
commit message and change "not support" to "not supporting" above, I approve
this patch.
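
The arithmetic discussed above can be checked with a small userspace sketch. It assumes the usual kernel definition of __ALIGN_MASK(x, mask) as ((x) + (mask)) & ~(mask). The helper names align_mask() and q_per_pool_for() and the fixed 32-bit width are illustrative assumptions, not libwx code.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's __ALIGN_MASK(x, mask), assumed to be
 * ((x) + (mask)) & ~(mask). Unsigned 32-bit math keeps the wrap-around
 * well defined. The name align_mask() is hypothetical.
 */
static uint32_t align_mask(uint32_t x, uint32_t mask)
{
	return (x + mask) & ~mask;
}

/* Mirrors the __ALIGN_MASK(1, ~vmdq->mask) expression quoted in the
 * commit message; q_per_pool_for() is a hypothetical wrapper.
 */
static uint32_t q_per_pool_for(uint32_t vmdq_mask)
{
	return align_mask(1, ~vmdq_mask);
}
```

Under these assumptions, q_per_pool_for() gives 1, 2 and 4 for the masks 0x7F, 0x7E and 0x7C, and 0 for a zero mask, while align_mask(1, 0x7F) without the complement gives 0x80.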

Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
 
> > 
> > >
> > > Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
> > > Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> > > ---
> > >  drivers/net/ethernet/wangxun/libwx/wx_lib.c  | 1 +
> > >  drivers/net/ethernet/wangxun/libwx/wx_type.h | 1 +
> > >  2 files changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> > > index d042567b8128..814d88d2aee4 100644
> > > --- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> > > +++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> > > @@ -1802,6 +1802,7 @@ static bool wx_set_vmdq_queues(struct wx *wx)
> > >  			rss_i = 4;
> > >  		}
> > >  	} else {
> > > +		vmdq_m = WX_VMDQ_1Q_MASK;
> > >  		/* double check we are limited to maximum pools */
> > >  		vmdq_i = min_t(u16, 8, vmdq_i);
> > >
> > > diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
> > > index c7befe4cdfe9..65e3e55db1cf 100644
> > > --- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
> > > +++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
> > > @@ -486,6 +486,7 @@ enum WX_MSCA_CMD_value {
> > >
> > >  #define WX_VMDQ_4Q_MASK              0x7C
> > >  #define WX_VMDQ_2Q_MASK              0x7E
> > > +#define WX_VMDQ_1Q_MASK              0x7F
> > >
> > >  /****************** Manageablility Host Interface defines ********************/
> > >  #define WX_HI_MAX_BLOCK_BYTE_LENGTH  256 /* Num of bytes in range */
> > > --
> > > 2.51.0
> > >
> > 
> 


* Re: [PATCH v2 0/7] vmsplice: fix some problems in my previous vmsplice patchset
From: Askar Safin @ 2026-06-25 10:11 UTC (permalink / raw)
  To: david
  Cc: akpm, avagin, axboe, brauner, collin.funk1, david.laight.linux,
	dhowells, fuse-devel, hch, jack, joannelkoong, kernel, linux-api,
	linux-fsdevel, linux-kernel, linux-mm, luto, metze, miklos,
	netdev, patches, pfalcato, safinaskar, torvalds, val, viro, w,
	willy
In-Reply-To: <89ea76b3-e956-4232-8180-ee3929adf905@kernel.org>

"David Hildenbrand (Arm)" <david@kernel.org>:
> I think we concluded that we cannot rip out vmsplice that way at this point, and
> I suspect that Christian will drop that topic branch from -next after -rc1.

I think my patches still have a chance.

On the fuse regression: I return EINVAL for the particular combination
of flags used by fuse. This causes fuse to fall back to the non-vmsplice
code path. I ran a Debian code search and found no significant packages
that use the same combination of options.

So I think I was able to deal with the fuse regression.

On the CRIU named-FIFO "Not supported" regression: it is handled.

On the CRIU major performance regression: it is NOT handled. But I
still think my approach is right. (See the cover letter for details.)

(I wrote about all of these in the cover letter for this v2 patchset.)

So all regressions found so far (except for the CRIU major performance
regression) are handled.

Another option is to introduce a deprecation period (as suggested by
Andrei Vagin). I can do this if needed.

-- 
Askar Safin


* Re: [PATCH v2] net: mdio: airoha: fix reset control leak in error path
From: Larysa Zaremba @ 2026-06-25 10:10 UTC (permalink / raw)
  To: Wentao Liang
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260622115403.39772-1-vulab@iscas.ac.cn>


Please specify the target tree (probably net) via a subject prefix.
Also, it looks like you are missing too many CCs [0]

[0] 
https://netdev-ctrl.bots.linux.dev/logs/build/1114710/14639274/cc_maintainers/summary

On Mon, Jun 22, 2026 at 07:54:03PM +0800, Wentao Liang wrote:
> In airoha_mdio_probe(), after calling reset_control_deassert(),
> if clk_set_rate() fails, the function returns immediately without
> calling reset_control_assert(). This leaves the reset line
> deasserted and causes a reference count leak on shared reset
> controllers.
>

Sashiko correctly points out that since the reset controller is exclusive, 
there is no refcount leak. [1]

So the problem is the missing rstc->rcdev->ops->assert(rstc->rcdev, rstc->id)
call. It would be great if you could describe what problem this can cause.

> Fix this by reorganizing the error handling to use a goto label,
> ensuring reset_control_assert() is called on all error paths
> before returning.
> 
> Also add error checking for reset_control_deassert().

Sashiko correctly points out that you do not actually do this, which is fine;
just update the commit message. [1]

[1] https://sashiko.dev/#/patchset/20260622115403.39772-1-vulab%40iscas.ac.cn

> Fixes: 67e3ba978361 ("net: mdio: Add MDIO bus controller for Airoha AN7583")
> Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
> ---
>  drivers/net/mdio/mdio-airoha.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/mdio/mdio-airoha.c b/drivers/net/mdio/mdio-airoha.c
> index 52e7475121ea..4c1b2415687c 100644
> --- a/drivers/net/mdio/mdio-airoha.c
> +++ b/drivers/net/mdio/mdio-airoha.c
> @@ -246,15 +246,17 @@ static int airoha_mdio_probe(struct platform_device *pdev)
>  
>  	ret = clk_set_rate(priv->clk, freq);
>  	if (ret)
> -		return ret;
> +		goto err_reset_assert;
>  
>  	ret = devm_of_mdiobus_register(dev, bus, dev->of_node);
> -	if (ret) {
> -		reset_control_assert(priv->reset);
> -		return ret;
> -	}
> +	if (ret)
> +		goto err_reset_assert;
>  
>  	return 0;
> +
> +err_reset_assert:
> +	reset_control_assert(priv->reset);
> +	return ret;
>  }
>  
>  static const struct of_device_id airoha_mdio_dt_ids[] = {
> -- 
> 2.39.5 (Apple Git-154)
> 
> 


* Re: [PATCH net] nfc: nci: fix uninit-value in the RF discover/activated NTF handlers
From: Sam P @ 2026-06-25 10:04 UTC (permalink / raw)
  To: Simon Horman
  Cc: david, davem, edumazet, kuba, pabeni, oe-linux-nfc, netdev,
	linux-kernel, stable
In-Reply-To: <20260625091828.1170615-2-horms@kernel.org>

On 25/06/2026 10:18, Simon Horman wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://netdev-ai.bots.linux.dev/sashiko/
> ---
> nfc: nci: fix uninit-value in the RF discover/activated NTF handlers
> 
> This zero-initialises the on-stack nci_rf_discover_ntf and
> nci_rf_intf_activated_ntf structs in the RF discover and interface
> activated NTF handlers so that the technology- and activation-specific
> unions read back as zero when the corresponding length fields are zero.
> 
> Link: https://lore.kernel.org/netdev/20260623172109.1105965-2-horms@kernel.org/
> 
>>   - activated: with rf_tech_specific_params_len == 0 the same union is read
>>     via nci_target_auto_activated(); with activation_params_len == 0 the
>>     activation_params union is read by nci_store_ats_nfc_iso_dep() into
>>     ndev->target_ats.
> 
> [Low]
> Does the third scenario in this paragraph match the code?
> 
> Looking at nci_store_ats_nfc_iso_dep():
> 
>      static int nci_store_ats_nfc_iso_dep(struct nci_dev *ndev,
>                           const struct nci_rf_intf_activated_ntf *ntf)
>      {
>          ndev->target_ats_len = 0;
> 
>          if (ntf->activation_params_len <= 0)
>              return NCI_STATUS_OK;
> 
>          if (ntf->activation_params.nfca_poll_iso_dep.rats_res_len > NFC_ATS_MAXSIZE) {
>          ...
> 
> The function returns NCI_STATUS_OK before any access to the
> activation_params union when activation_params_len is 0, so the union
> is not read in that case.
> 
> The sibling nci_store_general_bytes_nfc_dep() has the same early return
> on activation_params_len <= 0.
> 
> The two rf_tech_specific_params_len == 0 cases listed earlier in the
> commit message do match the code (nci_add_new_protocol() and
> nci_target_auto_activated() unconditionally pass
> &ntf.rf_tech_specific_params even when rf_tech_specific_params_len is
> 0), and the KMSAN trace quoted in the commit message only mentions the
> nci_add_new_protocol() path.
> 
> The ntf = {} change still looks correct and necessary for the
> rf_tech_specific_params union. Could the third bullet be dropped or
> adjusted so the commit message reflects only the paths that actually
> read uninitialised bytes?

Thanks for sharing the review.

This is valid feedback; I'll fix up the commit message in a v2.


* Re: [PATCH net v2] tipc: fix out-of-bounds read in broadcast Gap ACK blocks
From: Sam P @ 2026-06-25 10:00 UTC (permalink / raw)
  To: Tung Quang Nguyen
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev@vger.kernel.org,
	tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org, Jon Maloy
In-Reply-To: <GV1P189MB19881AD711511094760906DCC6EC2@GV1P189MB1988.EURP189.PROD.OUTLOOK.COM>

On 25/06/2026 10:23, Tung Quang Nguyen wrote:
>> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
>> index 76a1585d3f6b..08637c3c9db0 100644
>> --- a/net/tipc/bcast.c
>> +++ b/net/tipc/bcast.c
>> @@ -497,11 +497,12 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
>>   */
>> int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
>> 			struct tipc_msg *hdr,
>> -			struct sk_buff_head *retrq)
>> +			struct sk_buff_head *retrq, bool *valid)
>> {
>> 	struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
>> 	struct tipc_gap_ack_blks *ga;
>> 	struct sk_buff_head xmitq;
>> +	u16 glen;
> 
> Move this variable declaration to the bottom to follow reverse xmas tree style.
> 
>> 	int rc = 0;
>>
>> 	__skb_queue_head_init(&xmitq);
>> @@ -510,13 +511,18 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
>> 	if (msg_type(hdr) != STATE_MSG) {
>> 		tipc_link_bc_init_rcv(l, hdr);
>> 	} else if (!msg_bc_ack_invalid(hdr)) {
>> -		tipc_get_gap_ack_blks(&ga, l, hdr, false);
>> -		if (!sysctl_tipc_bc_retruni)
>> -			retrq = &xmitq;
>> -		rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
>> -					  msg_bc_gap(hdr), ga, &xmitq,
>> -					  retrq);
>> -		rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
>> +		glen = tipc_get_gap_ack_blks(&ga, l, hdr, false);
>> +		if (glen > msg_data_sz(hdr)) {
>> +			/* Malformed Gap ACK blocks; caller drops the msg */
>> +			*valid = false;
>> +		} else {
>> +			if (!sysctl_tipc_bc_retruni)
>> +				retrq = &xmitq;
>> +			rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
>> +						  msg_bc_gap(hdr), ga, &xmitq,
>> +						  retrq);
>> +			rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
>> +		}
>> 	}
>> 	tipc_bcast_unlock(net);
>>
>> diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
>> index 2d9352dc7b0e..55d17b5413e1 100644
>> --- a/net/tipc/bcast.h
>> +++ b/net/tipc/bcast.h
>> @@ -97,7 +97,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
>> 			struct tipc_msg *hdr);
>> int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
>> 			struct tipc_msg *hdr,
>> -			struct sk_buff_head *retrq);
>> +			struct sk_buff_head *retrq, bool *valid);
>> int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg,
>> 			struct tipc_link *bcl);
>> int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
>> diff --git a/net/tipc/node.c b/net/tipc/node.c
>> index 97aa970a0d83..2887f94ee28f 100644
>> --- a/net/tipc/node.c
>> +++ b/net/tipc/node.c
>> @@ -1831,12 +1831,13 @@ static void tipc_node_mcast_rcv(struct tipc_node *n)
>> }
>>
>> static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
>> -				  int bearer_id, struct sk_buff_head *xmitq)
>> +				  int bearer_id, struct sk_buff_head *xmitq,
>> +				  bool *valid)
>> {
>> 	struct tipc_link *ucl;
>> 	int rc;
>>
>> -	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq);
>> +	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq, valid);
> 
> 'valid' needs to be checked after this call. Then, return immediately if it is false.
> 
>>
>> 	if (rc & TIPC_LINK_DOWN_EVT) {
>> 		tipc_node_reset_links(n);
>> @@ -2140,12 +2141,18 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
>>
>> 	/* Ensure broadcast reception is in synch with peer's send state */
>> 	if (unlikely(usr == LINK_PROTOCOL)) {
>> +		bool valid = true;
>> +
>> 		if (unlikely(skb_linearize(skb))) {
>> 			tipc_node_put(n);
>> 			goto discard;
>> 		}
>> 		hdr = buf_msg(skb);
>> -		tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq);
>> +		tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq, &valid);
>> +		if (!valid) {
>> +			tipc_node_put(n);
>> +			goto discard;
>> +		}
>> 	} else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) {
>> 		tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr);
>> 	}
>>
>> base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
>> --
>> 2.54.0
> 

Thanks for the review, I'll address this in a v3.


* [PATCH 0/2] net/sched: finish the qdisc_dequeue_peeked conversion (taprio, multiq)
From: Bryam Vargas via B4 Relay @ 2026-06-25  9:51 UTC (permalink / raw)
  To: Vinicius Costa Gomes, Paolo Abeni, Jamal Hadi Salim, Jiri Pirko,
	Jakub Kicinski, David S. Miller, Eric Dumazet
  Cc: Simon Horman, netdev, Jarek Poplawski, Vladimir Oltean,
	linux-kernel

Commit 77be155cba4e added peek emulation: a non-work-conserving qdisc's
->peek dequeues one skb and stashes it in the child's gso_skb. A parent
that peeks such a child must then take the packet with
qdisc_dequeue_peeked(), not a direct ->dequeue(), or the stashed skb is
bypassed and the child's qlen/backlog desync. sch_red and sch_sfb were
just fixed for this; taprio and multiq still take the direct path.
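
The mechanism described above can be illustrated with a toy userspace model. This is only a sketch, not kernel code: struct toy_qdisc, its one-slot stash (standing in for gso_skb) and all toy_* function names are hypothetical, and packets are plain integers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code. All names here are hypothetical. */
#define TOY_CAP   8
#define TOY_EMPTY (-1)

struct toy_qdisc {
	int ring[TOY_CAP];	/* the child's own queue */
	size_t head, tail;
	int stash;		/* peeked packet, standing in for gso_skb */
	bool has_stash;
	int qlen;		/* packets the child claims to hold */
};

static bool toy_enqueue(struct toy_qdisc *q, int pkt)
{
	if (q->tail - q->head == TOY_CAP)
		return false;	/* ring full */
	q->ring[q->tail++ % TOY_CAP] = pkt;
	q->qlen++;
	return true;
}

/* The child's own ->dequeue(): knows nothing about the stash. */
static int toy_dequeue(struct toy_qdisc *q)
{
	if (q->head == q->tail)
		return TOY_EMPTY;
	q->qlen--;
	return q->ring[q->head++ % TOY_CAP];
}

/* Peek emulation: dequeue one packet, park it in the stash and keep
 * counting it in qlen.
 */
static int toy_peek(struct toy_qdisc *q)
{
	if (!q->has_stash) {
		int pkt = toy_dequeue(q);

		if (pkt == TOY_EMPTY)
			return TOY_EMPTY;
		q->stash = pkt;
		q->has_stash = true;
		q->qlen++;
	}
	return q->stash;
}

/* What a peeking parent should call: stash first, else ->dequeue(). */
static int toy_dequeue_peeked(struct toy_qdisc *q)
{
	if (q->has_stash) {
		q->has_stash = false;
		q->qlen--;
		return q->stash;
	}
	return toy_dequeue(q);
}
```

In this model, peeking and then calling toy_dequeue() directly hands back the second packet and leaves the first one parked, with the ring empty but qlen still at 1. Calling toy_dequeue_peeked() instead returns the packets in order and leaves qlen at 0.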

With a qfq child the desync re-enters qfq_dequeue on an emptied aggregate
list and dereferences NULL, panicking from softirq on ordinary egress.
taprio reaches it on its own (root-only software path, all gates open);
multiq reaches it when a peeking parent such as tbf wraps it over a
non-work-conserving grandchild. Both need only CAP_NET_ADMIN.

Confirmed under KASAN: the unpatched arm panics, the patched arm is
clean, and a work-conserving-child control is clean. The reproducers and
splats for both are below; the per-patch changes are one line each.

taprio reproducer (self-triggering, no parent qdisc needed):

  ip link add dummy0 numtxqueues 4 type dummy; ip link set dummy0 up
  ip addr add 10.10.11.10/24 dev dummy0
  tc qdisc add dev dummy0 root handle 1: taprio num_tc 2 \
     map 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 1@1 \
     base-time 9000000000000000000 sched-entry S 03 200000 flags 0x0 clockid CLOCK_TAI
  tc qdisc replace dev dummy0 parent 1:1 handle 3: qfq
  tc class  add dev dummy0 classid 3:1 parent 3: qfq maxpkt 512 weight 1
  tc filter add dev dummy0 parent 3: protocol ip prio 1 matchall classid 3:1
  ping -c1 10.10.11.99 -I dummy0

[  903.769174] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] SMP KASAN NOPTI
[  903.769953] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[  903.770456] CPU: 7 UID: 0 PID: 16162 Comm: ping Not tainted 7.1.0-rc5 #1 PREEMPT(lazy)
[  903.771725] RIP: 0010:qfq_dequeue+0x362/0x1580 [sch_qfq]
[  903.777452] Call Trace:
[  903.778311]  taprio_dequeue_from_txq+0x383/0x680 [sch_taprio]
[  903.778685]  taprio_dequeue_tc_priority+0x19a/0x330 [sch_taprio]
[  903.779645]  taprio_dequeue+0xa6/0x330 [sch_taprio]
[  903.780299]  __qdisc_run+0x16c/0x1890
[  903.780854]  __dev_queue_xmit+0x1ece/0x3390
[  903.784109]  ip_finish_output2+0x571/0x1da0
[  903.785996]  ip_output+0x26c/0x4d0
[  903.789572]  ping_v4_sendmsg+0xd22/0x12b0
[  903.796118]  __x64_sys_sendto+0xe0/0x1c0
[  903.796612]  do_syscall_64+0xee/0x590
[  903.818669] Kernel panic - not syncing: Fatal exception in interrupt

multiq reproducer (needs a peeking parent over a stashing child; tbf
values chosen to force it to throttle):

  ip link add dummy0 numtxqueues 2 type dummy; ip link set dummy0 up
  ip addr add 10.10.11.10/24 dev dummy0
  tc qdisc add dev dummy0 root handle 1: tbf rate 88bit burst 1661b \
     peakrate 2257333 minburst 1024 limit 7b
  tc qdisc add dev dummy0 parent 1: handle 2: multiq
  for b in 1 2; do                          # qfq on every band
    tc qdisc  add dev dummy0 parent 2:$b handle 3$b: qfq
    tc class  add dev dummy0 classid 3$b:1 parent 3$b: qfq maxpkt 512 weight 1
    tc filter add dev dummy0 parent 3$b: protocol ip prio 1 matchall classid 3$b:1
  done
  ping -c12 10.10.11.99 -I dummy0

[ 1066.385097] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] SMP KASAN NOPTI
[ 1066.386385] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[ 1066.387227] CPU: 1 UID: 0 PID: 5357 Comm: ping Not tainted 7.1.0-rc5 #1 PREEMPT(lazy)
[ 1066.389183] RIP: 0010:qfq_dequeue+0x362/0x1580 [sch_qfq]
[ 1066.396316] Call Trace:
[ 1066.396768]  multiq_dequeue+0x163/0x360 [sch_multiq]
[ 1066.397885]  tbf_dequeue+0x6b9/0xf17 [sch_tbf]
[ 1066.398269]  __qdisc_run+0x16c/0x1890
[ 1066.399315]  __dev_queue_xmit+0x1ece/0x3390
[ 1066.403276]  ip_finish_output2+0x571/0x1da0
[ 1066.404818]  ip_output+0x26c/0x4d0
[ 1066.408620]  ping_v4_sendmsg+0xd22/0x12b0
[ 1066.415264]  __x64_sys_sendto+0xe0/0x1c0
[ 1066.416251]  do_syscall_64+0xee/0x590
[ 1066.441210] Kernel panic - not syncing: Fatal exception in interrupt

---
Bryam Vargas (2):
      net/sched: sch_taprio: Replace direct dequeue call with peek and qdisc_dequeue_peeked
      net/sched: sch_multiq: Replace direct dequeue call with peek and qdisc_dequeue_peeked

 net/sched/sch_multiq.c | 2 +-
 net/sched/sch_taprio.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
---
base-commit: 02f144fbb4c86c360495d33debe307cb46a57f95
change-id: 20260625-b4-disp-31bcb279-082e59a3aa36

Best regards,
-- 
Bryam Vargas <hexlabsecurity@proton.me>




* [PATCH 1/2] net/sched: sch_taprio: Replace direct dequeue call with peek and qdisc_dequeue_peeked
From: Bryam Vargas via B4 Relay @ 2026-06-25  9:51 UTC (permalink / raw)
  To: Vinicius Costa Gomes, Paolo Abeni, Jamal Hadi Salim, Jiri Pirko,
	Jakub Kicinski, David S. Miller, Eric Dumazet
  Cc: Simon Horman, netdev, Jarek Poplawski, Vladimir Oltean,
	linux-kernel
In-Reply-To: <20260625-b4-disp-31bcb279-v1-0-85c40b83c529@proton.me>

From: Bryam Vargas <hexlabsecurity@proton.me>

When taprio's software path peeks a non-work-conserving child qdisc, the
child stashes the peeked skb in its gso_skb; taprio_dequeue_from_txq()
then takes the packet with a direct child ->dequeue() call, which ignores
that stash, orphans the peeked skb and desyncs the child's qlen/backlog.
With a qfq child this re-enters the child on an emptied list and
dereferences NULL, panicking the kernel from softirq on ordinary egress.

Take the packet through qdisc_dequeue_peeked(), as sch_red and sch_sfb
now do. The helper returns the child's stashed skb first and is a no-op
when there is none, so a work-conserving child is unaffected and the
gated path now consumes the skb whose length was charged to the budget.

Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Cc: stable@vger.kernel.org
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
 net/sched/sch_taprio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 558987d9b977..299234a5f0fe 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -749,7 +749,7 @@ static struct sk_buff *taprio_dequeue_from_txq(struct Qdisc *sch, int txq,
 		return NULL;

 skip_peek_checks:
-	skb = child->ops->dequeue(child);
+	skb = qdisc_dequeue_peeked(child);
 	if (unlikely(!skb))
 		return NULL;

-- 
2.43.0


* [PATCH 2/2] net/sched: sch_multiq: Replace direct dequeue call with peek and qdisc_dequeue_peeked
From: Bryam Vargas via B4 Relay @ 2026-06-25  9:51 UTC (permalink / raw)
  To: Vinicius Costa Gomes, Paolo Abeni, Jamal Hadi Salim, Jiri Pirko,
	Jakub Kicinski, David S. Miller, Eric Dumazet
  Cc: Simon Horman, netdev, Jarek Poplawski, Vladimir Oltean,
	linux-kernel
In-Reply-To: <20260625-b4-disp-31bcb279-v1-0-85c40b83c529@proton.me>

From: Bryam Vargas <hexlabsecurity@proton.me>

multiq_dequeue() takes a packet from a band's child with a direct
->dequeue() call after multiq_peek() peeked it. When the child is
non-work-conserving the peek stashes the skb in the child's gso_skb, so
the direct dequeue returns a different skb and orphans the stash,
desyncing the child's qlen/backlog. With a qfq child reached through a
peeking parent (e.g. tbf) this re-enters the child on an emptied list and
dereferences NULL, panicking the kernel from softirq on ordinary egress.

Take the packet through qdisc_dequeue_peeked(), as sch_prio already does
and as sch_red and sch_sfb were just fixed to do. The helper is a no-op
when the child has no stash, so a work-conserving child is unaffected.

Fixes: 77be155cba4e ("pkt_sched: Add peek emulation for non-work-conserving qdiscs.")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
 net/sched/sch_multiq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index 4e465d11e3d7..a467dd122369 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -103,7 +103,7 @@ static struct sk_buff *multiq_dequeue(struct Qdisc *sch)
 		if (!netif_xmit_stopped(
 		    netdev_get_tx_queue(qdisc_dev(sch), q->curband))) {
 			qdisc = q->queues[q->curband];
-			skb = qdisc->dequeue(qdisc);
+			skb = qdisc_dequeue_peeked(qdisc);
 			if (skb) {
 				qdisc_bstats_update(sch, skb);
 				qdisc_qlen_dec(sch);

-- 
2.43.0




* RE: [PATCH net] net: libwx: fix VMDQ mask for 1-queue mode
From: Jiawen Wu @ 2026-06-25  9:44 UTC (permalink / raw)
  To: 'Larysa Zaremba'
  Cc: netdev, 'Mengyuan Lou', 'Andrew Lunn',
	'David S. Miller', 'Eric Dumazet',
	'Jakub Kicinski', 'Paolo Abeni',
	'Simon Horman', 'Kees Cook'
In-Reply-To: <ajz3QK96wKoLD4n4@soc-5CG4396X81.clients.intel.com>

On Thu, Jun 25, 2026 5:39 PM, Larysa Zaremba wrote:
> On Thu, Jun 25, 2026 at 05:08:51PM +0800, Jiawen Wu wrote:
> > In wx_set_vmdq_queues(), the VMDQ mask was not set for the devices not
> > support WX_FLAG_MULTI_64_FUNC, i.e., NGBE devices. A mask of 0 causes
> > __ALIGN_MASK(1, ~vmdq->mask) to return 0, which incorrectly sets
> > q_per_pool to 0 in wx_write_qde().
> >
> > Fix the VMDQ 1-queue mask to 0x7F then ensures that __ALIGN_MASK(1,
> > 0x7F) correctly evaluates to 1.
> 
> __ALIGN_MASK(1, 0x7F) evaluates to 0x80 (128), not to 1. __ALIGN_MASK(1, 0x7E)
> evaluates to 1. Maybe you need 0x7D for 2 queues and 0x7E for 1 queue?

Sorry, the commit log is wrong because the '~' is missing.
I meant to say that __ALIGN_MASK(1, ~0x7F) evaluates to 1.

> 
> >
> > Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
> > Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> > ---
> >  drivers/net/ethernet/wangxun/libwx/wx_lib.c  | 1 +
> >  drivers/net/ethernet/wangxun/libwx/wx_type.h | 1 +
> >  2 files changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> > index d042567b8128..814d88d2aee4 100644
> > --- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> > +++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> > @@ -1802,6 +1802,7 @@ static bool wx_set_vmdq_queues(struct wx *wx)
> >  			rss_i = 4;
> >  		}
> >  	} else {
> > +		vmdq_m = WX_VMDQ_1Q_MASK;
> >  		/* double check we are limited to maximum pools */
> >  		vmdq_i = min_t(u16, 8, vmdq_i);
> >
> > diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
> > index c7befe4cdfe9..65e3e55db1cf 100644
> > --- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
> > +++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
> > @@ -486,6 +486,7 @@ enum WX_MSCA_CMD_value {
> >
> >  #define WX_VMDQ_4Q_MASK              0x7C
> >  #define WX_VMDQ_2Q_MASK              0x7E
> > +#define WX_VMDQ_1Q_MASK              0x7F
> >
> >  /****************** Manageablility Host Interface defines ********************/
> >  #define WX_HI_MAX_BLOCK_BYTE_LENGTH  256 /* Num of bytes in range */
> > --
> > 2.51.0
> >
> 



* [PATCH net] net: airoha: dma map xmit frags with skb_frag_dma_map()
From: Lorenzo Bianconi @ 2026-06-25  9:42 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: linux-arm-kernel, linux-mediatek, netdev, Lorenzo Bianconi

Map xmit skb fragments using skb_frag_dma_map() instead of
dma_map_single(skb_frag_address()). skb_frag_address() relies on
page_address() to obtain a kernel virtual address, which is not
guaranteed to work for all page types (e.g. highmem pages or
user-pinned pages from MSG_ZEROCOPY).
skb_frag_dma_map() maps the fragment directly via its struct page and
offset through dma_map_page(), avoiding the need for a kernel virtual
address entirely.
Introduce an enum airoha_dma_map_type to track how each queue entry was
mapped (single vs page), so that the matching unmap function is called
on completion and in error paths.

Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/airoha/airoha_eth.c | 61 ++++++++++++++++++++------------
 drivers/net/ethernet/airoha/airoha_eth.h |  7 ++++
 2 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 932b3a3df2e5..1caf6766f2c0 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -944,6 +944,25 @@ static void airoha_qdma_wake_netdev_txqs(struct airoha_queue *q)
 	q->txq_stopped = false;
 }
 
+static void airoha_unmap_xmit_buf(struct airoha_eth *eth,
+				  struct airoha_queue_entry *e)
+{
+	switch (e->dma_type) {
+	case AIROHA_DMA_MAP_PAGE:
+		dma_unmap_page(eth->dev, e->dma_addr, e->dma_len,
+			       DMA_TO_DEVICE);
+		break;
+	case AIROHA_DMA_MAP_SINGLE:
+		dma_unmap_single(eth->dev, e->dma_addr, e->dma_len,
+				 DMA_TO_DEVICE);
+		break;
+	case AIROHA_DMA_UNMAPPED:
+	default:
+		break;
+	}
+	e->dma_type = AIROHA_DMA_UNMAPPED;
+}
+
 static int airoha_qdma_tx_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct airoha_tx_irq_queue *irq_q;
@@ -1006,9 +1025,7 @@ static int airoha_qdma_tx_napi_poll(struct napi_struct *napi, int budget)
 		skb = e->skb;
 		e->skb = NULL;
 
-		dma_unmap_single(eth->dev, e->dma_addr, e->dma_len,
-				 DMA_TO_DEVICE);
-		e->dma_addr = 0;
+		airoha_unmap_xmit_buf(eth, e);
 		list_add_tail(&e->list, &q->tx_list);
 
 		WRITE_ONCE(desc->msg0, 0);
@@ -1177,12 +1194,10 @@ static void airoha_qdma_tx_cleanup(struct airoha_qdma *qdma)
 			struct airoha_qdma_desc *desc = &q->desc[j];
 			struct sk_buff *skb = e->skb;
 
-			if (!e->dma_addr)
+			if (e->dma_type == AIROHA_DMA_UNMAPPED)
 				continue;
 
-			dma_unmap_single(qdma->eth->dev, e->dma_addr,
-					 e->dma_len, DMA_TO_DEVICE);
-			e->dma_addr = 0;
+			airoha_unmap_xmit_buf(qdma->eth, e);
 			list_add_tail(&e->list, &q->tx_list);
 
 			WRITE_ONCE(desc->ctrl, 0);
@@ -2193,8 +2208,8 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 	struct netdev_queue *txq;
 	struct airoha_queue *q;
 	LIST_HEAD(tx_list);
+	dma_addr_t addr;
 	int i = 0, qid;
-	void *data;
 	u16 index;
 	u8 fport;
 
@@ -2250,24 +2265,22 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 		return NETDEV_TX_BUSY;
 	}
 
-	len = skb_headlen(skb);
-	data = skb->data;
-
 	e = list_first_entry(&q->tx_list, struct airoha_queue_entry,
 			     list);
+	len = skb_headlen(skb);
+	addr = dma_map_single(netdev->dev.parent, skb->data, len,
+			      DMA_TO_DEVICE);
+	if (unlikely(dma_mapping_error(netdev->dev.parent, addr)))
+		goto error_unlock;
+
+	e->dma_type = AIROHA_DMA_MAP_SINGLE;
 	index = e - q->entry;
 
 	while (true) {
 		struct airoha_qdma_desc *desc = &q->desc[index];
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-		dma_addr_t addr;
 		u32 val;
 
-		addr = dma_map_single(netdev->dev.parent, data, len,
-				      DMA_TO_DEVICE);
-		if (unlikely(dma_mapping_error(netdev->dev.parent, addr)))
-			goto error_unmap;
-
 		list_move_tail(&e->list, &tx_list);
 		e->skb = i == nr_frags - 1 ? skb : NULL;
 		e->dma_addr = addr;
@@ -2291,8 +2304,13 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 		if (++i == nr_frags)
 			break;
 
-		data = skb_frag_address(frag);
 		len = skb_frag_size(frag);
+		addr = skb_frag_dma_map(netdev->dev.parent, frag, 0, len,
+					DMA_TO_DEVICE);
+		if (unlikely(dma_mapping_error(netdev->dev.parent, addr)))
+			goto error_unmap;
+
+		e->dma_type = AIROHA_DMA_MAP_PAGE;
 	}
 	q->queued += i;
 
@@ -2313,11 +2331,8 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 
 error_unmap:
-	list_for_each_entry(e, &tx_list, list) {
-		dma_unmap_single(netdev->dev.parent, e->dma_addr, e->dma_len,
-				 DMA_TO_DEVICE);
-		e->dma_addr = 0;
-	}
+	list_for_each_entry(e, &tx_list, list)
+		airoha_unmap_xmit_buf(dev->eth, e);
 	list_splice(&tx_list, &q->tx_list);
 error_unlock:
 	spin_unlock_bh(&q->lock);
diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
index d7ff8c5200e2..2765244d937c 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.h
+++ b/drivers/net/ethernet/airoha/airoha_eth.h
@@ -170,12 +170,19 @@ enum trtcm_param {
 #define TRTCM_TOKEN_RATE_MASK			GENMASK(23, 6)
 #define TRTCM_TOKEN_RATE_FRACTION_MASK		GENMASK(5, 0)
 
+enum airoha_dma_map_type {
+	AIROHA_DMA_UNMAPPED,
+	AIROHA_DMA_MAP_SINGLE,
+	AIROHA_DMA_MAP_PAGE,
+};
+
 struct airoha_queue_entry {
 	union {
 		void *buf;
 		struct {
 			struct list_head list;
 			struct sk_buff *skb;
+			enum airoha_dma_map_type dma_type;
 		};
 	};
 	dma_addr_t dma_addr;

---
base-commit: 232c4ca2343d1181cbfc061f9856d9591e397579
change-id: 20260625-airoha-eth-skb_frag_dma_map-bcccd5d6e4b1

Best regards,
-- 
Lorenzo Bianconi <lorenzo@kernel.org>



* Re: [PATCH net] net: libwx: fix VMDQ mask for 1-queue mode
From: Larysa Zaremba @ 2026-06-25  9:39 UTC (permalink / raw)
  To: Jiawen Wu
  Cc: netdev, Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Kees Cook
In-Reply-To: <60D88ADD3E295420+20260625090851.539640-1-jiawenwu@trustnetic.com>

On Thu, Jun 25, 2026 at 05:08:51PM +0800, Jiawen Wu wrote:
> In wx_set_vmdq_queues(), the VMDQ mask was not set for the devices not
> support WX_FLAG_MULTI_64_FUNC, i.e., NGBE devices. A mask of 0 causes
> __ALIGN_MASK(1, ~vmdq->mask) to return 0, which incorrectly sets
> q_per_pool to 0 in wx_write_qde().
> 
> Fix the VMDQ 1-queue mask to 0x7F then ensures that __ALIGN_MASK(1,
> 0x7F) correctly evaluates to 1.

__ALIGN_MASK(1, 0x7F) evaluates to 0x80 (128), not to 1. __ALIGN_MASK(1, 0x7E)
evaluates to 1. Maybe you need 0x7D for 2 queues and 0x7E for 1 queue?
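
For reference, the arithmetic can be checked in user space. This is only a
sketch: it assumes the usual kernel definition of __ALIGN_MASK(x, mask) as
(((x) + (mask)) & ~(mask)), re-declared locally under the hypothetical name
ALIGN_MASK_SKETCH. The helper names and the u16 mask width are assumptions
made for illustration, not code taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the assumed kernel definition of __ALIGN_MASK(). */
#define ALIGN_MASK_SKETCH(x, mask) (((x) + (mask)) & ~(mask))

/* Hypothetical helper: mirrors the __ALIGN_MASK(1, ~vmdq->mask) expression. */
int q_per_pool_for(uint16_t vmdq_mask)
{
	/* vmdq_mask is promoted to int before '~' is applied. */
	return ALIGN_MASK_SKETCH(1, ~vmdq_mask);
}

/* Hypothetical helper: the same expression with the '~' left out. */
int align_without_negation(uint16_t vmdq_mask)
{
	return ALIGN_MASK_SKETCH(1, vmdq_mask);
}
```

With these helpers, q_per_pool_for() gives 0 for a mask of 0, 1 for 0x7F,
2 for 0x7E and 4 for 0x7C, while align_without_negation(0x7F) gives 0x80.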

> 
> Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> ---
>  drivers/net/ethernet/wangxun/libwx/wx_lib.c  | 1 +
>  drivers/net/ethernet/wangxun/libwx/wx_type.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> index d042567b8128..814d88d2aee4 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> @@ -1802,6 +1802,7 @@ static bool wx_set_vmdq_queues(struct wx *wx)
>  			rss_i = 4;
>  		}
>  	} else {
> +		vmdq_m = WX_VMDQ_1Q_MASK;
>  		/* double check we are limited to maximum pools */
>  		vmdq_i = min_t(u16, 8, vmdq_i);
>  
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
> index c7befe4cdfe9..65e3e55db1cf 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
> @@ -486,6 +486,7 @@ enum WX_MSCA_CMD_value {
>  
>  #define WX_VMDQ_4Q_MASK              0x7C
>  #define WX_VMDQ_2Q_MASK              0x7E
> +#define WX_VMDQ_1Q_MASK              0x7F
>  
>  /****************** Manageablility Host Interface defines ********************/
>  #define WX_HI_MAX_BLOCK_BYTE_LENGTH  256 /* Num of bytes in range */
> -- 
> 2.51.0
> 

^ permalink raw reply

* Re: [PATCH net] net: gianfar: use of_irq_get()
From: Simon Horman @ 2026-06-25  9:36 UTC (permalink / raw)
  To: rosenp
  Cc: Simon Horman, netdev, claudiu.manoil, andrew+netdev, davem,
	edumazet, kuba, pabeni, afleming, linux-kernel
In-Reply-To: <20260624032122.986986-1-rosenp@gmail.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: gianfar: use of_irq_get()

This patch switches gfar_parse_group() from irq_of_parse_and_map() to
of_irq_get() and widens the irq field in struct gfar_irqinfo from
unsigned int to int. The stated goals are to fix a leak from a missing
irq_dispose_mapping() and to enable -EPROBE_DEFER propagation.

> diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
> index 3271de5844f8..17a0d0787ed2 100644
> --- a/drivers/net/ethernet/freescale/gianfar.c
> +++ b/drivers/net/ethernet/freescale/gianfar.c
> @@ -514,15 +514,15 @@ static int gfar_parse_group(struct device_node *np,
>  	if (!grp->regs)
>  		return -ENOMEM;
>
> -	gfar_irq(grp, TX)->irq = irq_of_parse_and_map(np, 0);
> +	gfar_irq(grp, TX)->irq = of_irq_get(np, 0);
>
>  	/* If we aren't the FEC we have multiple interrupts */
>  	if (model && strcasecmp(model, "FEC")) {
> -		gfar_irq(grp, RX)->irq = irq_of_parse_and_map(np, 1);
> -		gfar_irq(grp, ER)->irq = irq_of_parse_and_map(np, 2);
> -		if (!gfar_irq(grp, TX)->irq ||
> -		    !gfar_irq(grp, RX)->irq ||
> -		    !gfar_irq(grp, ER)->irq)
> +		gfar_irq(grp, RX)->irq = of_irq_get(np, 1);
> +		gfar_irq(grp, ER)->irq = of_irq_get(np, 2);
> +		if (gfar_irq(grp, TX)->irq < 0 ||
> +		    gfar_irq(grp, RX)->irq < 0 ||
> +		    gfar_irq(grp, ER)->irq < 0)
>  			return -EINVAL;
>  	}

[High]
Does this change miss the 0-return mapping failure case?

The kernel-doc on of_irq_get() in drivers/of/irq.c says it returns:

    Linux IRQ number on success, or 0 on the IRQ mapping failure, or
    -EPROBE_DEFER ..., or error code in case of any other failure.

The old test rejected an irq value of 0 with !gfar_irq(grp, *)->irq,
which caught both irq_of_parse_and_map()'s and of_irq_get()'s
zero-on-failure. The new test only rejects values strictly less than
zero, so a 0 returned from of_irq_get() is now stored as the IRQ and
later flows into request_irq()/free_irq() during probe and unbind.

Should the test be <= 0, or split so that negative errors are
propagated and 0 still triggers a failure?

[High]
Is -EPROBE_DEFER lost here?

The commit message states:

    Changed irq to an int as required by the of_irq_get API as it
    supports -EPROBE_DEFER.

However, the only error path is:

    if (gfar_irq(grp, TX)->irq < 0 ||
        gfar_irq(grp, RX)->irq < 0 ||
        gfar_irq(grp, ER)->irq < 0)
            return -EINVAL;

of_irq_get() returns -EPROBE_DEFER when irq_find_host() cannot yet
locate the parent interrupt controller. Collapsing every negative
return into -EINVAL prevents the driver core from retrying the probe
once the interrupt controller appears.

Should the actual error from of_irq_get() be propagated, for example:

    int ret = gfar_irq(grp, TX)->irq;
    if (ret < 0)
            return ret;

so that -EPROBE_DEFER reaches gfar_of_init() and then gfar_probe()?
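
Taken together, the two points above amount to a three-way split on the
return value. A minimal user-space sketch of that split follows, assuming the
kernel-internal value 517 for EPROBE_DEFER (it is not part of the user-space
errno.h) and using a hypothetical helper name; whether 0 should map to
-EINVAL or to another code is for the driver author to decide.

```c
#include <assert.h>
#include <errno.h>

/* Assumed value of the kernel-internal EPROBE_DEFER code. */
#define EPROBE_DEFER_SKETCH 517

/*
 * Hypothetical helper: turn an of_irq_get()-style return value into a
 * probe status. Negative values are passed through unchanged so that a
 * deferral request survives, 0 (mapping failure) becomes -EINVAL, and a
 * positive IRQ number means success.
 */
int irq_status_sketch(int irq)
{
	if (irq < 0)
		return irq;
	if (irq == 0)
		return -EINVAL;
	return 0;
}
```

A caller would apply this to each of the TX, RX and ER values and return the
first non-zero status.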

[Medium]
Can the commit message description be revisited?

The message states:

    of_irq_get() differs from irq_of_parse_and_map() in that the latter
    requires calling irq_dispose_mapping() when done, which is missing
    in the driver. Meaning it leaks memory.

    No need to map it anyway. Just need the value stored in the irq
    field.

Looking at of_irq_get() in drivers/of/irq.c:

    int of_irq_get(struct device_node *dev, int index)
    {
            ...
            rc = irq_create_of_mapping(&oirq);
            ...
    }

of_irq_get() ends with the same irq_create_of_mapping() call that
irq_of_parse_and_map() makes, so the resulting virq mapping is created
identically by both APIs.

If the previous code leaked, does the new code not leak in the same
way? And does the Fixes: b31a1d8b4151 tag accurately describe this
change, given that it may cause stable-tree selection on the basis of a
leak that does not appear to exist?

^ permalink raw reply

* Re: [PATCH v29 4/5] sfc: obtain and map cxl range using devm_cxl_probe_mem
From: Alejandro Lucero Palau @ 2026-06-25  9:31 UTC (permalink / raw)
  To: Dan Williams (nvidia), alejandro.lucero-palau, linux-cxl, netdev,
	dan.j.williams, edward.cree, davem, kuba, pabeni, edumazet,
	dave.jiang
  Cc: Edward Cree
In-Reply-To: <6a3c55eea91d0_f12301008f@djbw-dev.notmuch>


On 6/24/26 23:10, Dan Williams (nvidia) wrote:
> alejandro.lucero-palau@ wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Use core API for safely obtain the CXL range linked to an HDM committed
>> by the BIOS. Map such a range for being used as the ctpio buffer.
>>
>> A potential user space action through sysfs unbinding or core cxl
>> modules remove will trigger sfc driver device detachment, with that case
>> not racing with this mapping as this is done during driver probe and
>> therefore protected with device lock against those user space actions.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>> Acked-by: Edward Cree <ecree.xilinx@gmail.com>
>> ---
>>   drivers/net/ethernet/sfc/efx.c     |  2 ++
>>   drivers/net/ethernet/sfc/efx_cxl.c | 23 +++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.h |  3 +++
>>   3 files changed, 28 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
>> index 61cbb6cfc360..3806cd3dd7f4 100644
>> --- a/drivers/net/ethernet/sfc/efx.c
>> +++ b/drivers/net/ethernet/sfc/efx.c
>> @@ -984,6 +984,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>>   	efx_fini_io(efx);
>>   
>>   	probe_data = container_of(efx, struct efx_probe_data, efx);
>> +	efx_cxl_exit(probe_data);
>>   
>>   	pci_dbg(efx->pci_dev, "shutdown successful\n");
>>   
>> @@ -1242,6 +1243,7 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>>   	return 0;
>>   
>>    fail3:
>> +	efx_cxl_exit(probe_data);
>>   	efx_fini_io(efx);
>>    fail2:
>>   	efx_fini_struct(efx);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 18b535b3ea40..3e7c950f83e9 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -18,6 +18,7 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
>>   {
>>   	struct efx_nic *efx = &probe_data->efx;
>>   	struct pci_dev *pci_dev = efx->pci_dev;
>> +	struct range cxl_pio_range;
>>   	struct efx_cxl *cxl;
>>   	u16 dvsec;
>>   	int rc;
>> @@ -73,9 +74,31 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
>>   		return -ENODEV;
>>   	}
>>   
>> +	cxl->cxlmd = devm_cxl_probe_mem(&cxl->cxlds, &cxl_pio_range);
>> +	if (IS_ERR(cxl->cxlmd)) {
>> +		pci_err(pci_dev, "CXL accel memdev creation failed\n");
>> +		return PTR_ERR(cxl->cxlmd);
>> +	}
>> +
>> +	cxl->ctpio_cxl = ioremap_wc(cxl_pio_range.start,
>> +				    range_len(&cxl_pio_range));
>> +	if (!cxl->ctpio_cxl) {
>> +		pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
>> +			&cxl_pio_range);
>> +		return -ENOMEM;
>> +	}
>> +
>>   	probe_data->cxl = cxl;
>>   
>>   	return 0;
>>   }
>>   
>> +void efx_cxl_exit(struct efx_probe_data *probe_data)
>> +{
> If you are going to have an explicit efx_cxl_exit() then I would also
> add an explicit unregistration of the memdev.


This is necessary for undoing the ioremap_wc() mapping. Nothing else happens
there because everything else relies on devm ...


I could change the ioremap_wc call to devm_ioremap_wc, but


> This would also fix the
> Sashiko report about pci_disable_device() running while the cxl_memdev
> is still registered. Unfortunately, mixing devm and explicit unwind is
> always fraught.


I do not think there is a problem here. The cxl core does not need what 
a type2 driver can do regarding PCI BAR mappings, or at least it is not 
the case for sfc.

Any action through sysfs cxl will go through cxl core and the only thing 
linked to the type device is the CXL registers which are mapped inside 
cxl_map_component_regs() and those are managed resources.


So, I cannot see why this change is needed. If it is really necessary,
please describe the problem in more detail.


It looks like you need reasons for delaying this further ...


>
> Let me know if this passes your testing, and I can send it out as a
> standalone patch. You could also use it to unwind if the ioremap()
> fails.


You did not read my comments on v28 ...


I changed efx_cxl_init() to make the driver probe fail if cxl is
supported and enabled but the cxl initialization fails, including a failing
ioremap_wc(). What you propose, explicitly undoing the cxl
initialization bits, has the same outcome: the device is detached from the driver.



^ permalink raw reply

* [PATCH net] xfrm: fix stack-out-of-bounds in xfrm_tmpl_resolve_one
From: Eric Dumazet @ 2026-06-25  9:24 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet,
	syzbot+0ac4d84afe1066a1f3e9, Steffen Klassert, Herbert Xu

syzbot reported a stack-out-of-bounds read in xfrm_state_find()
which flows from xfrm_tmpl_resolve_one().

The issue occurs when a policy has a mix of family-changing templates
(e.g. BEET or IPTFS) and transport templates. If an optional
family-changing template is skipped because no state is found, the
current family of the flow (`family`) is not updated. The subsequent
transport template is then evaluated using the unchanged family (e.g.
AF_INET), but it uses the template's `encap_family` (e.g. AF_INET6)
to perform the state lookup.

This causes `xfrm_state_find()` to interpret the IPv4 flow addresses
(allocated on the stack as `struct flowi4` in `raw_sendmsg` or
`udp_sendmsg`) as IPv6 addresses (`xfrm_address_t`), leading to a
16-byte read from the 4-byte stack variables, triggering KASAN.

Fix this by tracking the active family of the flow (`cur_family`)
during template resolution:
1. Initialize `cur_family` to the flow's original family.
2. For transport templates, verify that `tmpl->encap_family` matches
   `cur_family`. If they mismatch, abort with -EINVAL.
3. When a template that can change the family (tunnel, beet, iptfs) is
   successfully resolved, update `cur_family` to `tmpl->encap_family`.
4. If a template is skipped (optional), `cur_family` remains unchanged.

This prevents mismatched transport lookups and makes the resolution
robust against any family-transition gaps.
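
The four rules above can be modelled in user space. This is a simplified
sketch, not the kernel code: every type and name below is hypothetical, the
state lookup is replaced by a pre-set boolean, and -EAGAIN for a missing
mandatory state is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_mode { MODE_TRANSPORT, MODE_TUNNEL, MODE_BEET, MODE_IPTFS };
enum sketch_family { FAM_INET = 2, FAM_INET6 = 10 };

struct sketch_tmpl {
	enum sketch_mode mode;
	enum sketch_family encap_family;
	bool optional;
	bool state_found; /* stands in for the result of the state lookup */
};

static bool changes_family(enum sketch_mode mode)
{
	return mode == MODE_TUNNEL || mode == MODE_BEET || mode == MODE_IPTFS;
}

/* Walk the templates while tracking the family the flow currently has. */
int resolve_sketch(const struct sketch_tmpl *tmpl, size_t n,
		   enum sketch_family family)
{
	enum sketch_family cur_family = family;	/* rule 1 */
	size_t i;

	for (i = 0; i < n; i++) {
		/* rule 2: a transport template must match the current family */
		if (!changes_family(tmpl[i].mode) &&
		    tmpl[i].encap_family != cur_family)
			return -EINVAL;

		if (tmpl[i].state_found) {
			/* rule 3: only a resolved template moves the family */
			if (changes_family(tmpl[i].mode))
				cur_family = tmpl[i].encap_family;
			continue;
		}

		/* rule 4: a skipped optional template leaves cur_family alone */
		if (!tmpl[i].optional)
			return -EAGAIN;
	}
	return 0;
}
```

In this model, an optional BEET template to FAM_INET6 with no state, followed
by a FAM_INET6 transport template, returns -EINVAL for a FAM_INET flow instead
of performing the mismatched lookup.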

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+0ac4d84afe1066a1f3e9@syzkaller.appspotmail.com
Closes: https://www.spinics.net/lists/netdev/msg1200923.html
Assisted-by: Jetski:gemini-3.1-pro-preview
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
---
 net/xfrm/xfrm_policy.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 7ef861a0e8231b63ece816b5237b03fa1367ccf9..95e30670303d34598ba164dff59a65c14489d5f3 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2485,6 +2485,7 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, const struct flowi *fl,
 	int i, error;
 	xfrm_address_t *daddr = xfrm_flowi_daddr(fl, family);
 	xfrm_address_t *saddr = xfrm_flowi_saddr(fl, family);
+	unsigned short cur_family = family;
 	xfrm_address_t tmp;

 	for (nx = 0, i = 0; i < policy->xfrm_nr; i++) {
@@ -2511,6 +2512,11 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, const struct flowi *fl,
 					goto fail;
 				local = &tmp;
 			}
+		} else {
+			if (tmpl->encap_family != cur_family) {
+				error = -EINVAL;
+				goto fail;
+			}
 		}

 		x = xfrm_state_find(remote, local, fl, tmpl, policy, &error,
@@ -2526,6 +2532,11 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, const struct flowi *fl,
 			xfrm[nx++] = x;
 			daddr = remote;
 			saddr = local;
+			if (tmpl->mode == XFRM_MODE_TUNNEL ||
+			    tmpl->mode == XFRM_MODE_IPTFS ||
+			    tmpl->mode == XFRM_MODE_BEET) {
+				cur_family = tmpl->encap_family;
+			}
 			continue;
 		}
 		if (x) {
-- 
2.55.0.rc0.799.gd6f94ed593-goog

^ permalink raw reply related

* RE: [PATCH net v2] tipc: fix out-of-bounds read in broadcast Gap ACK blocks
From: Tung Quang Nguyen @ 2026-06-25  9:23 UTC (permalink / raw)
  To: Samuel Page
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev@vger.kernel.org,
	tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org, Jon Maloy
In-Reply-To: <20260624135629.727262-1-sam@bynar.io>

>Subject: [PATCH net v2] tipc: fix out-of-bounds read in broadcast Gap ACK
>blocks
>
>A broadcast PROTOCOL/STATE_MSG can carry a Gap ACK blocks record in its
>data area. tipc_get_gap_ack_blks() only verifies that the record's len field is
>self-consistent with its ugack_cnt/bgack_cnt counts (sz == struct_size(p, gacks,
>ugack_cnt + bgack_cnt)); it does not check that the record actually fits in the
>message data area, msg_data_sz().
>
>The unicast caller tipc_link_proto_rcv() bounds it ("if (glen > dlen) break;"), but
>the broadcast caller tipc_bcast_sync_rcv() discards the returned size, so
>tipc_link_advance_transmq() copies the record off the receive skb with an
>attacker-controlled count:
>
>	this_ga = kmemdup(ga, struct_size(ga, gacks, ga->bgack_cnt),
>			  GFP_ATOMIC);
>
>A TIPC neighbour that negotiated TIPC_GAP_ACK_BLOCK triggers it with one
>ordinary broadcast STATE_MSG (msg_bc_ack_invalid() clear), sized so its data
>area is short, carrying a Gap ACK record with len = 0x400, bgack_cnt = 0xff and
>ugack_cnt = 0. len then equals struct_size(p, gacks, 255), so the consistency
>check passes and ga is non-NULL; kmemdup() reads struct_size(ga, gacks, 255)
>= 1024 bytes out of the much smaller skb:
>
>  BUG: KASAN: slab-out-of-bounds in kmemdup_noprof+0x48/0x60
>  Read of size 1024 at addr ffff0000c7030d38 by task poc864/69
>  Call trace:
>   kmemdup_noprof+0x48/0x60
>   tipc_link_advance_transmq+0x86c/0xb80
>   tipc_link_bc_ack_rcv+0x19c/0x1e0
>   tipc_bcast_sync_rcv+0x1c4/0x2c4
>   tipc_rcv+0x85c/0x1340
>   tipc_l2_rcv_msg+0xac/0x104
>  The buggy address belongs to the object at ffff0000c7030d00
>   which belongs to the cache skbuff_small_head of size 704
>  The buggy address is located 56 bytes inside of
>   allocated 704-byte region [ffff0000c7030d00, ffff0000c7030fc0)
>
>The copied-out bytes are subsequently consumed as gap/ack values, but the
>read is already out of bounds at the kmemdup() regardless of how they are
>used.
>
>The unicast STATE path drops such a message: "if (glen > dlen) break;"
>skips the rest of STATE_MSG handling and the skb is freed. Make the broadcast
>path drop it too. tipc_bcast_sync_rcv() now bounds the record against
>msg_data_sz() and, when it does not fit, reports it back through
>tipc_node_bc_sync_rcv() to tipc_rcv() so the skb is discarded rather than
>processed. ga is not cleared on this path: ga == NULL already means "legacy
>peer without Selective ACK", a distinct legitimate state.
>
>Fixes: d7626b5acff9 ("tipc: introduce Gap ACK blocks for broadcast link")
>Cc: stable@vger.kernel.org
>Assisted-by: Bynario AI
>Signed-off-by: Samuel Page <sam@bynar.io>
>---
>v2, per review of v1 [1]:
> - v1 cleared 'ga' on an oversized Gap ACK record, which let the malformed
>   STATE message be processed as a legacy (no Selective ACK) one rather than
>   dropped.  v2 drops it instead, matching the unicast STATE path:
>   tipc_bcast_sync_rcv() reports the bad record through a bool output
>   parameter, propagated by tipc_node_bc_sync_rcv() to tipc_rcv(), which
>   discards the skb.
> - v1 touched only net/tipc/bcast.c; v2 also touches net/tipc/{bcast.h,node.c}.
>
>[1] https://lore.kernel.org/netdev/20260623134137.3641275-1-sam@bynar.io/
>
>For reference, an earlier thread proposed validating inside
>tipc_get_gap_ack_blks():
>
>https://lore.kernel.org/netdev/1316452e465e9a96fce44ec15130a14f3872149f.1775809727.git.caoruide123@gmail.com/
>
> net/tipc/bcast.c | 22 ++++++++++++++--------
> net/tipc/bcast.h |  2 +-
> net/tipc/node.c  | 13 ++++++++++---
> 3 files changed, 25 insertions(+), 12 deletions(-)
>
>diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
>index 76a1585d3f6b..08637c3c9db0 100644
>--- a/net/tipc/bcast.c
>+++ b/net/tipc/bcast.c
>@@ -497,11 +497,12 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
>  */
> int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
> 			struct tipc_msg *hdr,
>-			struct sk_buff_head *retrq)
>+			struct sk_buff_head *retrq, bool *valid)
> {
> 	struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
> 	struct tipc_gap_ack_blks *ga;
> 	struct sk_buff_head xmitq;
>+	u16 glen;

Move this variable declaration to the bottom to follow reverse xmas tree style.

> 	int rc = 0;
>
> 	__skb_queue_head_init(&xmitq);
>@@ -510,13 +511,18 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
> 	if (msg_type(hdr) != STATE_MSG) {
> 		tipc_link_bc_init_rcv(l, hdr);
> 	} else if (!msg_bc_ack_invalid(hdr)) {
>-		tipc_get_gap_ack_blks(&ga, l, hdr, false);
>-		if (!sysctl_tipc_bc_retruni)
>-			retrq = &xmitq;
>-		rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
>-					  msg_bc_gap(hdr), ga, &xmitq,
>-					  retrq);
>-		rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
>+		glen = tipc_get_gap_ack_blks(&ga, l, hdr, false);
>+		if (glen > msg_data_sz(hdr)) {
>+			/* Malformed Gap ACK blocks; caller drops the msg */
>+			*valid = false;
>+		} else {
>+			if (!sysctl_tipc_bc_retruni)
>+				retrq = &xmitq;
>+			rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
>+						  msg_bc_gap(hdr), ga, &xmitq,
>+						  retrq);
>+			rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
>+		}
> 	}
> 	tipc_bcast_unlock(net);
>
>diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
>index 2d9352dc7b0e..55d17b5413e1 100644
>--- a/net/tipc/bcast.h
>+++ b/net/tipc/bcast.h
>@@ -97,7 +97,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
> 			struct tipc_msg *hdr);
> int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
> 			struct tipc_msg *hdr,
>-			struct sk_buff_head *retrq);
>+			struct sk_buff_head *retrq, bool *valid);
> int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg,
> 			struct tipc_link *bcl);
> int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
>diff --git a/net/tipc/node.c b/net/tipc/node.c
>index 97aa970a0d83..2887f94ee28f 100644
>--- a/net/tipc/node.c
>+++ b/net/tipc/node.c
>@@ -1831,12 +1831,13 @@ static void tipc_node_mcast_rcv(struct tipc_node *n)
> }
>
> static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
>-				  int bearer_id, struct sk_buff_head *xmitq)
>+				  int bearer_id, struct sk_buff_head *xmitq,
>+				  bool *valid)
> {
> 	struct tipc_link *ucl;
> 	int rc;
>
>-	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq);
>+	rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq, valid);

'valid' needs to be checked after this call. Then, return immediately if it is false.
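
As a rough illustration of the suggested early return, here is a user-space
model. Everything in it is hypothetical: the two functions only stand in for
tipc_bcast_sync_rcv() and tipc_node_bc_sync_rcv(), and the return values are
placeholders rather than real link event bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for tipc_bcast_sync_rcv(): flags an oversized record. */
int bcast_sync_rcv_sketch(size_t glen, size_t data_sz, bool *valid)
{
	if (glen > data_sz) {
		*valid = false;	/* record does not fit in the data area */
		return 0;
	}
	return 1;		/* placeholder for returned link event bits */
}

/*
 * Hypothetical stand-in for tipc_node_bc_sync_rcv(). Returns true only
 * when the result of the call above was acted on.
 */
bool node_bc_sync_rcv_sketch(size_t glen, size_t data_sz, bool *valid)
{
	int rc = bcast_sync_rcv_sketch(glen, data_sz, valid);

	if (!*valid)
		return false;	/* malformed message: skip all further handling */

	return rc != 0;
}
```

The point of the early return is that none of the rc handling that follows
runs for a message the caller is about to discard.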

>
> 	if (rc & TIPC_LINK_DOWN_EVT) {
> 		tipc_node_reset_links(n);
>@@ -2140,12 +2141,18 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
>
> 	/* Ensure broadcast reception is in synch with peer's send state */
> 	if (unlikely(usr == LINK_PROTOCOL)) {
>+		bool valid = true;
>+
> 		if (unlikely(skb_linearize(skb))) {
> 			tipc_node_put(n);
> 			goto discard;
> 		}
> 		hdr = buf_msg(skb);
>-		tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq);
>+		tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq, &valid);
>+		if (!valid) {
>+			tipc_node_put(n);
>+			goto discard;
>+		}
> 	} else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) {
> 		tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr);
> 	}
>
>base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
>--
>2.54.0


^ permalink raw reply

* Re: [PATCH net] nfc: nci: fix uninit-value in the RF discover/activated NTF handlers
From: Simon Horman @ 2026-06-25  9:18 UTC (permalink / raw)
  To: sam
  Cc: Simon Horman, david, davem, edumazet, kuba, pabeni, oe-linux-nfc,
	netdev, linux-kernel, stable
In-Reply-To: <20260623234126.214667-1-sam@bynar.io>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
nfc: nci: fix uninit-value in the RF discover/activated NTF handlers

This zero-initialises the on-stack nci_rf_discover_ntf and
nci_rf_intf_activated_ntf structs in the RF discover and interface
activated NTF handlers so that the technology- and activation-specific
unions read back as zero when the corresponding length fields are zero.

Link: https://lore.kernel.org/netdev/20260623172109.1105965-2-horms@kernel.org/

>  - activated: with rf_tech_specific_params_len == 0 the same union is read
>    via nci_target_auto_activated(); with activation_params_len == 0 the
>    activation_params union is read by nci_store_ats_nfc_iso_dep() into
>    ndev->target_ats.

[Low]
Does the third scenario in this paragraph match the code?

Looking at nci_store_ats_nfc_iso_dep():

    static int nci_store_ats_nfc_iso_dep(struct nci_dev *ndev,
                         const struct nci_rf_intf_activated_ntf *ntf)
    {
        ndev->target_ats_len = 0;

        if (ntf->activation_params_len <= 0)
            return NCI_STATUS_OK;

        if (ntf->activation_params.nfca_poll_iso_dep.rats_res_len > NFC_ATS_MAXSIZE) {
        ...

The function returns NCI_STATUS_OK before any access to the
activation_params union when activation_params_len is 0, so the union
is not read in that case.

The sibling nci_store_general_bytes_nfc_dep() has the same early return
on activation_params_len <= 0.

The two rf_tech_specific_params_len == 0 cases listed earlier in the
commit message do match the code (nci_add_new_protocol() and
nci_target_auto_activated() unconditionally pass
&ntf.rf_tech_specific_params even when rf_tech_specific_params_len is
0), and the KMSAN trace quoted in the commit message only mentions the
nci_add_new_protocol() path.

The ntf = {} change still looks correct and necessary for the
rf_tech_specific_params union. Could the third bullet be dropped or
adjusted so the commit message reflects only the paths that actually
read uninitialised bytes?

^ permalink raw reply

* [PATCH net] net: libwx: fix VMDQ mask for 1-queue mode
From: Jiawen Wu @ 2026-06-25  9:08 UTC (permalink / raw)
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Larysa Zaremba,
	Kees Cook, Jiawen Wu

In wx_set_vmdq_queues(), the VMDQ mask was not set for the devices not
support WX_FLAG_MULTI_64_FUNC, i.e., NGBE devices. A mask of 0 causes
__ALIGN_MASK(1, ~vmdq->mask) to return 0, which incorrectly sets
q_per_pool to 0 in wx_write_qde().

Fix the VMDQ 1-queue mask to 0x7F then ensures that __ALIGN_MASK(1,
0x7F) correctly evaluates to 1.

Fixes: c52d4b898901 ("net: libwx: Redesign flow when sriov is enabled")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 drivers/net/ethernet/wangxun/libwx/wx_lib.c  | 1 +
 drivers/net/ethernet/wangxun/libwx/wx_type.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index d042567b8128..814d88d2aee4 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -1802,6 +1802,7 @@ static bool wx_set_vmdq_queues(struct wx *wx)
 			rss_i = 4;
 		}
 	} else {
+		vmdq_m = WX_VMDQ_1Q_MASK;
 		/* double check we are limited to maximum pools */
 		vmdq_i = min_t(u16, 8, vmdq_i);
 
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index c7befe4cdfe9..65e3e55db1cf 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -486,6 +486,7 @@ enum WX_MSCA_CMD_value {
 
 #define WX_VMDQ_4Q_MASK              0x7C
 #define WX_VMDQ_2Q_MASK              0x7E
+#define WX_VMDQ_1Q_MASK              0x7F
 
 /****************** Manageablility Host Interface defines ********************/
 #define WX_HI_MAX_BLOCK_BYTE_LENGTH  256 /* Num of bytes in range */
-- 
2.51.0


^ permalink raw reply related

* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
From: Askar Safin @ 2026-06-25  9:03 UTC (permalink / raw)
  To: val
  Cc: akpm, axboe, brauner, david, dhowells, fuse-devel, hch, jack,
	joannelkoong, linux-api, linux-fsdevel, linux-kernel, linux-mm,
	miklos, netdev, patches, pfalcato, rostedt, safinaskar, torvalds,
	viro, willy
In-Reply-To: <83f05c55-efba-4bf5-abfe-d2ab0819e904@packett.cool>

Val Packett <val@packett.cool>:
> speaking of fuse_dev_splice……_write actually, this series has broken 
> xdg-document-portal!
> 
> https://github.com/flatpak/xdg-desktop-portal/issues/2026
> 
> Specifically what happens is that the EINVAL is returned due to oh.len 
> != nbytes:
> 
> fuse_dev_do_write: oh.len 16400 != nbytes 15526
> 
> (where 16400 == 16384 (read len) + 16, 15526 == 15510 (file len) + 16)
> 
> After reverting the series, there is no error because oh.len 
> becomes 15526 too.

Please test the v2 version of my fixes:
https://lore.kernel.org/lkml/20260625083409.3769242-1-safinaskar@gmail.com/

This should fix this bug.

-- 
Askar Safin

^ permalink raw reply

* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
From: Askar Safin @ 2026-06-25  8:53 UTC (permalink / raw)
  To: avagin
  Cc: akpm, alexander, axboe, bernd, brauner, criu, david, dhowells,
	fuse-devel, hch, jack, joannelkoong, linux-api, linux-fsdevel,
	linux-kernel, linux-mm, miklos, netdev, patches, pfalcato,
	rostedt, safinaskar, torvalds, val, viro, willy
In-Reply-To: <CANaxB-xUrLQYGiRJZc4Boi+KX=0TJSWymErNovANVko20fMDVA@mail.gmail.com>

Andrei Vagin <avagin@gmail.com>:
> On Wed, Jun 24, 2026 at 12:12 AM Askar Safin <safinaskar@gmail.com> wrote:
> > Does CRIU actually rely on ability to do SPLICE_F_NONBLOCK vmsplice into
> > named fifos? Or this is merely a test?
> 
> Yes, it does.

That is, CRIU relies on that named fifo behavior? Okay, I just sent the
v2 version of my fixes. The patchset contains a fix for named fifos.

Please test whether it fixes the named fifo problem.

> I already explained that this isn't just a perfomance degradation, it
> actually breaks the pre-dump mechanism in CRIU. vmsplice is invoked from
> our parasite code within the context of a user process, where execution
> speed is critical. A heavy performance penalty completely invalidates
> the pre-dump logic, making the feature useless.

This is very unfortunate. But I still want to remove vmsplice.

> At a minimum, we may need to consider a deprecation plan where vmsplice
> with SPLICE_F_GIFT triggers a warning for a few releases before these
> changes are applied. Alternatively, we could introduce the proposed
> behavior alongside a sysctl to fall back to the old behavior and explicitly
> state that this fallback path will be completely deprecated in a future kernel
> version.

My patches change not only SPLICE_F_GIFT behavior, but also vmsplice
behavior in general.

Let other developers decide what to do (i.e. do nothing, remove
vmsplice now, or implement some deprecation scheme).

-- 
Askar Safin

^ permalink raw reply

* Re: [PATCH v2] net: meth: Fix skb allocation failure handling in RX init
From: Simon Horman @ 2026-06-25  8:47 UTC (permalink / raw)
  To: haoxiang_li2024
  Cc: Simon Horman, andrew+netdev, davem, edumazet, kuba, pabeni,
	pavan.chebbi, netdev, linux-kernel
In-Reply-To: <20260624031940.2690439-1-haoxiang_li2024@163.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev

Simon says:

Hi Haoxiang,

I think that the pre-existing issues flagged below do not need
to be considered in the context of this patch. But I am wondering
if you could take a look over the others.

---
net: meth: Fix skb allocation failure handling in RX init

meth_init_rx_ring() does not check the return value of alloc_skb().
If the allocation fails, the NULL skb is passed to skb_reserve() and
then dereferenced through skb->head.

> --- a/drivers/net/ethernet/sgi/meth.c
> +++ b/drivers/net/ethernet/sgi/meth.c
> @@ -228,6 +228,9 @@ static int meth_init_rx_ring(struct meth_private *priv)
>  
>  	for (i = 0; i < RX_RING_ENTRIES; i++) {
>  		priv->rx_skbs[i] = alloc_skb(METH_RX_BUFF_SIZE, 0);
> +		if (!priv->rx_skbs[i])
> +			goto err_free_skbs;
> +
>  		/* 8byte status vector + 3quad padding + 2byte padding,
>  		 * to put data on 64bit aligned boundary */
>  		skb_reserve(priv->rx_skbs[i],METH_RX_HEAD);

[ ... ]

> @@ -240,6 +243,17 @@ static int meth_init_rx_ring(struct meth_private *priv)
>  	}
>          priv->rx_write = 0;
>  	return 0;
> +
> +err_free_skbs:
> +	while (i--) {
> +		dma_unmap_single(&priv->pdev->dev, priv->rx_ring_dmas[i],
> +				 METH_RX_BUFF_SIZE, DMA_FROM_DEVICE);

[Severity: Critical]
During the allocation loop, each successfully mapped DMA address is pushed
immediately to the hardware via the mace->eth.rx_fifo register. Unmapping
them here frees the memory, but the hardware FIFO still retains the physical
addresses.

When meth_init_rx_ring() returns -ENOMEM to meth_tx_timeout(), the error is
ignored and the RX DMA engine is enabled anyway:

meth_tx_timeout()
    ...
    meth_init_rx_ring(priv);
    priv->dma_ctrl |= METH_DMA_TX_EN | METH_DMA_RX_EN | METH_DMA_RX_INT_EN;

Will this cause the hardware to DMA incoming packet data into the unmapped
and potentially reallocated memory?

> +		priv->rx_ring[i] = 0;

[Severity: High]
If the hardware remains active as described above, will the RX interrupt
handler dereference this NULL pointer?

When the hardware receives a packet, it raises an RX interrupt and meth_rx()
attempts to process it:

meth_rx()
    ...
    status = priv->rx_ring[priv->rx_write]->status.raw;

Since the pointer at this index was cleared in the error path, this looks
like it will cause a kernel panic.

> +		priv->rx_ring_dmas[i] = 0;
> +		kfree_skb(priv->rx_skbs[i]);
> +		priv->rx_skbs[i] = NULL;

[Severity: Critical]
Does this partial cleanup lead to a double free of socket buffers?

When meth_init_rx_ring() fails at index N, this loop zeroes out elements 0
through N-1. The elements from N through 63 retain their dangling pointers
from the previous execution of meth_free_rx_ring(), which fails to clear SKB
pointers after freeing them.

Since meth_tx_timeout() ignores the error and leaves the device running, the
next time meth_free_rx_ring() is called, it iterates over all 64 elements
again. Will this call kfree_skb() on the dangling pointers in indices N
through 63?
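
The double-free concern above can be illustrated with a small userspace
model. This is only a sketch, not driver code: struct fake_buf, fake_free(),
ring_free(), ring_refill() and max_release_count() are hypothetical names,
and a release counter stands in for kfree_skb() so that nothing is actually
freed twice. It assumes, as the review states, that the teardown path leaves
its pointers in place and that the caller ignores the refill error.

```c
#include <assert.h>
#include <stddef.h>

#define RING_ENTRIES 64

/* Hypothetical stand-in for an skb: counts how often it was released. */
struct fake_buf {
	int free_count;
};

/* Models kfree_skb(): records the release instead of freeing memory. */
static void fake_free(struct fake_buf *buf)
{
	if (buf)
		buf->free_count++;
}

/* Releases every slot; clears the slot only when 'clearing' is set. */
static void ring_free(struct fake_buf **ring, int clearing)
{
	for (size_t i = 0; i < RING_ENTRIES; i++) {
		fake_free(ring[i]);
		if (clearing)
			ring[i] = NULL;
	}
}

/*
 * Refill that fails at slot fail_at, then unwinds slots 0..fail_at-1
 * with the same while (i--) idiom as the patch.
 */
static int ring_refill(struct fake_buf **ring, struct fake_buf *pool,
		       size_t fail_at)
{
	size_t i;

	for (i = 0; i < RING_ENTRIES; i++) {
		if (i == fail_at)
			goto err_unwind;
		ring[i] = &pool[i];
	}
	return 0;

err_unwind:
	while (i--) {
		fake_free(ring[i]);
		ring[i] = NULL;
	}
	return -1;
}

/*
 * Runs teardown, a failing refill, and teardown again, then returns the
 * highest release count seen among the original buffers.
 */
static int max_release_count(int clearing, size_t fail_at)
{
	struct fake_buf old_pool[RING_ENTRIES] = { { 0 } };
	struct fake_buf new_pool[RING_ENTRIES] = { { 0 } };
	struct fake_buf *ring[RING_ENTRIES];
	int max = 0;

	for (size_t i = 0; i < RING_ENTRIES; i++)
		ring[i] = &old_pool[i];

	ring_free(ring, clearing);                  /* first teardown */
	(void)ring_refill(ring, new_pool, fail_at); /* error ignored by caller */
	ring_free(ring, clearing);                  /* next teardown */

	for (size_t i = 0; i < RING_ENTRIES; i++)
		if (old_pool[i].free_count > max)
			max = old_pool[i].free_count;
	return max;
}
```

In this model, max_release_count(0, 10) reports that the buffers in slots 10
through 63 are released twice, while max_release_count(1, 10) reports a
single release per buffer, because a teardown that clears its slots makes
the second pass a no-op.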

> +	}
> +	return -ENOMEM;
>  }

[Severity: High]
This isn't a bug introduced by this patch, but dma_map_single() is called in
meth_init_rx_ring() without checking for mapping errors:

meth_init_rx_ring()
    ...
    priv->rx_ring_dmas[i] = dma_map_single(&priv->pdev->dev,
                                           priv->rx_ring[i],
                                           METH_RX_BUFF_SIZE,
                                           DMA_FROM_DEVICE);
    mace->eth.rx_fifo = priv->rx_ring_dmas[i];

If the mapping fails, an error cookie is passed directly to the hardware.
Could this lead to IOMMU faults or memory corruption when DMA bounce buffers
are exhausted?

[Severity: High]
This is a pre-existing issue, but there appears to be a race condition due
to lockless processing in meth_rx().

The interrupt handler drops the spinlock before processing the ring:

meth_rx()
    ...
    spin_unlock_irqrestore(&priv->meth_lock, flags);
    ...
    while (priv->rx_write != fifo_rptr) {

Concurrently, meth_tx_timeout() can acquire the lock and destroy the RX rings,
falsely assuming it is protected:

meth_tx_timeout()
    ...
    spin_lock_irqsave(&priv->meth_lock,flags);
    meth_reset(dev);
    ...
    meth_free_rx_ring(priv);

If meth_free_rx_ring() frees the SKBs while meth_rx() is still executing the
lockless while loop, could this result in a use-after-free or NULL pointer
dereference?


* [syzbot] [net?] KASAN: stack-out-of-bounds Read in xfrm_state_find (6)
From: syzbot @ 2026-06-25  8:49 UTC (permalink / raw)
  To: davem, edumazet, herbert, horms, kuba, linux-kernel, netdev,
	pabeni, steffen.klassert, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    a975094bf98c Merge tag 'exfat-for-7.2-rc1' of git://git.ke..
git tree:       bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14c0eba1580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9519196b0a0d47bc
dashboard link: https://syzkaller.appspot.com/bug?extid=0ac4d84afe1066a1f3e9
compiler:       Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/25ab9553b7ce/disk-a975094b.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/5498f6d5131a/vmlinux-a975094b.xz
kernel image: https://storage.googleapis.com/syzbot-assets/90c6fca52c8c/bzImage-a975094b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0ac4d84afe1066a1f3e9@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: stack-out-of-bounds in jhash2 include/linux/jhash.h:138 [inline]
BUG: KASAN: stack-out-of-bounds in __xfrm6_addr_hash net/xfrm/xfrm_hash.h:16 [inline]
BUG: KASAN: stack-out-of-bounds in __xfrm6_daddr_saddr_hash net/xfrm/xfrm_hash.h:29 [inline]
BUG: KASAN: stack-out-of-bounds in __xfrm_dst_hash net/xfrm/xfrm_hash.h:95 [inline]
BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x590d/0x5ec0 net/xfrm/xfrm_state.c:1421
Read of size 4 at addr ffffc900061a7908 by task syz.8.4714/27393

CPU: 1 UID: 0 PID: 27393 Comm: syz.8.4714 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/09/2026
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_address_description+0x55/0x1e0 mm/kasan/report.c:378
 print_report+0x58/0x70 mm/kasan/report.c:482
 kasan_report+0x117/0x150 mm/kasan/report.c:595
 jhash2 include/linux/jhash.h:138 [inline]
 __xfrm6_addr_hash net/xfrm/xfrm_hash.h:16 [inline]
 __xfrm6_daddr_saddr_hash net/xfrm/xfrm_hash.h:29 [inline]
 __xfrm_dst_hash net/xfrm/xfrm_hash.h:95 [inline]
 xfrm_state_find+0x590d/0x5ec0 net/xfrm/xfrm_state.c:1421
 xfrm_tmpl_resolve_one net/xfrm/xfrm_policy.c:2513 [inline]
 xfrm_tmpl_resolve net/xfrm/xfrm_policy.c:2564 [inline]
 xfrm_resolve_and_create_bundle+0x7f3/0x3070 net/xfrm/xfrm_policy.c:2862
 xfrm_bundle_lookup net/xfrm/xfrm_policy.c:3097 [inline]
 xfrm_lookup_with_ifid+0x576/0x1b40 net/xfrm/xfrm_policy.c:3228
 xfrm_lookup net/xfrm/xfrm_policy.c:3327 [inline]
 xfrm_lookup_route+0x3c/0x1c0 net/xfrm/xfrm_policy.c:3338
 raw_sendmsg+0x110d/0x1a20 net/ipv4/raw.c:628
 sock_sendmsg_nosec+0x10e/0x180 net/socket.c:776
 __sock_sendmsg net/socket.c:790 [inline]
 ____sys_sendmsg+0x54e/0x850 net/socket.c:2684
 ___sys_sendmsg+0x2a5/0x360 net/socket.c:2738
 __sys_sendmsg net/socket.c:2770 [inline]
 __do_sys_sendmsg net/socket.c:2775 [inline]
 __se_sys_sendmsg net/socket.c:2773 [inline]
 __x64_sys_sendmsg+0x1b1/0x290 net/socket.c:2773
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7feeebb9ce59
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007feeeca26028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007feeebe15fa0 RCX: 00007feeebb9ce59
RDX: 000000000400c894 RSI: 0000200000000900 RDI: 0000000000000007
RBP: 00007feeebc32e6f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007feeebe16038 R14: 00007feeebe15fa0 R15: 00007fff5f12fac8
 </TASK>

The buggy address belongs to stack of task syz.8.4714/27393
 and is located at offset 328 in frame:
 raw_sendmsg+0x0/0x1a20 net/ipv4/raw.c:909

This frame has 5 objects:
 [32, 104) 'opt_copy_u'
 [144, 200) 'ipc'
 [240, 248) 'rt'
 [272, 328) 'fl4'
 [368, 392) 'rfv'
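
The frame layout above can be read mechanically. The following is a minimal
sketch that classifies a frame offset against the listed objects; the struct
and function names are hypothetical and only the offsets are taken from the
report. It shows that offset 328 lies in no object and sits exactly at the
exclusive end of 'fl4'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One stack object as printed by KASAN: [start, end) and its name. */
struct frame_obj {
	size_t start;      /* inclusive offset within the frame */
	size_t end;        /* exclusive offset within the frame */
	const char *name;
};

/* Object list copied from the report, in ascending offset order. */
static const struct frame_obj raw_sendmsg_frame[] = {
	{  32, 104, "opt_copy_u" },
	{ 144, 200, "ipc" },
	{ 240, 248, "rt" },
	{ 272, 328, "fl4" },
	{ 368, 392, "rfv" },
};

#define FRAME_OBJS (sizeof(raw_sendmsg_frame) / sizeof(raw_sendmsg_frame[0]))

/* Returns the object that contains offset, or NULL for a redzone offset. */
static const struct frame_obj *obj_containing(size_t offset)
{
	for (size_t i = 0; i < FRAME_OBJS; i++)
		if (offset >= raw_sendmsg_frame[i].start &&
		    offset < raw_sendmsg_frame[i].end)
			return &raw_sendmsg_frame[i];
	return NULL;
}

/* Returns the last object that ends at or before offset, or NULL. */
static const struct frame_obj *obj_preceding(size_t offset)
{
	const struct frame_obj *found = NULL;

	for (size_t i = 0; i < FRAME_OBJS; i++)
		if (raw_sendmsg_frame[i].end <= offset)
			found = &raw_sendmsg_frame[i];
	return found;
}
```

For offset 328, obj_containing() returns NULL and obj_preceding() returns
'fl4' with a distance of 0 bytes past its end, so the 4-byte read starts in
the redzone directly after that 56-byte object.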

The buggy address belongs to a 8-page vmalloc region starting at 0xffffc900061a0000 allocated at copy_process+0x81b/0x42e0 kernel/fork.c:2110
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8b157
memcg:ffff88807cfe5e02
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 0000000000000000 ffffea00022c55c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff ffff88807cfe5e02
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x29c2(GFP_NOWAIT|__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_ZERO), pid 27358, tgid 27357 (syz.3.4705), ts 776904415615, free_ts 773990322016
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1f9/0x250 mm/page_alloc.c:1859
 prep_new_page mm/page_alloc.c:1867 [inline]
 get_page_from_freelist+0x21fa/0x2270 mm/page_alloc.c:3946
 __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5304
 alloc_pages_mpol+0x212/0x380 mm/mempolicy.c:2490
 alloc_frozen_pages_noprof mm/mempolicy.c:2561 [inline]
 alloc_pages_noprof+0xac/0x2a0 mm/mempolicy.c:2581
 vm_area_alloc_pages mm/vmalloc.c:3667 [inline]
 __vmalloc_area_node mm/vmalloc.c:3892 [inline]
 __vmalloc_node_range_noprof+0x795/0x1730 mm/vmalloc.c:4082
 __vmalloc_node_noprof+0xc2/0x100 mm/vmalloc.c:4143
 alloc_thread_stack_node kernel/fork.c:359 [inline]
 dup_task_struct+0x28e/0x850 kernel/fork.c:929
 copy_process+0x81b/0x42e0 kernel/fork.c:2110
 kernel_clone+0x2d7/0x940 kernel/fork.c:2746
 __do_sys_clone kernel/fork.c:2887 [inline]
 __se_sys_clone kernel/fork.c:2871 [inline]
 __x64_sys_clone+0x1b6/0x230 kernel/fork.c:2871
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 27224 tgid 27224 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 __free_pages_prepare mm/page_alloc.c:1406 [inline]
 __free_frozen_pages+0xc1f/0xd10 mm/page_alloc.c:2950
 __slab_free+0x274/0x2c0 mm/slub.c:5672
 qlink_free mm/kasan/quarantine.c:163 [inline]
 qlist_free_all+0x99/0x100 mm/kasan/quarantine.c:179
 kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286
 __kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350
 kasan_slab_alloc include/linux/kasan.h:253 [inline]
 slab_post_alloc_hook mm/slub.c:4610 [inline]
 slab_alloc_node mm/slub.c:4939 [inline]
 __do_kmalloc_node mm/slub.c:5333 [inline]
 __kmalloc_noprof+0x312/0x750 mm/slub.c:5347
 _kmalloc_noprof include/linux/slab.h:973 [inline]
 tomoyo_realpath_from_path+0xef/0x640 security/tomoyo/realpath.c:251
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_check_open_permission+0x229/0x470 security/tomoyo/file.c:776
 security_file_open+0xa9/0x240 security/security.c:2739
 do_dentry_open+0x4a0/0x1380 fs/open.c:924
 vfs_open+0x3b/0x340 fs/open.c:1079
 do_open fs/namei.c:4700 [inline]
 path_openat+0x2e44/0x3830 fs/namei.c:4859
 do_file_open+0x23e/0x4a0 fs/namei.c:4888
 do_sys_openat2+0x115/0x200 fs/open.c:1395
 do_sys_open fs/open.c:1401 [inline]
 __do_sys_openat fs/open.c:1417 [inline]
 __se_sys_openat fs/open.c:1412 [inline]
 __x64_sys_openat+0x138/0x170 fs/open.c:1412
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94

Memory state around the buggy address:
 ffffc900061a7800: 00 00 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 00 00
 ffffc900061a7880: 00 f2 f2 f2 f2 f2 00 f2 f2 f2 00 00 00 00 00 00
>ffffc900061a7900: 00 f2 f2 f2 f2 f2 00 00 00 f3 f3 f3 f3 f3 f3 f3
                      ^
 ffffc900061a7980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc900061a7a00: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
==================================================================
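
The caret position in the shadow dump can be checked by hand. The sketch
below assumes generic KASAN's layout of one shadow byte per 8 bytes of
memory, which is consistent with the 0x80 stride between dump rows of 16
shadow bytes each. The helper names are hypothetical; the row contents are
copied from the line marked '>' above.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_BYTES  8u   /* memory bytes per shadow byte (assumed) */
#define SHADOW_PER_ROW 16u  /* shadow bytes printed per dump row */
#define ROW_BYTES      (GRANULE_BYTES * SHADOW_PER_ROW)

/* Shadow bytes of the dump row starting at ffffc900061a7900. */
static const uint8_t marked_row[SHADOW_PER_ROW] = {
	0x00, 0xf2, 0xf2, 0xf2, 0xf2, 0xf2, 0x00, 0x00,
	0x00, 0xf3, 0xf3, 0xf3, 0xf3, 0xf3, 0xf3, 0xf3,
};

/* Address at which the dump row covering addr begins. */
static uint64_t row_base(uint64_t addr)
{
	return addr & ~(uint64_t)(ROW_BYTES - 1);
}

/* Index of the shadow byte within its row that covers addr. */
static unsigned int row_index(uint64_t addr)
{
	return (unsigned int)((addr % ROW_BYTES) / GRANULE_BYTES);
}

/* Shadow value for addr; only meaningful for addresses in the marked row. */
static uint8_t shadow_in_marked_row(uint64_t addr)
{
	return marked_row[row_index(addr)];
}
```

For the reported address ffffc900061a7908 this yields row base
ffffc900061a7900, index 1, and shadow value f2, which is the byte under the
caret. In generic KASAN, f2 marks a redzone between stack objects. The five
f2 bytes cover 40 bytes, matching the gap between the end of 'fl4' at offset
328 and the start of 'rfv' at offset 368 in the frame listing.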


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [PATCH v2 0/7] vmsplice: fix some problems in my previous vmsplice patchset
From: David Hildenbrand (Arm) @ 2026-06-25  8:46 UTC (permalink / raw)
  To: Askar Safin, linux-fsdevel, Christian Brauner, Alexander Viro,
	Jan Kara
  Cc: linux-kernel, linux-mm, linux-api, netdev, fuse-devel,
	Linus Torvalds, Matthew Wilcox, Jens Axboe, Christoph Hellwig,
	David Howells, Andrew Morton, Pedro Falcato, Miklos Szeredi,
	Andy Lutomirski, Collin Funk, David Laight, Stefan Metzmacher,
	The 8472, Willy Tarreau, Joanne Koong, Val Packett, Andrei Vagin,
	patches
In-Reply-To: <20260625083409.3769242-1-safinaskar@gmail.com>

On 6/25/26 10:34, Askar Safin wrote:
> This patchset is for VFS. Of course, it depends on my previous vmsplice
> patchset ( https://lore.kernel.org/all/20260531010107.1953702-1-safinaskar@gmail.com/ ).
> 
> I fix some problems in my previous patchset.

I think we concluded that we cannot rip out vmsplice that way at this point, and
I suspect that Christian will drop that topic branch from -next after -rc1.

-- 
Cheers,

David


* Re: [PATCH net-next v5 1/4] dpll: add DPLL_PIN_TYPE_INT_NCO pin type
From: Jiri Pirko @ 2026-06-25  8:45 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: Ivan Vecera, Kubalewski, Arkadiusz, Jakub Kicinski,
	netdev@vger.kernel.org, Jiri Pirko, David S. Miller,
	Donald Hunter, Eric Dumazet, Schmidt, Michal, Paolo Abeni,
	Vaananen, Pasi, Oros, Petr, Prathosh Satish, Simon Horman,
	linux-kernel@vger.kernel.org
In-Reply-To: <0f8fe4e0-72d8-48a6-96ad-d1650919d2df@linux.dev>

Wed, Jun 24, 2026 at 05:57:35PM +0200, vadim.fedorenko@linux.dev wrote:
>On 19/06/2026 18:07, Ivan Vecera wrote:

[...]

>> 
>> Proposal:
>> 1) new pin capability
>>     - name: state-connected-override
>>     - doc: pin state can be changed to connected in any DPLL mode
>> 
>> 2) new NCO pin type to switch the DPLL to NCO mode when connected
>> 
>> 3) automatic-only DPLL
>>     - should expose NCO pin with state-connected-override capability
>> 
>> 4) manual-only DPLL
>>    - does not need to expose NCO pin with state-connected-override cap
>> 
>> 5) dual-mode DPLL (supporting mode switching)
>>    - if it exposes NCO pin with the override cap then it has to support
>>      switching to NCO mode directly from AUTO mode
>>    - if does not expose NCO pin with the override cap then a user MUST
>>      switch the DPLL mode from AUTO to MANUAL to be able to make NCO
>>      pin connected to the DPLL
>
>I still don't see good reasoning for the pin. Even this sentence says
>"DPLL mode" which keeps me thinking whether we have to move it to a
>special DPLL mode. All these items look like overcomplication of a
>simple function of the device itself. DPLL can be either in the closed
>loop when one of the pins provides a signal to align to, or in the open
>loop meaning that software can control adjustments to phase/frequency.
>But it's definitely a property of the device, and it's not a pin in any
>kind...

Vadim, did you see this:
https://lore.kernel.org/all/aiftnkuT9IP31qUm@FV6GYCPJ69/ ?
I very thoroughly described what you are questioning. There is 0 reply
to that email so perhaps you missed it? IDK.




* RE: [PATCH net 4/4] selftests: bonding: add a test for VLAN propagation over a bonded real device
From: Loktionov, Aleksandr @ 2026-06-25  8:41 UTC (permalink / raw)
  To: Jakub Kicinski, davem@davemloft.net
  Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com,
	andrew+netdev@lunn.ch, horms@kernel.org, jv@jvosburgh.net,
	sdf@fomichev.me, dongchenchen2@huawei.com, idosch@nvidia.com,
	n05ec@lzu.edu.cn, yuantan098@gmail.com, kuniyu@google.com,
	nb@tipi-net.de, dtatulea@nvidia.com
In-Reply-To: <20260624182018.2445732-5-kuba@kernel.org>



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Wednesday, June 24, 2026 8:20 PM
> To: davem@davemloft.net
> Cc: netdev@vger.kernel.org; edumazet@google.com; pabeni@redhat.com;
> andrew+netdev@lunn.ch; horms@kernel.org; jv@jvosburgh.net;
> sdf@fomichev.me; dongchenchen2@huawei.com; idosch@nvidia.com;
> n05ec@lzu.edu.cn; yuantan098@gmail.com; kuniyu@google.com;
> nb@tipi-net.de; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>;
> dtatulea@nvidia.com; Jakub Kicinski <kuba@kernel.org>
> Subject: [PATCH net 4/4] selftests: bonding: add a test for VLAN propagation over a bonded real device
> 
> Add a regression test for the VLAN notifier handling that the
> netdev_work deferral fixed.
> 
> A VLAN's real device propagates its UP/DOWN, MTU and feature changes
> onto the VLANs stacked on top of it. This used to be done
> synchronously from the real device's notifier and deadlocked when the
> real device was brought up while enslaved to a bond (instance lock
> held across NETDEV_UP) and the VLAN on top was itself a bond member:
> the synchronous propagation re-entered the stack and took the same
> instance lock again.
> 
> The test covers both halves:
>  - that the deferred UP/DOWN, MTU and feature propagation actually lands on
>    the VLAN (link state and MTU use an ops-locked dummy, i.e. the deferral
>    path; features use veth, which exports vlan_features to inherit), and
>  - that the deadlock-prone topology - a VLAN on a dummy, with the VLAN and
>    the dummy each enslaved to a different bond - can be built without
>    hanging.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  .../selftests/drivers/net/bonding/Makefile    |   1 +
>  .../drivers/net/bonding/bond_vlan_real_dev.sh | 180 ++++++++++++++++++
>  2 files changed, 181 insertions(+)
>  create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh
> 
> diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile
> index be130bf585a4..6364ca02642d 100644
> --- a/tools/testing/selftests/drivers/net/bonding/Makefile
> +++ b/tools/testing/selftests/drivers/net/bonding/Makefile
> @@ -13,6 +13,7 @@ TEST_PROGS := \
>  	bond_options.sh \
>  	bond_passive_lacp.sh \
>  	bond_stacked_header_parse.sh \
> +	bond_vlan_real_dev.sh \
>  	dev_addr_lists.sh \
>  	mode-1-recovery-updelay.sh \
>  	mode-2-recovery-updelay.sh \
> diff --git a/tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh b/tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh
> new file mode 100755
> index 000000000000..542d9ffc4819
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/bonding/bond_vlan_real_dev.sh
> @@ -0,0 +1,180 @@

...

> +exit "$EXIT_STATUS"
> --
> 2.54.0


Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>



* [PATCH v2 7/7] pipe: set FMODE_NOWAIT for named FIFOs
From: Askar Safin @ 2026-06-25  8:34 UTC (permalink / raw)
  To: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara
  Cc: linux-kernel, linux-mm, linux-api, netdev, fuse-devel,
	Linus Torvalds, Matthew Wilcox, Jens Axboe, Christoph Hellwig,
	David Howells, Andrew Morton, David Hildenbrand, Pedro Falcato,
	Miklos Szeredi, Andy Lutomirski, Collin Funk, David Laight,
	Stefan Metzmacher, The 8472, Willy Tarreau, Joanne Koong,
	Val Packett, Andrei Vagin, patches
In-Reply-To: <20260625083409.3769242-1-safinaskar@gmail.com>

CRIU relies on the ability to call vmsplice(SPLICE_F_NONBLOCK) on named FIFOs.

Signed-off-by: Askar Safin <safinaskar@gmail.com>
---
 fs/pipe.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/pipe.c b/fs/pipe.c
index c0ccf21b9..a8e9b4459 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -1156,6 +1156,12 @@ static int fifo_open(struct inode *inode, struct file *filp)
 	/* We can only do regular read/write on fifos */
 	stream_open(inode, filp);
 
+	/*
+	 * CRIU relies on ability to do vmsplice(SPLICE_F_NONBLOCK)
+	 * on named FIFOs.
+	 */
+	filp->f_mode |= FMODE_NOWAIT;
+
 	switch (filp->f_mode & (FMODE_READ | FMODE_WRITE)) {
 	case FMODE_READ:
 	/*
-- 
2.47.3

