Netdev List


* Re: [PATCH bpf-next v5 0/3] bpf, sockmap: reject a packet-modifying SK_SKB stream parser
From: patchwork-bot+netdevbpf @ 2026-06-26 12:27 UTC
  To: Sechang Lim
  Cc: ast, daniel, andrii, john.fastabend, jakub, eddyz87, edumazet,
	kuniyu, pabeni, willemb, davem, kuba, martin.lau, song,
	yonghong.song, jolsa, memxor, horms, shuah, jiayuan.chen,
	bobby.eshleman, netdev, bpf, linux-kselftest, linux-kernel
In-Reply-To: <20260620024423.4141004-1-rhkrqnwk98@gmail.com>

Hello:

This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Sat, 20 Jun 2026 02:44:15 +0000 you wrote:
> A BPF_PROG_TYPE_SK_SKB stream parser runs on strparser's message head,
> which can chain skbs through frag_list. A parser that resizes the skb
> frees the frag_list segments that strparser still tracks through
> skb_nextp, leading to a use-after-free.
> 
> A stream parser is only meant to measure the next message, not to modify
> the packet, so reject a packet-modifying parser at attach time.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v5,1/3] selftests/bpf: don't modify the skb in the strparser parser prog
    https://git.kernel.org/bpf/bpf/c/22a0cc10dacb
  - [bpf-next,v5,2/3] bpf, sockmap: reject a packet-modifying SK_SKB stream parser
    https://git.kernel.org/bpf/bpf/c/31e2f36d3821
  - [bpf-next,v5,3/3] selftests/bpf: test rejection of a packet-modifying SK_SKB stream parser
    https://git.kernel.org/bpf/bpf/c/05fb34384d20

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net 1/7] xsk: fix buffer leak in xsk_drop_skb() for AF_XDP multi-buffer Tx
From: Jason Xing @ 2026-06-26 12:24 UTC
  To: Larysa Zaremba
  Cc: Maciej Fijalkowski, netdev, bpf, magnus.karlsson, stfomichev,
	kuba, pabeni, horms, bjorn, Jason Xing
In-Reply-To: <aj5egH5eDwfsY30-@soc-5CG4396X81.clients.intel.com>

On Fri, Jun 26, 2026 at 7:12 PM Larysa Zaremba <larysa.zaremba@intel.com> wrote:
>
> On Tue, Jun 23, 2026 at 03:32:34PM +0200, Maciej Fijalkowski wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > This patch is inspired by the check[1] from sashiko. It says that when
> > overflow happens, the address of cq to be published is invalid.
> > Actually, the more severe problem is that the whole process of publishing
> > the address of cq in this particular case is not right: it should truly
> > publish the address and advance the cached_prod in cq as long as it
> > reads descriptors from txq.
> >
> > The following is the full analysis.
> > xsk_drop_skb() is called in three places, which all discard a partially
> > built multi-buffer skb:
> > 1) xsk_build_skb() -EOVERFLOW error path: packet exceeds MAX_SKB_FRAGS
> > 2) __xsk_generic_xmit() post-loop cleanup: an invalid descriptor in
> >    the TX ring prevents the partial packet from completing
> > 3) xsk_release(): socket close while xs->skb holds an incomplete packet
> >
> > In all three cases, the TX descriptors for the already-processed frags
> > have been consumed from the TX ring (xskq_cons_release), and CQ slots
> > have been reserved. However, xsk_drop_skb() calls xsk_consume_skb()
> > which cancels the CQ reservations via xsk_cq_cancel_locked(). Since
> > the buffer addresses never appear in the completion queue, userspace
> > permanently loses track of these buffers.
> >
> > Fix this by letting consume_skb() trigger the existing xsk_destruct_skb
> > destructor, which already submits buffer addresses to the CQ via
> > xsk_cq_submit_addr_locked().
> >
> > Note that cancelling the descriptors back to the TX ring (via
> > xskq_cons_cancel_n) is not an appropriate option because an oversized
> > packet that always exceeds MAX_SKB_FRAGS would be retried indefinitely,
> > which is an obvious deadlock bug in the TX path.
> >
> > Also move the desc->addr assignment in xsk_build_skb() above the
> > overflow check so that the current descriptor's address is recorded
> > before a potential -EOVERFLOW jump to free_err, consistent with the
> > zerocopy path in xsk_build_skb_zerocopy().
> >
> > [1]: https://lore.kernel.org/all/20260425041726.85FB3C2BCB2@smtp.kernel.org/
>
> This change looks good, but the overflow case with only 1 descriptor worries me.

I presume you referred to xsk_build_skb_zerocopy()?

> In such cases, once we get to the following code, kfree_skb() has already happened:
>
>         if (err == -EOVERFLOW) {
>                 if (xs->skb) {
>                         /* Drop the packet */
>                         xsk_inc_num_desc(xs->skb);
>                         xsk_drop_skb(xs->skb);
>                 } else {
>                         xsk_cq_cancel_locked(xs->pool, 1);
>                         xs->tx->invalid_descs++;
>                 }
>                 xskq_cons_release(xs->tx);
>         }
>
> kfree_skb() should have resulted in submission of the single fat descriptor to
> xsk_cq_submit_addr_locked() via xsk_destruct_skb(), so far consistent with the

At least, in the NO_LINEAR case, xsk_skb_init_misc() is not called
since the OVERFLOW skips this function, which means kfree_skb()
doesn't invoke xsk_destruct_skb() to publish it in the CQ. So it's
safe to cancel the cq reservation (in xsk_cq_cancel_locked(xs->pool,
1)).

Thanks,
Jason

> multi-descriptor behavior you are proposing here.
>
> But what happens when we cancel a submitted CQ slot via
> xsk_cq_cancel_locked(xs->pool, 1) in the above code?
>
> >
> > Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> >  net/xdp/xsk.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index b970f30ea9b9..a7a83dc4546a 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -794,8 +794,11 @@ static void xsk_consume_skb(struct sk_buff *skb)
> >
> >  static void xsk_drop_skb(struct sk_buff *skb)
> >  {
> > -     xdp_sk(skb->sk)->tx->invalid_descs += xsk_get_num_desc(skb);
> > -     xsk_consume_skb(skb);
> > +     struct xdp_sock *xs = xdp_sk(skb->sk);
> > +
> > +     xs->tx->invalid_descs += xsk_get_num_desc(skb);
> > +     consume_skb(skb);
> > +     xs->skb = NULL;
> >  }
> >
> >  static int xsk_skb_metadata(struct sk_buff *skb, void *buffer,
> > @@ -877,7 +880,7 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
> >                       return ERR_PTR(-ENOMEM);
> >
> >               /* in case of -EOVERFLOW that could happen below,
> > -              * xsk_consume_skb() will release this node as whole skb
> > +              * xsk_drop_skb() will release this node as whole skb
> >                * would be dropped, which implies freeing all list elements
> >                */
> >               xsk_addr->addrs[xsk_addr->num_descs] = desc->addr;
> > @@ -969,6 +972,8 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> >                               goto free_err;
> >                       }
> >
> > +                     xsk_addr->addrs[xsk_addr->num_descs] = desc->addr;
> > +
> >                       if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) && xp_mb_desc(desc))) {
> >                               err = -EOVERFLOW;
> >                               goto free_err;
> > @@ -986,8 +991,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> >
> >                       skb_add_rx_frag(skb, nr_frags, page, 0, len, PAGE_SIZE);
> >                       refcount_add(PAGE_SIZE, &xs->sk.sk_wmem_alloc);
> > -
> > -                     xsk_addr->addrs[xsk_addr->num_descs] = desc->addr;
> >               }
> >       }
> >
> > --
> > 2.43.0
> >
> >
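
The CQ accounting described in the commit message can be illustrated with a
small userspace toy model. This is only a sketch under stated assumptions:
struct toy_cq and the toy_* helpers are hypothetical names invented for
illustration, not the kernel's xsk queue code, and the model ignores the
consumer side and all locking. It shows why cancelling reservations hides the
buffer addresses from userspace, while submitting them makes every address
visible in the completion queue.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CQ_SIZE 8u /* power of two so the index can be masked */

/* Toy completion queue; hypothetical, not the kernel's struct xsk_queue. */
struct toy_cq {
	uint64_t addrs[TOY_CQ_SIZE];
	uint32_t reserved; /* slots reserved but not yet published */
	uint32_t producer; /* entries published and visible to userspace */
};

/* Reserve one slot for a descriptor that was taken from the TX ring. */
static int toy_cq_reserve(struct toy_cq *cq)
{
	if (cq->producer + cq->reserved >= TOY_CQ_SIZE)
		return -1; /* queue full in this consumer-less model */
	cq->reserved++;
	return 0;
}

/* Give back n reservations: the matching addresses are never published. */
static void toy_cq_cancel(struct toy_cq *cq, uint32_t n)
{
	assert(n <= cq->reserved);
	cq->reserved -= n;
}

/* Publish one address into a previously reserved slot. */
static void toy_cq_submit(struct toy_cq *cq, uint64_t addr)
{
	assert(cq->reserved > 0);
	cq->addrs[cq->producer & (TOY_CQ_SIZE - 1)] = addr;
	cq->producer++;
	cq->reserved--;
}

/* Drop path that still returns every buffer address to userspace. */
static void toy_cq_submit_all(struct toy_cq *cq, const uint64_t *addrs,
			      uint32_t n)
{
	for (uint32_t i = 0; i < n; i++)
		toy_cq_submit(cq, addrs[i]);
}
```

With three reserved slots, toy_cq_cancel(&cq, 3) leaves producer unchanged, so
none of the three addresses is ever published; toy_cq_submit_all() publishes
all three and userspace can reuse the buffers.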


* Re: [PATCH net v4 1/2] net: phy: sfp: free mii_bus in sfp_i2c_mdiobus_destroy
From: Larysa Zaremba @ 2026-06-26 12:05 UTC
  To: Petr Wozniak
  Cc: Russell King, Andrew Lunn, Heiner Kallweit, Jakub Kicinski,
	David S . Miller, Eric Dumazet, Paolo Abeni, netdev, linux-kernel,
	linux-phy, Maxime Chevallier, Bjorn Mork, Aleksander Bajkowski,
	Marek Behun
In-Reply-To: <20260624084814.20972-2-petr.wozniak@gmail.com>

On Wed, Jun 24, 2026 at 10:48:13AM +0200, Petr Wozniak wrote:
> sfp_i2c_mdiobus_create() allocates the I2C MDIO bus with mdio_i2c_alloc(),
> a plain (non-devm) allocation, and registers it. sfp_i2c_mdiobus_destroy()
> only unregisters the bus and clears sfp->i2c_mii without calling
> mdiobus_free(). As the only reference to the bus is then cleared, the
> struct mii_bus is leaked.
> 
> This is hit whenever a copper/RollBall SFP module that instantiated an MDIO
> bus is removed: sfp_sm_main() takes the global teardown path and calls
> sfp_i2c_mdiobus_destroy(). sfp_cleanup(), on driver unbind, frees
> sfp->i2c_mii directly, which is why the leak only triggered on module
> hot-removal and not on unbind.
> 
> Free the bus in sfp_i2c_mdiobus_destroy() to match the allocation done in
> sfp_i2c_mdiobus_create().
> 
> Fixes: e85b1347ace6 ("net: sfp: create/destroy I2C mdiobus before PHY probe/after PHY release")
> Signed-off-by: Petr Wozniak <petr.wozniak@gmail.com>
> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>

Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>

> ---
>  drivers/net/phy/sfp.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> index 03bfd8640db9..c4d274ab651e 100644
> --- a/drivers/net/phy/sfp.c
> +++ b/drivers/net/phy/sfp.c
> @@ -963,6 +963,7 @@ static int sfp_i2c_mdiobus_create(struct sfp *sfp)
>  static void sfp_i2c_mdiobus_destroy(struct sfp *sfp)
>  {
>  	mdiobus_unregister(sfp->i2c_mii);
> +	mdiobus_free(sfp->i2c_mii);
>  	sfp->i2c_mii = NULL;
>  }
>  
> -- 
> 2.51.0
> 
> 


* Re: [PATCH net v2] net: ipa: fix SMEM state handle leaks in SMP2P init
From: Larysa Zaremba @ 2026-06-26 11:52 UTC
  To: Haoxiang Li
  Cc: elder, andrew+netdev, davem, edumazet, kuba, pabeni, netdev,
	linux-kernel, stable
In-Reply-To: <20260624065955.2822765-1-haoxiang_li2024@163.com>

On Wed, Jun 24, 2026 at 02:59:55PM +0800, Haoxiang Li wrote:
> ipa_smp2p_init() acquires two Qualcomm SMEM state handles with
> qcom_smem_state_get(). However, neither the init error paths
> nor ipa_smp2p_exit() release them.
> 
> Release both handles with qcom_smem_state_put() in the init
> error paths and in ipa_smp2p_exit().
> 
> Fixes: 530f9216a953 ("soc: qcom: ipa: AP/modem communications")
> Cc: stable@vger.kernel.org
> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>

Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>

> ---
> Changes in v2:
>  - Use explicit qcom_smem_state_put() calls instead of devm helpers.
>    Thanks, Alex! Thanks, Jakub!
> ---
>  drivers/net/ipa/ipa_smp2p.c | 30 ++++++++++++++++++++++--------
>  1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ipa/ipa_smp2p.c b/drivers/net/ipa/ipa_smp2p.c
> index 2f0ccdd937cc..331c00ad02c0 100644
> --- a/drivers/net/ipa/ipa_smp2p.c
> +++ b/drivers/net/ipa/ipa_smp2p.c
> @@ -232,19 +232,27 @@ ipa_smp2p_init(struct ipa *ipa, struct platform_device *pdev, bool modem_init)
>  					  &valid_bit);
>  	if (IS_ERR(valid_state))
>  		return PTR_ERR(valid_state);
> -	if (valid_bit >= 32)		/* BITS_PER_U32 */
> -		return -EINVAL;
> +	if (valid_bit >= 32) {		/* BITS_PER_U32 */
> +		ret = -EINVAL;
> +		goto err_valid_state_put;
> +	}
>  
>  	enabled_state = qcom_smem_state_get(dev, "ipa-clock-enabled",
>  					    &enabled_bit);
> -	if (IS_ERR(enabled_state))
> -		return PTR_ERR(enabled_state);
> -	if (enabled_bit >= 32)		/* BITS_PER_U32 */
> -		return -EINVAL;
> +	if (IS_ERR(enabled_state)) {
> +		ret = PTR_ERR(enabled_state);
> +		goto err_valid_state_put;
> +	}
> +	if (enabled_bit >= 32) {		/* BITS_PER_U32 */
> +		ret = -EINVAL;
> +		goto err_enabled_state_put;
> +	}
>  
>  	smp2p = kzalloc_obj(*smp2p);
> -	if (!smp2p)
> -		return -ENOMEM;
> +	if (!smp2p) {
> +		ret = -ENOMEM;
> +		goto err_enabled_state_put;
> +	}
>  
>  	smp2p->ipa = ipa;
>  
> @@ -289,6 +297,10 @@ ipa_smp2p_init(struct ipa *ipa, struct platform_device *pdev, bool modem_init)
>  	ipa->smp2p = NULL;
>  	mutex_destroy(&smp2p->mutex);
>  	kfree(smp2p);
> +err_enabled_state_put:
> +	qcom_smem_state_put(enabled_state);
> +err_valid_state_put:
> +	qcom_smem_state_put(valid_state);
>  
>  	return ret;
>  }
> @@ -305,6 +317,8 @@ void ipa_smp2p_exit(struct ipa *ipa)
>  	ipa_smp2p_power_release(ipa);
>  	ipa->smp2p = NULL;
>  	mutex_destroy(&smp2p->mutex);
> +	qcom_smem_state_put(smp2p->enabled_state);
> +	qcom_smem_state_put(smp2p->valid_state);
>  	kfree(smp2p);
>  }
>  
> -- 
> 2.25.1
> 
> 


* Re: [PATCH iwl v3] ice: retry reading NVM if admin queue returns EBUSY
From: Przemek Kitszel @ 2026-06-26 11:46 UTC
  To: Robert Malz
  Cc: Simon Horman, Grzegorz Nitka, anthony.l.nguyen, intel-wired-lan,
	netdev
In-Reply-To: <CADcc-bwC4FGQSGyRcnj2ZpGT5+0Q6mjQd-FTCB-mCmmwYrC8Qw@mail.gmail.com>

On 6/26/26 10:15, Robert Malz wrote:
> Hey Przemek,
> I ran some tests and unfortunately, the following sentence from the
> datasheet is true:
> "For specific resources, such as Change Lock (0x0003) and Global Config Lock
> (0x0004), this field is used by software to override the default timeout for the
> operation, and also to specify the timeout used for this operation."
> 
> This means we can only change a default timeout for 0x0003 and 0x0004
> but not for 0x0001 (NVM resource).
> Whatever timeout I provide, FW defaults to 0xBB8
> Input:
> [ 2209.656758] ice 0000:31:00.0: CQ CMD: opcode 0x0008, flags 0x2000,
> datalen 0x0000, retval 0x0000
> [ 2209.656760] ice 0000:31:00.0:        cookie (h,l) 0x00000000 0x00000000
> [ 2209.656761] ice 0000:31:00.0:        param (0,1)  0x00010001 0x00000BB9
> Output:
> [ 2209.656927] ice 0000:31:00.0: CQ CMD: opcode 0x0008, flags 0x2003,
> datalen 0x0000, retval 0x0000
> [ 2209.656929] ice 0000:31:00.0:        cookie (h,l) 0x00000000 0x00000000
> [ 2209.656931] ice 0000:31:00.0:        param (0,1)  0x00010001 0x00000BB8
> 
> Correct me if I'm wrong, but the only way to properly handle it is to
> ensure the resource is locked and released between every
> ice_acquire_nvm call.
> I'll start working on this.

Thank you for checking this out!

I agree that simple retries with improved (refactored) locking will be
a good solution.
Failure to lock should count as an unsuccessful attempt, with a possible
retry after a sleep.
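
A bounded retry loop of that shape could look roughly like the userspace
sketch below. Everything in it is an assumption made for illustration: struct
fake_nvm and the fake_* helpers stand in for the firmware resource and are not
ice driver functions, and the sleep is replaced by a counter so the example
stays deterministic. The point is only that a failed lock attempt and a busy
read both consume one attempt, and that the lock is released between attempts.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a firmware-owned resource. */
struct fake_nvm {
	int busy_left; /* how many more acquire calls report -EBUSY */
	int held;      /* 1 while the fake lock is held */
	int attempts;  /* number of acquire calls seen so far */
	int sleeps;    /* number of back-off pauses taken */
};

static int fake_acquire(struct fake_nvm *nvm)
{
	nvm->attempts++;
	if (nvm->busy_left > 0) {
		nvm->busy_left--;
		return -EBUSY;
	}
	nvm->held = 1;
	return 0;
}

static void fake_release(struct fake_nvm *nvm)
{
	nvm->held = 0;
}

static int fake_read(struct fake_nvm *nvm, unsigned int *out)
{
	assert(nvm->held); /* reading without the lock is a bug */
	*out = 0x1234;
	return 0;
}

/* A real implementation would sleep here; the sketch only counts. */
static void fake_sleep(struct fake_nvm *nvm)
{
	nvm->sleeps++;
}

/* Lock, read, unlock; treat -EBUSY from either step as a failed attempt. */
static int read_with_retry(struct fake_nvm *nvm, unsigned int *out,
			   int max_attempts)
{
	int err = -EBUSY;

	for (int i = 0; i < max_attempts; i++) {
		err = fake_acquire(nvm);
		if (!err) {
			err = fake_read(nvm, out);
			fake_release(nvm);
		}
		if (err != -EBUSY)
			break;
		if (i + 1 < max_attempts)
			fake_sleep(nvm);
	}
	return err;
}
```

If the resource is busy for the first two attempts, a call with max_attempts
set to 5 succeeds on the third attempt after two pauses; if it stays busy, the
caller gets -EBUSY back once the attempts are exhausted.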

> 
> Regards,
> Robert




* Re: [PATCH net v2] net: liquidio: fix BAR resource leak on PF number failure
From: Larysa Zaremba @ 2026-06-26 11:44 UTC
  To: Haoxiang Li
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, ricardo.farrington,
	felix.manlunas, horms, netdev, linux-kernel, stable
In-Reply-To: <20260624064013.2809570-1-haoxiang_li2024@163.com>

On Wed, Jun 24, 2026 at 02:40:13PM +0800, Haoxiang Li wrote:
> If cn23xx_get_pf_num() fails, the function returns without
> unmapping either BAR. Unmap both BARs before returning from
> the error path.
> 
> Found by manual code review.
> 
> Fixes: 0c45d7fe12c7 ("liquidio: fix use of pf in pass-through mode in a virtual machine")
> Cc: stable@vger.kernel.org
> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>

Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>

> ---
> Changes in v2:
>  - Modify the commit message.
>  - Introduce goto unwind path to do the cleanup. Thanks, Simon!
> ---
>  .../cavium/liquidio/cn23xx_pf_device.c         | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
> index 75f22f74774c..06b4424e778e 100644
> --- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
> +++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
> @@ -1163,18 +1163,14 @@ int setup_cn23xx_octeon_pf_device(struct octeon_device *oct)
>  	if (octeon_map_pci_barx(oct, 1, MAX_BAR1_IOREMAP_SIZE)) {
>  		dev_err(&oct->pci_dev->dev, "%s CN23XX BAR1 map failed\n",
>  			__func__);
> -		octeon_unmap_pci_barx(oct, 0);
> -		return 1;
> +		goto err_unmap_bar0;
>  	}
>  
>  	if (cn23xx_get_pf_num(oct) != 0)
> -		return 1;
> +		goto err_unmap_bar1;
>  
> -	if (cn23xx_sriov_config(oct)) {
> -		octeon_unmap_pci_barx(oct, 0);
> -		octeon_unmap_pci_barx(oct, 1);
> -		return 1;
> -	}
> +	if (cn23xx_sriov_config(oct))
> +		goto err_unmap_bar1;
>  
>  	octeon_write_csr64(oct, CN23XX_SLI_MAC_CREDIT_CNT, 0x3F802080802080ULL);
>  
> @@ -1205,6 +1201,12 @@ int setup_cn23xx_octeon_pf_device(struct octeon_device *oct)
>  	oct->coproc_clock_rate = 1000000ULL * cn23xx_coprocessor_clock(oct);
>  
>  	return 0;
> +
> +err_unmap_bar1:
> +	octeon_unmap_pci_barx(oct, 1);
> +err_unmap_bar0:
> +	octeon_unmap_pci_barx(oct, 0);
> +	return 1;
>  }
>  EXPORT_SYMBOL_GPL(setup_cn23xx_octeon_pf_device);
>  
> -- 
> 2.25.1
> 
> 
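
The goto-based unwind introduced by the patch follows a common kernel pattern:
each label undoes one acquisition, and the labels fall through in reverse
order of setup. The userspace sketch below mirrors that control flow only. The
struct fake_dev type, the fail_point enum and the helper names are
hypothetical and exist purely so the example can be compiled and checked; they
are not liquidio functions.

```c
#include <assert.h>

/* Which step of the fake setup should fail; hypothetical test knob. */
enum fail_point { FAIL_NONE = -1, FAIL_BAR0 = 0, FAIL_BAR1 = 1, FAIL_LATE = 2 };

struct fake_dev {
	int bar_mapped[2];    /* 1 while the corresponding BAR is mapped */
	enum fail_point fail; /* step that is forced to fail */
};

static int map_bar(struct fake_dev *dev, int idx)
{
	assert(idx == 0 || idx == 1);
	if ((int)dev->fail == idx)
		return 1;
	dev->bar_mapped[idx] = 1;
	return 0;
}

static void unmap_bar(struct fake_dev *dev, int idx)
{
	assert(idx == 0 || idx == 1);
	dev->bar_mapped[idx] = 0;
}

/* Stands in for any step that runs after both BARs are mapped. */
static int late_step(struct fake_dev *dev)
{
	return dev->fail == FAIL_LATE;
}

static int setup_dev(struct fake_dev *dev)
{
	if (map_bar(dev, 0))
		return 1;

	if (map_bar(dev, 1))
		goto err_unmap_bar0;

	if (late_step(dev))
		goto err_unmap_bar1;

	return 0;

	/* Labels fall through, releasing resources in reverse order. */
err_unmap_bar1:
	unmap_bar(dev, 1);
err_unmap_bar0:
	unmap_bar(dev, 0);
	return 1;
}
```

Whatever step fails, no BAR stays mapped when setup_dev() returns an error,
which is the property the original error path for cn23xx_get_pf_num() lacked.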


* Re: [PATCH 2/6] remoteproc: qcom: Add M0 BTSS secure PIL driver
From: George Moussalem @ 2026-06-26 11:32 UTC
  To: Konrad Dybcio, Jens Axboe, Ulf Hansson, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Johannes Berg, Jeff Johnson,
	Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
	Balakrishna Godavarthi, Rocky Liao, Saravana Kannan, Andrew Lunn,
	Heiner Kallweit, Russell King, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Bjorn Andersson,
	Konrad Dybcio, Mathieu Poirier, Philipp Zabel
  Cc: linux-block, linux-kernel, linux-mmc, devicetree, linux-wireless,
	ath10k, linux-arm-msm, linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <38aceb33-b28e-4994-b277-de070b6dae2b@oss.qualcomm.com>

On 6/26/26 15:20, Konrad Dybcio wrote:
> On 6/25/26 4:10 PM, George Moussalem via B4 Relay wrote:
>> From: George Moussalem <george.moussalem@outlook.com>
>>
>> Add support to bring up the M0 core of the bluetooth subsystem found in
>> the IPQ5018 SoC.
>>
>> The signed firmware loaded is authenticated by TrustZone. If successful,
>> the M0 core boots the firmware and the peripheral is taken out of reset
>> using a Secure Channel Manager call to TrustZone.
>>
>> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
>> ---
> 
> Can this not fit inside the existing PAS driver?

I've tried but there were two issues with that:

1. a custom way to load the firmware into memory is required because the
loadable segment needs to be offset by the virtual address in the mbn
file (see 0x20250 below). The standard mdt_loader uses the physical
addresses.

readelf -l bt_fw_patch.mbn

Elf file type is EXEC (Executable file)
Entry point 0x20255
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  NULL           0x000000 0x00000000 0x00000000 0x00094 0x00000     0
  NULL           0x001000 0x0001a000 0x0001a000 0x00088 0x01000     0x1000
  LOAD           0x002000 0x00020250 0x00000000 0x06154 0x190f8 RWE 0x4

2. memory needs to be ioremapped using ioremap, not ioremap_wc, else TZ
will complain and throw XPU violations due to strict memory alignment
and non-cache requirements.

> 
> Konrad

Cheers,
George


* Re: [PATCH v2] vhost/net: fix clear_user start address in VHOST_GET_FEATURES_ARRAY
From: Eugenio Perez Martin @ 2026-06-26 11:31 UTC
  To: rom.wang
  Cc: Michael S . Tsirkin, Jason Wang, Paolo Abeni, kvm, virtualization,
	netdev, linux-kernel, Yufeng Wang
In-Reply-To: <20260626070438.59149-1-r4o5m6e8o@163.com>

On Fri, Jun 26, 2026 at 9:05 AM rom.wang <r4o5m6e8o@163.com> wrote:
>
> From: Yufeng Wang <wangyufeng@kylinos.cn>
>
> The clear_user() call in VHOST_GET_FEATURES_ARRAY incorrectly starts
> at argp, which is the beginning of the features array, overwriting the
> data just written by copy_to_user(). It should start after the copied
> elements at argp + copied * sizeof(u64) to only zero the trailing
> unused space.
>
> Use size_mul() for both the offset and length calculations so the
> arithmetic stays consistent with the surrounding code and remains
> overflow-safe.
>
> Fixes: 333c515d1896 ("vhost-net: allow configuring extended features")
> Signed-off-by: Yufeng Wang <wangyufeng@kylinos.cn>
>
> ---
> Changes in v2:
> - Use size_mul() for the offset calculation as well, per review feedback.
>
> Link to v1: https://lore.kernel.org/all/20260526080336.61296-1-r4o5m6e8o@163.com/
>
> Note:
> Thank you for your review and suggestions.
>
> I tried to add a switch in tools/virtio/vhost_net_test.c.
> The switch is meant to use VHOST_GET_FEATURES_ARRAY and
> VHOST_SET_FEATURES_ARRAY instead of the legacy versions.
>
> However, when I ran `make virtio` in the tools directory,
> the build failed with an error: missing asm/percpu_types.h.
> I fixed that error, but then another error appeared.
>
> Would it be acceptable to postpone the submission of
> this test case until I have sorted out all the build
> errors?
> ---
>  drivers/vhost/net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 77b59f49bddb..4b963dafa233 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -1784,7 +1784,8 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
>                         return -EFAULT;
>
>                 /* Zero the trailing space provided by user-space, if any */
> -               if (clear_user(argp, size_mul(count - copied, sizeof(u64))))
> +               if (clear_user(argp + size_mul(copied, sizeof(u64)),
> +                              size_mul(count - copied, sizeof(u64))))

Acked-by: Eugenio Pérez <eperezma@redhat.com>

Thanks!

>                         return -EFAULT;
>                 return 0;
>         case VHOST_SET_FEATURES_ARRAY:
> --
> 2.34.1
>
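
The effect of the corrected start address can be reproduced in plain userspace
C. The sketch below is an analogy, not the vhost code: memcpy() and memset()
stand in for copy_to_user() and clear_user(), and fill_features() is a
hypothetical helper name. It copies as many feature words as fit and then
zeroes only the trailing, unused part of the caller's array, starting at the
byte offset just past the copied elements.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy up to `avail` feature words into a caller buffer of `count` words,
 * then zero whatever space is left after the copied words.
 */
static void fill_features(uint64_t *user_buf, size_t count,
			  const uint64_t *features, size_t avail)
{
	size_t copied = count < avail ? count : avail;
	unsigned char *argp = (unsigned char *)user_buf;

	assert(user_buf && features);

	memcpy(argp, features, copied * sizeof(uint64_t));

	/*
	 * Start after the copied elements. Starting at argp itself would
	 * wipe the data that was just written, which is the bug being fixed.
	 */
	memset(argp + copied * sizeof(uint64_t), 0,
	       (count - copied) * sizeof(uint64_t));
}
```

For a four-word buffer and two available feature words, the first two words
keep the copied values and only words two and three are zeroed.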



* Re: [PATCH 4/6] dt-bindings: net: bluetooth: Document Qualcomm IPQ5018 Bluetooth controller
From: Konrad Dybcio @ 2026-06-26 11:30 UTC
  To: George Moussalem, Krzysztof Kozlowski
  Cc: Jens Axboe, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Johannes Berg, Jeff Johnson, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Balakrishna Godavarthi,
	Rocky Liao, Saravana Kannan, Andrew Lunn, Heiner Kallweit,
	Russell King, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Bjorn Andersson, Konrad Dybcio,
	Mathieu Poirier, Philipp Zabel, linux-block, linux-kernel,
	linux-mmc, devicetree, linux-wireless, ath10k, linux-arm-msm,
	linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <SN7PR19MB673692EBED649CF6DC9833A89DEB2@SN7PR19MB6736.namprd19.prod.outlook.com>

On 6/26/26 1:20 PM, George Moussalem wrote:
> On 6/26/26 14:53, Krzysztof Kozlowski wrote:
>> On Thu, Jun 25, 2026 at 06:10:08PM +0400, George Moussalem wrote:
>>> Document the Qualcomm IPQ5018 Bluetooth controller.
>>>
>>> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
>>> ---

[...]

>>> +      compatible = "qcom,ipq5018-bt";
>>> +
>>> +      qcom,ipc = <&apcs_glb 8 23>;
>>> +      interrupts = <GIC_SPI 162 IRQ_TYPE_EDGE_RISING>;
>>
>> No firmware to load?
> 
> firmware is loaded by the remoteproc in patch 1
> 
>>
>> It feels like remoteproc node split is fake. The property qcom,rproc is
>> even more supporting that case. Shouldn't this be simply one device -
>> bluetooth? What sort of two devices do you have exactly? How can I
>> identify them in the hardware?
> 
> I wasn't sure how to represent the HW. Should I make this bluetooth node
> a childnode of the rproc? Essentially, this is the transport layer
> (using shared memory space and IPC/interrupt).
> 
> Most QCA BT controllers are also childnodes of a serdev/uart node as
> they use serdev for transport.
> 
> From what I understand, it's simply BT firmware running on this
> dedicated M0 core in the SoC itself connected to an RF.

Seems like this rhymes with the WPSS remoteproc + ATH1xK_AHB situation
- the Q6 core power sequences and manages the wireless controller,
while Linux gets to drive the device as it would if it were connected
over PCIe/UART respectively, just with MMIO writes instead.

Konrad


* Re: [PATCH 4/6] dt-bindings: net: bluetooth: Document Qualcomm IPQ5018 Bluetooth controller
From: George Moussalem @ 2026-06-26 11:20 UTC
  To: Krzysztof Kozlowski
  Cc: Jens Axboe, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Johannes Berg, Jeff Johnson, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Balakrishna Godavarthi,
	Rocky Liao, Saravana Kannan, Andrew Lunn, Heiner Kallweit,
	Russell King, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Bjorn Andersson, Konrad Dybcio,
	Mathieu Poirier, Philipp Zabel, linux-block, linux-kernel,
	linux-mmc, devicetree, linux-wireless, ath10k, linux-arm-msm,
	linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <20260626-discerning-light-swan-6b599c@quoll>

On 6/26/26 14:53, Krzysztof Kozlowski wrote:
> On Thu, Jun 25, 2026 at 06:10:08PM +0400, George Moussalem wrote:
>> Document the Qualcomm IPQ5018 Bluetooth controller.
>>
>> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
>> ---
>>  .../bindings/net/bluetooth/qcom,ipq5018-bt.yaml    | 63 ++++++++++++++++++++++
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/net/bluetooth/qcom,ipq5018-bt.yaml b/Documentation/devicetree/bindings/net/bluetooth/qcom,ipq5018-bt.yaml
>> new file mode 100644
>> index 000000000000..afd33f851858
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/net/bluetooth/qcom,ipq5018-bt.yaml
>> @@ -0,0 +1,63 @@
>> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/net/bluetooth/qcom,ipq5018-bt.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Qualcomm IPQ5018 Bluetooth
>> +
>> +maintainers:
>> +  - George Moussalem <george.moussalem@outlook.com>
>> +
>> +properties:
>> +  compatible:
>> +    enum:
>> +      - qcom,ipq5018-bt
>> +
>> +  interrupts:
>> +    items:
>> +      - description:
>> +          Interrupt line from the M0 Bluetooth Subsystem to the host processor
> 
> What is M0?

it's a low power Cortex M0 core, for Bluetooth processing in this case.

> 
> Anyway, this part feels completely redundant. Can "interrupts" property
> be anything else than an interrupt line from the device to the host
> processor?
> 
> 
>> +          to notify it of events such as re
> 
> This feels useful, but cut/incomplete.

yeah, c/p error. The interrupt is to notify the host of bluetooth events
running on the m0 core, such as TX/CMD completion and/or availability of
new data frames in the ring buffers.

> 
>> +
>> +  qcom,ipc:
>> +    $ref: /schemas/types.yaml#/definitions/phandle-array
>> +    items:
>> +      - items:
>> +          - description: phandle to a syscon node representing the APCS registers
>> +          - description: u32 representing offset to the register within the syscon
>> +          - description: u32 representing the ipc bit within the register
>> +    description: |
>> +      These entries specify the outgoing IPC bit used for signaling the remote
>> +      M0 BTSS core of a host event or for sending an ACK if the remote processor
>> +      expects it.
>> +
>> +  qcom,rproc:
>> +    $ref: /schemas/types.yaml#/definitions/phandle
>> +    description:
>> +      Phandle to the remote processor node representing the M0 BTSS core.
>> +
>> +required:
>> +  - compatible
>> +  - interrupts
>> +  - qcom,ipc
>> +  - qcom,rproc
>> +
>> +allOf:
>> +  - $ref: bluetooth-controller.yaml#
>> +  - $ref: qcom,bluetooth-common.yaml
>> +
>> +unevaluatedProperties: false
>> +
>> +examples:
>> +  - |
>> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
>> +
>> +    bluetooth: bluetooth {
> 
> Drop unused label

will drop

> 
>> +      compatible = "qcom,ipq5018-bt";
>> +
>> +      qcom,ipc = <&apcs_glb 8 23>;
>> +      interrupts = <GIC_SPI 162 IRQ_TYPE_EDGE_RISING>;
> 
> No firmware to load?

firmware is loaded by the remoteproc in patch 1

> 
> It feels like remoteproc node split is fake. The property qcom,rproc is
> even more supporting that case. Shouldn't this be simply one device -
> bluetooth? What sort of two devices do you have exactly? How can I
> identify them in the hardware?

I wasn't sure how to represent the HW. Should I make this bluetooth node
a childnode of the rproc? Essentially, this is the transport layer
(using shared memory space and IPC/interrupt).

Most QCA BT controllers are also childnodes of a serdev/uart node as
they use serdev for transport.

From what I understand, it's simply BT firmware running on this
dedicated M0 core in the SoC itself connected to an RF.

> 
>> +
>> +      qcom,rproc = <&m0_btss>;
> 
> Best regards,
> Krzysztof
> 



* Re: [PATCH 2/6] remoteproc: qcom: Add M0 BTSS secure PIL driver
From: Konrad Dybcio @ 2026-06-26 11:20 UTC
  To: george.moussalem, Jens Axboe, Ulf Hansson, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Johannes Berg, Jeff Johnson,
	Bartosz Golaszewski, Marcel Holtmann, Luiz Augusto von Dentz,
	Balakrishna Godavarthi, Rocky Liao, Saravana Kannan, Andrew Lunn,
	Heiner Kallweit, Russell King, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Bjorn Andersson,
	Konrad Dybcio, Mathieu Poirier, Philipp Zabel
  Cc: linux-block, linux-kernel, linux-mmc, devicetree, linux-wireless,
	ath10k, linux-arm-msm, linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <20260625-ipq5018-bluetooth-v1-2-d999be0e04f7@outlook.com>

On 6/25/26 4:10 PM, George Moussalem via B4 Relay wrote:
> From: George Moussalem <george.moussalem@outlook.com>
> 
> Add support to bring up the M0 core of the bluetooth subsystem found in
> the IPQ5018 SoC.
> 
> The signed firmware loaded is authenticated by TrustZone. If successful,
> the M0 core boots the firmware and the peripheral is taken out of reset
> using a Secure Channel Manager call to TrustZone.
> 
> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
> ---

Can this not fit inside the existing PAS driver?

Konrad


* [PATCH net v3] amt: don't read the IP header from a reallocated skb head
From: Michael Bommarito @ 2026-06-26 11:19 UTC
  To: Taehee Yoo, Andrew Lunn, David S . Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel
In-Reply-To: <20260617123443.3586930-1-michael.bommarito@gmail.com>

Several AMT receive paths cache a pointer into the skb header
(ip_hdr() / ipv6_hdr()) and then call a helper that can reallocate
the skb head before reading the cached pointer:

  amt_rcv()                 caches ip_hdr(skb), then amt_parse_type()
                            does pskb_may_pull() before reading
                            iph->saddr;
  amt_update_handler()      caches ip_hdr(skb), then pskb_may_pull() /
                            iptunnel_pull_header() before reading
                            iph->saddr;
  amt_igmpv3_report_handler() caches ip_hdr(skb), then ip_mc_may_pull()
                            in the record loop before reading
                            iph->saddr;
  amt_mldv2_report_handler() caches ipv6_hdr(skb), then
                            ipv6_mc_may_pull() in the record loop
                            before reading ip6h->saddr.

pskb_may_pull() and the *_mc_may_pull() helpers can reallocate the
skb head; when they do, the old head is freed and the cached pointer
dangles, so the later source-address read is a use-after-free of the
freed head.

The only field used after the pull in each case is the source
address, which is stable across the pull. Snapshot it before the
pull and use the snapshot.
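
The same hazard and the same remedy can be modelled in userspace. In the
sketch below, struct toy_pkt, struct toy_hdr and toy_may_pull() are
hypothetical stand-ins invented for illustration, not skb APIs; the toy pull
always moves the head to a fresh allocation and frees the old one, which
models the worst case. The reader copies the source address out of the header
before the pull and never touches the old head afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header and packet types; not struct iphdr or struct sk_buff. */
struct toy_hdr {
	uint32_t saddr;
};

struct toy_pkt {
	unsigned char *head; /* heap buffer holding the header bytes */
	size_t len;
};

/* Always relocates the head, so any pointer into the old head dangles. */
static int toy_may_pull(struct toy_pkt *pkt, size_t need)
{
	size_t new_len = need > pkt->len ? need : pkt->len;
	unsigned char *new_head = calloc(1, new_len);

	if (!new_head)
		return 0;
	memcpy(new_head, pkt->head, pkt->len);
	free(pkt->head);
	pkt->head = new_head;
	pkt->len = new_len;
	return 1;
}

/* Snapshot the field first, pull second, use only the snapshot after. */
static int toy_read_saddr(struct toy_pkt *pkt, size_t need, uint32_t *out)
{
	uint32_t saddr;

	assert(pkt->len >= sizeof(struct toy_hdr));
	memcpy(&saddr, pkt->head, sizeof(saddr));

	if (!toy_may_pull(pkt, need))
		return -1;

	*out = saddr; /* safe: no access to the freed head */
	return 0;
}
```

Caching a struct toy_hdr pointer before toy_may_pull() and dereferencing it
afterwards would read freed memory in this model, which corresponds to the
stale iph->saddr reads listed above.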

The sibling handlers that re-derive ip_hdr() after the pull
(amt_multicast_data_handler(), amt_membership_query_handler()) and
the handlers that read the source address with no intervening pull
(amt_discovery_handler(), amt_request_handler(), the IGMPv2/MLDv1
report and leave handlers) are not affected.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
v3: address Jakub Kicinski's and Taehee Yoo's review of v2: fix all of
    the stale-iph reads Sashiko flagged in amt.c in one patch, not just
    amt_update_handler(). v2 fixed only amt_update_handler(); this also
    covers amt_rcv(), amt_igmpv3_report_handler() and
    amt_mldv2_report_handler(), which have the same pre-pull cached
    header read. The amt_update_handler() change is functionally the
    same as v2 (snapshot the source address before the pull).
    v2: https://lore.kernel.org/all/20260617123443.3586930-1-michael.bommarito@gmail.com/

Built for x86_64 (CONFIG_AMT=m) with W=1, no new warnings; checkpatch
--strict clean.
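
For readers less familiar with the pattern, the stale-pointer problem and
the snapshot fix described in the commit message can be sketched in plain
userspace C. This is only an illustration, not kernel code: every toy_*
name is made up, and toy_may_pull() always moves the data, whereas
pskb_may_pull() reallocates only when it has to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an IPv4 header; only the source address matters. */
struct toy_iphdr {
	uint32_t saddr;
};

/* Toy stand-in for an skb: 'head' may move whenever data is pulled. */
struct toy_buf {
	uint8_t *head;
	size_t len;
};

/* Allocate a buffer of 'len' bytes that starts with a toy header. */
static int toy_buf_init(struct toy_buf *b, uint32_t saddr, size_t len)
{
	struct toy_iphdr hdr = { .saddr = saddr };

	assert(len >= sizeof(hdr));
	b->head = calloc(1, len);
	if (!b->head)
		return 0;
	memcpy(b->head, &hdr, sizeof(hdr));
	b->len = len;
	return 1;
}

/*
 * Hypothetical analogue of a *_may_pull() helper: on success the data
 * lives in a fresh allocation and the old head has been freed, so any
 * pointer taken from the old head dangles afterwards.
 */
static int toy_may_pull(struct toy_buf *b, size_t need)
{
	uint8_t *new_head;

	if (need > b->len)
		return 0;
	new_head = malloc(b->len);
	if (!new_head)
		return 0;
	memcpy(new_head, b->head, b->len);
	free(b->head);
	b->head = new_head;
	return 1;
}

/* Safe pattern: copy the field out before the pull, use the copy after. */
static int toy_read_saddr(struct toy_buf *b, size_t need, uint32_t *out)
{
	uint32_t saddr;

	memcpy(&saddr, b->head, sizeof(saddr));	/* snapshot first */
	if (!toy_may_pull(b, need))
		return 0;
	*out = saddr;				/* never touches the old head */
	return 1;
}
```

The unsafe variant would keep a struct toy_iphdr pointer into b->head
across toy_may_pull() and dereference it afterwards; copying the one
field that is needed avoids having to re-derive the header pointer at all.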

 drivers/net/amt.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/amt.c b/drivers/net/amt.c
index 724a8163a5142..7094bdab0f463 100644
--- a/drivers/net/amt.c
+++ b/drivers/net/amt.c
@@ -2000,13 +2000,15 @@ static void amt_igmpv3_report_handler(struct amt_dev *amt, struct sk_buff *skb,
 	struct igmpv3_report *ihrv3 = igmpv3_report_hdr(skb);
 	int len = skb_transport_offset(skb) + sizeof(*ihrv3);
 	void *zero_grec = (void *)&igmpv3_zero_grec;
-	struct iphdr *iph = ip_hdr(skb);
 	struct amt_group_node *gnode;
 	union amt_addr group, host;
 	struct igmpv3_grec *grec;
+	__be32 saddr;
 	u16 nsrcs;
 	int i;
 
+	saddr = ip_hdr(skb)->saddr;
+
 	for (i = 0; i < ntohs(ihrv3->ngrec); i++) {
 		len += sizeof(*grec);
 		if (!ip_mc_may_pull(skb, len))
@@ -2022,7 +2024,7 @@ static void amt_igmpv3_report_handler(struct amt_dev *amt, struct sk_buff *skb,
 		memset(&group, 0, sizeof(union amt_addr));
 		group.ip4 = grec->grec_mca;
 		memset(&host, 0, sizeof(union amt_addr));
-		host.ip4 = iph->saddr;
+		host.ip4 = saddr;
 		gnode = amt_lookup_group(tunnel, &group, &host, false);
 		if (!gnode) {
 			gnode = amt_add_group(amt, tunnel, &group, &host,
@@ -2162,13 +2164,15 @@ static void amt_mldv2_report_handler(struct amt_dev *amt, struct sk_buff *skb,
 	struct mld2_report *mld2r = (struct mld2_report *)icmp6_hdr(skb);
 	int len = skb_transport_offset(skb) + sizeof(*mld2r);
 	void *zero_grec = (void *)&mldv2_zero_grec;
-	struct ipv6hdr *ip6h = ipv6_hdr(skb);
 	struct amt_group_node *gnode;
 	union amt_addr group, host;
 	struct mld2_grec *grec;
+	struct in6_addr saddr;
 	u16 nsrcs;
 	int i;
 
+	saddr = ipv6_hdr(skb)->saddr;
+
 	for (i = 0; i < ntohs(mld2r->mld2r_ngrec); i++) {
 		len += sizeof(*grec);
 		if (!ipv6_mc_may_pull(skb, len))
@@ -2184,7 +2188,7 @@ static void amt_mldv2_report_handler(struct amt_dev *amt, struct sk_buff *skb,
 		memset(&group, 0, sizeof(union amt_addr));
 		group.ip6 = grec->grec_mca;
 		memset(&host, 0, sizeof(union amt_addr));
-		host.ip6 = ip6h->saddr;
+		host.ip6 = saddr;
 		gnode = amt_lookup_group(tunnel, &group, &host, true);
 		if (!gnode) {
 			gnode = amt_add_group(amt, tunnel, &group, &host,
@@ -2455,8 +2459,10 @@ static bool amt_update_handler(struct amt_dev *amt, struct sk_buff *skb)
 	struct ethhdr *eth;
 	struct iphdr *iph;
 	int len, hdr_size;
+	__be32 saddr;
 
 	iph = ip_hdr(skb);
+	saddr = iph->saddr;
 
 	hdr_size = sizeof(*amtmu) + sizeof(struct udphdr);
 	if (!pskb_may_pull(skb, hdr_size))
@@ -2472,7 +2478,7 @@ static bool amt_update_handler(struct amt_dev *amt, struct sk_buff *skb)
 	skb_reset_network_header(skb);
 
 	list_for_each_entry_rcu(tunnel, &amt->tunnel_list, list) {
-		if (tunnel->ip4 == iph->saddr) {
+		if (tunnel->ip4 == saddr) {
 			if ((amtmu->nonce == tunnel->nonce &&
 			     amtmu->response_mac == tunnel->mac)) {
 				mod_delayed_work(amt_wq, &tunnel->gc_wq,
@@ -2772,7 +2778,7 @@ static void amt_gw_rcv(struct amt_dev *amt, struct sk_buff *skb)
 static int amt_rcv(struct sock *sk, struct sk_buff *skb)
 {
 	struct amt_dev *amt;
-	struct iphdr *iph;
+	__be32 saddr;
 	int type;
 	bool err;
 
@@ -2785,7 +2791,7 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
 	}
 
 	skb->dev = amt->dev;
-	iph = ip_hdr(skb);
+	saddr = ip_hdr(skb)->saddr;
 	type = amt_parse_type(skb);
 	if (type == -1) {
 		err = true;
@@ -2795,7 +2801,7 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
 	if (amt->mode == AMT_MODE_GATEWAY) {
 		switch (type) {
 		case AMT_MSG_ADVERTISEMENT:
-			if (iph->saddr != amt->discovery_ip) {
+			if (saddr != amt->discovery_ip) {
 				netdev_dbg(amt->dev, "Invalid Relay IP\n");
 				err = true;
 				goto drop;
@@ -2807,7 +2813,7 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
 			}
 			goto out;
 		case AMT_MSG_MULTICAST_DATA:
-			if (iph->saddr != amt->remote_ip) {
+			if (saddr != amt->remote_ip) {
 				netdev_dbg(amt->dev, "Invalid Relay IP\n");
 				err = true;
 				goto drop;
@@ -2818,7 +2824,7 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
 			else
 				goto out;
 		case AMT_MSG_MEMBERSHIP_QUERY:
-			if (iph->saddr != amt->remote_ip) {
+			if (saddr != amt->remote_ip) {
 				netdev_dbg(amt->dev, "Invalid Relay IP\n");
 				err = true;
 				goto drop;

base-commit: ab9de95c9cf952332ab79453b4b5d1bfca8e514f
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH 1/6] dt-bindings: remoteproc: document M0 Bluetooth Subsystem secure PIL
From: Krzysztof Kozlowski @ 2026-06-26 11:16 UTC (permalink / raw)
  To: George Moussalem
  Cc: Jens Axboe, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Johannes Berg, Jeff Johnson, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Balakrishna Godavarthi,
	Rocky Liao, Saravana Kannan, Andrew Lunn, Heiner Kallweit,
	Russell King, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Bjorn Andersson, Konrad Dybcio,
	Mathieu Poirier, Philipp Zabel, linux-block, linux-kernel,
	linux-mmc, devicetree, linux-wireless, ath10k, linux-arm-msm,
	linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <SN7PR19MB67364ADE8CDD7C31297AE18D9DEB2@SN7PR19MB6736.namprd19.prod.outlook.com>

On 26/06/2026 12:51, George Moussalem wrote:
>>
>> No supplies? no address space? How do you actually trigger remoteproc
>> startup?
> 
> No supplies and no address space. The core is booted by a
> qcom_scm_auth_and_reset call to TrustZone, which authenticates the
> firmware, takes the core out of reset and boots it.

Then commit msg could be improved:

"Firmware loaded is authenticated via TrustZone." ->
"Firmware is loaded and authenticated via TrustZone."


Best regards,
Krzysztof

^ permalink raw reply

* Re: [PATCH net 1/7] xsk: fix buffer leak in xsk_drop_skb() for AF_XDP multi-buffer Tx
From: Larysa Zaremba @ 2026-06-26 11:12 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: netdev, bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
	kerneljasonxing, bjorn, Jason Xing
In-Reply-To: <20260623133240.1048434-2-maciej.fijalkowski@intel.com>

On Tue, Jun 23, 2026 at 03:32:34PM +0200, Maciej Fijalkowski wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> This patch is inspired by the check[1] from sashiko. It says when
> overflow happens, the address of cq to be published is invalid.
> Actually, the more severe problem is that the whole process of
> publishing the address of cq in this particular case is not right: it
> should truly publish the address and advance the cached_prod in cq as
> long as it reads descriptors from txq.
> 
> The following is the full analysis.
> xsk_drop_skb() is called in three places, which all discard a partially
> built multi-buffer skb:
> 1) xsk_build_skb() -EOVERFLOW error path: packet exceeds MAX_SKB_FRAGS
> 2) __xsk_generic_xmit() post-loop cleanup: an invalid descriptor in
>    the TX ring prevents the partial packet from completing
> 3) xsk_release(): socket close while xs->skb holds an incomplete packet
> 
> In all three cases, the TX descriptors for the already-processed frags
> have been consumed from the TX ring (xskq_cons_release), and CQ slots
> have been reserved. However, xsk_drop_skb() calls xsk_consume_skb()
> which cancels the CQ reservations via xsk_cq_cancel_locked(). Since
> the buffer addresses never appear in the completion queue, userspace
> permanently loses track of these buffers.
> 
> Fix this by letting consume_skb() trigger the existing xsk_destruct_skb
> destructor, which already submits buffer addresses to the CQ via
> xsk_cq_submit_addr_locked().
> 
> Note that cancelling the descriptors back to the TX ring (via
> xskq_cons_cancel_n) is not an appropriate option because an oversized
> packet that always exceeds MAX_SKB_FRAGS would be retried indefinitely,
> which is an obvious deadlock bug in the TX path.
> 
> Also move the desc->addr assignment in xsk_build_skb() above the
> overflow check so that the current descriptor's address is recorded
> before a potential -EOVERFLOW jump to free_err, consistent with the
> zerocopy path in xsk_build_skb_zerocopy().
> 
> [1]: https://lore.kernel.org/all/20260425041726.85FB3C2BCB2@smtp.kernel.org/

This change looks good, but the overflow case with only one descriptor
worries me. In that case, by the time we reach the following code,
kfree_skb() has already happened:

	if (err == -EOVERFLOW) {
		if (xs->skb) {
			/* Drop the packet */
			xsk_inc_num_desc(xs->skb);
			xsk_drop_skb(xs->skb);
		} else {
			xsk_cq_cancel_locked(xs->pool, 1);
			xs->tx->invalid_descs++;
		}
		xskq_cons_release(xs->tx);
	}

kfree_skb() should have resulted in the single fat descriptor being
submitted to xsk_cq_submit_addr_locked() via xsk_destruct_skb(), which so
far is consistent with the multi-descriptor behavior you are proposing here.

But what happens when we cancel a submitted CQ slot via 
xsk_cq_cancel_locked(xs->pool, 1) in the above code?
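
To make the concern concrete, here is a toy model of the producer-side
accounting. It rests on an assumption about the helpers: that reserve
bumps cached_prod, cancel decrements cached_prod, and submit advances the
published producer index. The toy_* names are hypothetical and this is
not the kernel's struct xsk_queue.

```c
#include <assert.h>
#include <stdint.h>

/* Toy producer side of a completion ring. */
struct toy_cq {
	uint32_t producer;	/* entries already published to the consumer */
	uint32_t cached_prod;	/* entries reserved by the producer */
};

/* Reserve one slot for a descriptor that is about to be sent. */
static void toy_cq_reserve(struct toy_cq *q)
{
	q->cached_prod++;
}

/* Give back n reserved slots that will never be published. */
static void toy_cq_cancel(struct toy_cq *q, uint32_t n)
{
	q->cached_prod -= n;
}

/* Publish n previously reserved slots. */
static void toy_cq_submit(struct toy_cq *q, uint32_t n)
{
	/* Only reserved slots may be published. */
	assert((uint32_t)(q->cached_prod - q->producer) >= n);
	q->producer += n;
}

/*
 * Reserved-but-unpublished slots. A negative value means a slot was
 * cancelled after it had already been published.
 */
static int32_t toy_cq_outstanding(const struct toy_cq *q)
{
	return (int32_t)(q->cached_prod - q->producer);
}
```

In this model, reserve followed by submit and then cancel leaves
cached_prod one behind producer, which is the state the question above is
about. Whether the real single-descriptor path can actually reach that
sequence is exactly what needs confirming.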

> 
> Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>  net/xdp/xsk.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index b970f30ea9b9..a7a83dc4546a 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -794,8 +794,11 @@ static void xsk_consume_skb(struct sk_buff *skb)
>  
>  static void xsk_drop_skb(struct sk_buff *skb)
>  {
> -	xdp_sk(skb->sk)->tx->invalid_descs += xsk_get_num_desc(skb);
> -	xsk_consume_skb(skb);
> +	struct xdp_sock *xs = xdp_sk(skb->sk);
> +
> +	xs->tx->invalid_descs += xsk_get_num_desc(skb);
> +	consume_skb(skb);
> +	xs->skb = NULL;
>  }
>  
>  static int xsk_skb_metadata(struct sk_buff *skb, void *buffer,
> @@ -877,7 +880,7 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
>  			return ERR_PTR(-ENOMEM);
>  
>  		/* in case of -EOVERFLOW that could happen below,
> -		 * xsk_consume_skb() will release this node as whole skb
> +		 * xsk_drop_skb() will release this node as whole skb
>  		 * would be dropped, which implies freeing all list elements
>  		 */
>  		xsk_addr->addrs[xsk_addr->num_descs] = desc->addr;
> @@ -969,6 +972,8 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
>  				goto free_err;
>  			}
>  
> +			xsk_addr->addrs[xsk_addr->num_descs] = desc->addr;
> +
>  			if (unlikely(nr_frags == (MAX_SKB_FRAGS - 1) && xp_mb_desc(desc))) {
>  				err = -EOVERFLOW;
>  				goto free_err;
> @@ -986,8 +991,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
>  
>  			skb_add_rx_frag(skb, nr_frags, page, 0, len, PAGE_SIZE);
>  			refcount_add(PAGE_SIZE, &xs->sk.sk_wmem_alloc);
> -
> -			xsk_addr->addrs[xsk_addr->num_descs] = desc->addr;
>  		}
>  	}
>  
> -- 
> 2.43.0
> 
> 

^ permalink raw reply

* Re: [patch 09/24] timekeeping: Add CLOCK_AUX support for ktime_get_snapshot_id()
From: Thomas Weißschuh @ 2026-06-26 11:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, David Woodhouse, Miroslav Lichvar, John Stultz,
	Stephen Boyd, Anna-Maria Behnsen, Frederic Weisbecker,
	Arthur Kiyanovski, Rodolfo Giometti, Vincent Donnefort,
	Marc Zyngier, Oliver Upton, kvmarm, Oliver Upton, Richard Cochran,
	netdev, Takashi Iwai, Miri Korenblit, Johannes Berg, Jacob Keller,
	Tony Nguyen, Saeed Mahameed, Peter Hilber, Michael S. Tsirkin,
	virtualization, linux-wireless, linux-sound
In-Reply-To: <87echtk24a.ffs@fw13>

On Fri, Jun 26, 2026 at 12:49:41PM +0200, Thomas Gleixner wrote:
> On Fri, Jun 26 2026 at 10:48, Thomas Weißschuh wrote:
> > On Tue, May 26, 2026 at 07:14:13PM +0200, Thomas Gleixner wrote:
> > (...)
> >
> >>  static inline void tk_update_aux_offs(struct timekeeper *tk, ktime_t offs)
> >> @@ -1218,6 +1223,12 @@ bool ktime_get_snapshot_id(struct system
> >>  		tkd = &tk_core;
> >>  		offs = &tk_core.timekeeper.offs_boot;
> >>  		break;
> >> +	case CLOCK_AUX ... CLOCK_AUX_LAST:
> >> +		tkd = aux_get_tk_data(clock_id);
> >> +		if (!tkd)
> >> +			return false;
> >> +		offs = &tkd->timekeeper.offs_aux;
> >> +		break;
> >
> > 'tkd' is also used to compute 'monoraw'. However 'tkr_raw' and 'tkr_mono'
> > are the same for auxiliary clocks, so this will compute a wrong 'monoraw'.
> 
> AUX clocks are independent in the first place and the MONORAW part is
> the "MONORAW" related to the AUX clock itself. 
> 
> > Instead 'monoraw' should be computed based on 'tk_core'.
> > Which then also requires the sequence locking of 'tk_core'.
> 
> No. From a PTP and steering point of view you want the "raw" value which
> is related to the AUX clock itself and not the global one.

Ack.

However the kdocs call it 'CLOCK_MONOTONIC_RAW'. Can we clean this up?

> The global MONORAW and the AUX clock MONORAW are related obviously as
> they share the same conversion factors.
> 
>      MONORAW(AUX$N) = MONORAW(GLOBAL) + OFFSET(AUX$N)
> 
> We don't have that information anywhere right now and we might want to
> expose it to allow user space a proper correlation, but that's an
> orthogonal problem.
> 
> From a PTP/steering point of view the code is correct as is.

Thanks for the clarifications.


Thomas

^ permalink raw reply

* Re: [PATCH net v7 3/4] iavf: send MAC change request synchronously
From: Jose Ignacio Tornos Martinez @ 2026-06-26 10:54 UTC (permalink / raw)
  To: horms
  Cc: aleksandr.loktionov, anthony.l.nguyen, davem, edumazet,
	intel-wired-lan, jacob.e.keller, jtornosm, kuba, netdev, pabeni,
	przemyslaw.kitszel
In-Reply-To: <20260624163148.1146093-2-horms@kernel.org>

>>       if (!ether_addr_equal(netdev->dev_addr, addr->sa_data))
>>               return -EACCES;

> [Severity: High]
> Will this cause a persistent state mismatch if a timeout is followed by a
> retry?
>
> If iavf_set_mac_sync() times out, it returns -EAGAIN and bypasses this
> rollback, leaving adapter->hw.mac.addr updated but netdev->dev_addr at the
> old MAC.
>
> If the user then retries before the PF asynchronously completes the original
> request, iavf_add_ether_addrs() returns -EBUSY. Since -EBUSY != -EAGAIN,
> the rollback runs, forcibly reverting adapter->hw.mac.addr to the old MAC.
> 
> When the PF's delayed success response finally arrives,
> iavf_virtchnl_completion() sees the old MAC in both hw.mac.addr and
> netdev->dev_addr, bypassing the netdev MAC update. The PF hardware is now
> programmed with the new MAC, but the kernel retains the old MAC, dropping
> traffic.

This scenario requires an extremely narrow timing window: a 2.5s PF timeout
(rare in normal operation, PF typically responds much faster), followed by an
immediate retry before adminq_task clears current_op (microsecond window).

Even if this race occurs, it's recoverable - the retry fails with -EBUSY
and the user retries again successfully. This is a transient failure, not
permanent state corruption.

>>       if (adapter->current_op != VIRTCHNL_OP_UNKNOWN) {
>>               /* bail because we already have a command pending */
>>               dev_err(&adapter->pdev->dev, "Cannot add filters, command %d pending\n",
>>                      adapter->current_op);
>> -             return;
>> +             return -EBUSY;
>>       }
>
> [Severity: High]
> Could routine background tasks cause MAC changes to frequently fail?
>
> The watchdog task routinely sets adapter->current_op for background commands
> like VIRTCHNL_OP_GET_STATS. If iavf_set_mac() is called during this window,
> iavf_set_mac_sync() will hit this check and immediately fail the MAC change
> with -EBUSY.
>
> Because there is no retry logic in iavf_set_mac(), this error propagates
> directly to userspace, making MAC address configuration randomly fail based
> on watchdog timing.

Background virtchnl operations (like GET_STATS) complete quickly, typically
within milliseconds. The collision window is very small.

Fail-fast with -EBUSY is semantically correct and allows user-space retry.
If a collision occurs, it's spurious and recoverable immediately on retry.
This is acceptable compared to the complexity and deadlock risks of queuing
or waiting for background operations to complete while holding netdev_lock.
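
As a rough illustration of what "user-space retry" could look like, here
is a small bounded-retry sketch. It is only a sketch under assumptions:
the toy_* names are hypothetical, the operation is modelled as a callback
returning a kernel-style negative errno (a real ioctl or netlink caller
would check errno == EBUSY instead), and the retry count and delay are
arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Operation to retry; returns 0 on success or a negative errno. */
typedef int (*toy_op_fn)(void *ctx);

/*
 * Call op() up to max_tries times, sleeping delay_ms between attempts,
 * for as long as it keeps failing with -EBUSY. Any other result,
 * success or error, is returned immediately.
 */
static int toy_retry_on_busy(toy_op_fn op, void *ctx, int max_tries,
			     long delay_ms)
{
	struct timespec ts;
	int ret = -EBUSY;
	int i;

	assert(max_tries > 0);
	ts.tv_sec = delay_ms / 1000;
	ts.tv_nsec = (delay_ms % 1000) * 1000000L;

	for (i = 0; i < max_tries; i++) {
		ret = op(ctx);
		if (ret != -EBUSY)
			break;
		if (i + 1 < max_tries)
			nanosleep(&ts, NULL);
	}
	return ret;
}

/* Hypothetical operation: busy for the first *ctx calls, then succeeds. */
static int toy_busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}
	return 0;
}
```

A bounded loop keeps the caller from spinning forever if the collision is
not as transient as expected; the last -EBUSY is then reported as is.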

>> -static void iavf_mac_add_reject(struct iavf_adapter *adapter)
>> + */
>> +void iavf_mac_add_reject(struct iavf_adapter *adapter)
>>  {
>>       struct net_device *netdev = adapter->netdev;
>>       struct iavf_mac_filter *f, *ftmp;
>
> [Severity: Medium]
> Does exporting and reusing iavf_mac_add_reject() for local rollback cause
> unintended background MAC changes?
>
> When the user requests a MAC address that is already in the filter list
> (e.g., a secondary MAC), iavf_add_filter() sets f->is_new_mac = false and
> f->add = true, and flags IAVF_FLAG_AQ_ADD_MAC_FILTER.
>
> If the synchronous operation fails locally (like the -EBUSY return above),
> iavf_set_mac() calls this function. However, iavf_mac_add_reject() only
> clears the pending state and removes the filter if f->is_new_mac is true:
>
>    if (f->is_new_mac) {
>        list_del(&f->list);
>        kfree(f);
>    }
>
> Reused filters are ignored, leaving f->add = true and the
> IAVF_FLAG_AQ_ADD_MAC_FILTER flag active. The background watchdog task will
> eventually process this flag and blindly send the MAC configuration to the
> PF, even though the VF already aborted the operation locally.

This scenario only occurs when setting the primary MAC to an address that
already exists as a secondary MAC in the filter list - an extremely rare
configuration.

Even if this occurs and the watchdog later sends the MAC to the PF, it is
harmless: the MAC is already configured on the PF (as a secondary), so the
redundant ADD_ETH_ADDR message has no adverse effect.

The common case - changing primary MAC to a new address - uses is_new_mac =
true and is handled correctly by the rollback logic.

These concerns represent theoretical edge cases that are extremely unlikely
in practice. The synchronous approach fixes a reproducible deadlock affecting
all users in production, allowing the user to retry and complete the
operation. The trade-off is justified.

^ permalink raw reply

* Re: [PATCH 4/6] dt-bindings: net: bluetooth: Document Qualcomm IPQ5018 Bluetooth controller
From: Krzysztof Kozlowski @ 2026-06-26 10:53 UTC (permalink / raw)
  To: George Moussalem
  Cc: Jens Axboe, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Johannes Berg, Jeff Johnson, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Balakrishna Godavarthi,
	Rocky Liao, Saravana Kannan, Andrew Lunn, Heiner Kallweit,
	Russell King, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Bjorn Andersson, Konrad Dybcio,
	Mathieu Poirier, Philipp Zabel, linux-block, linux-kernel,
	linux-mmc, devicetree, linux-wireless, ath10k, linux-arm-msm,
	linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <20260625-ipq5018-bluetooth-v1-4-d999be0e04f7@outlook.com>

On Thu, Jun 25, 2026 at 06:10:08PM +0400, George Moussalem wrote:
> Document the Qualcomm IPQ5018 Bluetooth controller.
> 
> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
> ---
>  .../bindings/net/bluetooth/qcom,ipq5018-bt.yaml    | 63 ++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/bluetooth/qcom,ipq5018-bt.yaml b/Documentation/devicetree/bindings/net/bluetooth/qcom,ipq5018-bt.yaml
> new file mode 100644
> index 000000000000..afd33f851858
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/bluetooth/qcom,ipq5018-bt.yaml
> @@ -0,0 +1,63 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/net/bluetooth/qcom,ipq5018-bt.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm IPQ5018 Bluetooth
> +
> +maintainers:
> +  - George Moussalem <george.moussalem@outlook.com>
> +
> +properties:
> +  compatible:
> +    enum:
> +      - qcom,ipq5018-bt
> +
> +  interrupts:
> +    items:
> +      - description:
> +          Interrupt line from the M0 Bluetooth Subsystem to the host processor

What is M0?

Anyway, this part feels completely redundant. Can the "interrupts" property
be anything other than an interrupt line from the device to the host
processor?


> +          to notify it of events such as re

This feels useful, but cut/incomplete.

> +
> +  qcom,ipc:
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    items:
> +      - items:
> +          - description: phandle to a syscon node representing the APCS registers
> +          - description: u32 representing offset to the register within the syscon
> +          - description: u32 representing the ipc bit within the register
> +    description: |
> +      These entries specify the outgoing IPC bit used for signaling the remote
> +      M0 BTSS core of a host event or for sending an ACK if the remote processor
> +      expects it.
> +
> +  qcom,rproc:
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +    description:
> +      Phandle to the remote processor node representing the M0 BTSS core.
> +
> +required:
> +  - compatible
> +  - interrupts
> +  - qcom,ipc
> +  - qcom,rproc
> +
> +allOf:
> +  - $ref: bluetooth-controller.yaml#
> +  - $ref: qcom,bluetooth-common.yaml
> +
> +unevaluatedProperties: false
> +
> +examples:
> +  - |
> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
> +
> +    bluetooth: bluetooth {

Drop unused label

> +      compatible = "qcom,ipq5018-bt";
> +
> +      qcom,ipc = <&apcs_glb 8 23>;
> +      interrupts = <GIC_SPI 162 IRQ_TYPE_EDGE_RISING>;

No firmware to load?

It feels like the remoteproc node split is fake. The qcom,rproc property
supports that impression even more. Shouldn't this simply be one device -
bluetooth? What sort of two devices do you have exactly? How can I
identify them in the hardware?

> +
> +      qcom,rproc = <&m0_btss>;

Best regards,
Krzysztof


^ permalink raw reply

* Re: [PATCH 1/6] dt-bindings: remoteproc: document M0 Bluetooth Subsystem secure PIL
From: George Moussalem @ 2026-06-26 10:51 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Jens Axboe, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Johannes Berg, Jeff Johnson, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Balakrishna Godavarthi,
	Rocky Liao, Saravana Kannan, Andrew Lunn, Heiner Kallweit,
	Russell King, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Bjorn Andersson, Konrad Dybcio,
	Mathieu Poirier, Philipp Zabel, linux-block, linux-kernel,
	linux-mmc, devicetree, linux-wireless, ath10k, linux-arm-msm,
	linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <20260626-tiny-warm-jerboa-3ba57a@quoll>

Hi Krzysztof,

On 6/26/26 14:47, Krzysztof Kozlowski wrote:
> On Thu, Jun 25, 2026 at 06:10:05PM +0400, George Moussalem wrote:
>> Document the M0 Bluetooth Subsystem remote processor core found in the
>> Qualcomm IPQ5018 SoC. Firmware loaded is authenticated via TrustZone.
>> The firmware running on the M0 core provides bluetooth functionality.
>>
>> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
>> ---
>>  .../bindings/remoteproc/qcom,m0-btss-pil.yaml      | 72 ++++++++++++++++++++++
>>  1 file changed, 72 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,m0-btss-pil.yaml b/Documentation/devicetree/bindings/remoteproc/qcom,m0-btss-pil.yaml
>> new file mode 100644
>> index 000000000000..397bb6815d71
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/remoteproc/qcom,m0-btss-pil.yaml
> 
> Use compatible as filename.

understood, will update in v2.

> 
>> @@ -0,0 +1,72 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/remoteproc/qcom,m0-btss-pil.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Qualcomm M0 BTSS Peripheral Image Loader
>> +
>> +maintainers:
>> +  - George Moussalem <george.moussalem@outlook.com>
>> +
>> +description:
>> +  Qualcomm M0 BTSS Peripheral Secure Image Loader loads firmware and powers up
>> +  the M0 BTSS remote processor core on the Qualcomm IPQ5018 SoC.
>> +
>> +properties:
>> +  compatible:
>> +    enum:
>> +      - qcom,ipq5018-btss-pil
>> +
>> +  firmware-name:
>> +    maxItems: 1
>> +    description: Firmware name for the M0 Bluetooth Subsystem core
> 
> You can drop description, pretty obvious.

will drop

> 
>> +
>> +  clocks:
>> +    items:
>> +      - description: M0 BTSS low power oscillator clock
>> +
>> +  clock-names:
>> +    items:
>> +      - const: btss_lpo_clk
> 
> Just "lpo"

will update

> 
>> +
>> +  memory-region:
>> +    items:
>> +      - description: M0 BTSS reserved memory carveout
>> +
>> +  resets:
>> +    items:
>> +      - description: M0 BTSS reset
>> +
>> +  reset-names:
>> +    items:
>> +      - const: btss_reset
> 
> Drop names. Using block name as input name is not really useful.

Will drop

> 
> No supplies? no address space? How do you actually trigger remoteproc
> startup?

No supplies and no address space. The core is booted by a
qcom_scm_auth_and_reset call to TrustZone, which authenticates the
firmware, takes the core out of reset and boots it.

> 
>> +
>> +required:
>> +  - compatible
>> +  - firmware-name
>> +  - clocks
>> +  - clock-names
>> +  - resets
>> +  - reset-names
>> +  - memory-region
>> +
>> +additionalProperties: false
> 
> Best regards,
> Krzysztof
> 


^ permalink raw reply

* Re: [patch 09/24] timekeeping: Add CLOCK_AUX support for ktime_get_snapshot_id()
From: David Woodhouse @ 2026-06-26 10:51 UTC (permalink / raw)
  To: Thomas Weißschuh, Thomas Gleixner
  Cc: LKML, Miroslav Lichvar, John Stultz, Stephen Boyd,
	Anna-Maria Behnsen, Frederic Weisbecker, Arthur Kiyanovski,
	Rodolfo Giometti, Vincent Donnefort, Marc Zyngier, Oliver Upton,
	kvmarm, Oliver Upton, Richard Cochran, netdev, Takashi Iwai,
	Miri Korenblit, Johannes Berg, Jacob Keller, Tony Nguyen,
	Saeed Mahameed, Peter Hilber, Michael S. Tsirkin, virtualization,
	linux-wireless, linux-sound
In-Reply-To: <20260626103359-66ab2b54-d36f-416b-94a4-3f3708dccced@linutronix.de>

On Fri, 2026-06-26 at 10:48 +0200, Thomas Weißschuh wrote:
> 'tkd' is also used to compute 'monoraw'. However 'tkr_raw' and 'tkr_mono'
> are the same for auxiliary clocks, so this will compute a wrong 'monoraw'.
> Instead 'monoraw' should be computed based on 'tk_core'.
> Which then also requires the sequence locking of 'tk_core'.
> 
> As you know I have a series which unifies the locking between the
> different timekeepers. Maybe we revert this patch for 7.2 and I send
> a fixed variant including the prerequisites for 7.3.
> 
> (The same goes for get_device_system_crosststamp())

No fundamental objection from me... but does it matter in practice?
Is it even reachable?

I think the only way these functions can end up being invoked for aux
clocks is via PTP_SYS_OFFSET_EXTENDED, which since commit a6d799608e6
("ptp: Switch to ktime_get_snapshot_id() for pre/post timestamps") and
some other driver-specific changes in this series, can now invoke them
with the user-requested clockid.

But those callers *only* care about ::systime, don't they? So the
problem never arises because there's no code path in which anything
actually cares about ::monoraw for AUX clocks.

If we back out the handling of AUX clocks in ktime_get_snapshot_id()
we'd have to roll back much of that other plumbing too. If we *just*
remove the 'case CLOCK_AUX ... CLOCK_AUX_LAST' hunk, we'd end up with
users being able to trigger the 'default: WARN_ON_ONCE(1)' which
follows.

So while you're right, I *think* it's harmless in practice and
reverting it safely is slightly more complex than it seems at first
glance?

^ permalink raw reply

* Re: [patch 09/24] timekeeping: Add CLOCK_AUX support for ktime_get_snapshot_id()
From: Thomas Gleixner @ 2026-06-26 10:49 UTC (permalink / raw)
  To: Thomas Weißschuh
  Cc: LKML, David Woodhouse, Miroslav Lichvar, John Stultz,
	Stephen Boyd, Anna-Maria Behnsen, Frederic Weisbecker,
	Arthur Kiyanovski, Rodolfo Giometti, Vincent Donnefort,
	Marc Zyngier, Oliver Upton, kvmarm, Oliver Upton, Richard Cochran,
	netdev, Takashi Iwai, Miri Korenblit, Johannes Berg, Jacob Keller,
	Tony Nguyen, Saeed Mahameed, Peter Hilber, Michael S. Tsirkin,
	virtualization, linux-wireless, linux-sound
In-Reply-To: <20260626103359-66ab2b54-d36f-416b-94a4-3f3708dccced@linutronix.de>

On Fri, Jun 26 2026 at 10:48, Thomas Weißschuh wrote:
> On Tue, May 26, 2026 at 07:14:13PM +0200, Thomas Gleixner wrote:
> (...)
>
>>  static inline void tk_update_aux_offs(struct timekeeper *tk, ktime_t offs)
>> @@ -1218,6 +1223,12 @@ bool ktime_get_snapshot_id(struct system
>>  		tkd = &tk_core;
>>  		offs = &tk_core.timekeeper.offs_boot;
>>  		break;
>> +	case CLOCK_AUX ... CLOCK_AUX_LAST:
>> +		tkd = aux_get_tk_data(clock_id);
>> +		if (!tkd)
>> +			return false;
>> +		offs = &tkd->timekeeper.offs_aux;
>> +		break;
>
> 'tkd' is also used to compute 'monoraw'. However 'tkr_raw' and 'tkr_mono'
> are the same for auxiliary clocks, so this will compute a wrong 'monoraw'.

AUX clocks are independent in the first place and the MONORAW part is
the "MONORAW" related to the AUX clock itself. 

> Instead 'monoraw' should be computed based on 'tk_core'.
> Which then also requires the sequence locking of 'tk_core'.

No. From a PTP and steering point of view you want the "raw" value which
is related to the AUX clock itself and not the global one.

The global MONORAW and the AUX clock MONORAW are related obviously as
they share the same conversion factors.

     MONORAW(AUX$N) = MONORAW(GLOBAL) + OFFSET(AUX$N)

We don't have that information anywhere right now and we might want to
expose it to allow user space a proper correlation, but that's an
orthogonal problem.

From a PTP/steering point of view the code is correct as is.

Thanks,

        tglx



^ permalink raw reply

* Re: [PATCH 1/6] dt-bindings: remoteproc: document M0 Bluetooth Subsystem secure PIL
From: Krzysztof Kozlowski @ 2026-06-26 10:47 UTC (permalink / raw)
  To: George Moussalem
  Cc: Jens Axboe, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Johannes Berg, Jeff Johnson, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Balakrishna Godavarthi,
	Rocky Liao, Saravana Kannan, Andrew Lunn, Heiner Kallweit,
	Russell King, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Bjorn Andersson, Konrad Dybcio,
	Mathieu Poirier, Philipp Zabel, linux-block, linux-kernel,
	linux-mmc, devicetree, linux-wireless, ath10k, linux-arm-msm,
	linux-bluetooth, netdev, linux-remoteproc
In-Reply-To: <20260625-ipq5018-bluetooth-v1-1-d999be0e04f7@outlook.com>

On Thu, Jun 25, 2026 at 06:10:05PM +0400, George Moussalem wrote:
> Document the M0 Bluetooth Subsystem remote processor core found in the
> Qualcomm IPQ5018 SoC. Firmware loaded is authenticated via TrustZone.
> The firmware running on the M0 core provides bluetooth functionality.
> 
> Signed-off-by: George Moussalem <george.moussalem@outlook.com>
> ---
>  .../bindings/remoteproc/qcom,m0-btss-pil.yaml      | 72 ++++++++++++++++++++++
>  1 file changed, 72 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/remoteproc/qcom,m0-btss-pil.yaml b/Documentation/devicetree/bindings/remoteproc/qcom,m0-btss-pil.yaml
> new file mode 100644
> index 000000000000..397bb6815d71
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/remoteproc/qcom,m0-btss-pil.yaml

Use compatible as filename.

> @@ -0,0 +1,72 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/remoteproc/qcom,m0-btss-pil.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm M0 BTSS Peripheral Image Loader
> +
> +maintainers:
> +  - George Moussalem <george.moussalem@outlook.com>
> +
> +description:
> +  Qualcomm M0 BTSS Peripheral Secure Image Loader loads firmware and powers up
> +  the M0 BTSS remote processor core on the Qualcomm IPQ5018 SoC.
> +
> +properties:
> +  compatible:
> +    enum:
> +      - qcom,ipq5018-btss-pil
> +
> +  firmware-name:
> +    maxItems: 1
> +    description: Firmware name for the M0 Bluetooth Subsystem core

You can drop description, pretty obvious.

> +
> +  clocks:
> +    items:
> +      - description: M0 BTSS low power oscillator clock
> +
> +  clock-names:
> +    items:
> +      - const: btss_lpo_clk

Just "lpo"

> +
> +  memory-region:
> +    items:
> +      - description: M0 BTSS reserved memory carveout
> +
> +  resets:
> +    items:
> +      - description: M0 BTSS reset
> +
> +  reset-names:
> +    items:
> +      - const: btss_reset

Drop names. Using block name as input name is not really useful.

No supplies? no address space? How do you actually trigger remoteproc
startup?

> +
> +required:
> +  - compatible
> +  - firmware-name
> +  - clocks
> +  - clock-names
> +  - resets
> +  - reset-names
> +  - memory-region
> +
> +additionalProperties: false

Best regards,
Krzysztof


^ permalink raw reply

* Re: [PATCH net v2 1/1] net/sched: sch_teql: Introduce slaves_lock to avoid race condition and UAF
From: Jamal Hadi Salim @ 2026-06-26 10:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, jiri, victor, security,
	zdi-disclosures, stable, kernel test robot
In-Reply-To: <20260624224016.24018-1-jhs@mojatatu.com>

"

On Wed, Jun 24, 2026 at 6:40 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> The teql master->slaves singly linked list is not protected against
> multiple writes. It can be modified concurrently from teql_master_xmit(),
> teql_dequeue(), teql_init() and teql_destroy() without holding any list
> lock or RCU protection.
>
> zdi-disclosures@trendmicro.com has demonstrated that the qdisc is freed
> after an RCU grace period, but teql_master_xmit() running on another
> CPU can still hold a stale pointer into the list, resulting in a
> slab-use-after-free:
>
> BUG: KASAN: slab-use-after-free in teql_destroy+0x3ca/0x440 linux/net/sched/sch_teql.c:142
> Read of size 8 at addr ffff88802923aa80 by task ip/10024
>
> The zdi-disclosures@trendmicro.com repro created concurrent AF_PACKET
> senders on a teql device against a thread that repeatedly adds/deletes the
> slave qdisc, together with a SLUB spray that reclaims the freed slot; the
> resulting UAF is controllable enough to be turned into a read/write
> primitive against the freed qdisc object.
>
> The fix?
> Add a per-master slaves_lock spinlock that serializes all mutations of
> master->slaves and the NEXT_SLAVE() links in teql_destroy() and
> teql_qdisc_init(). teql_master_xmit() also takes the same slaves_lock
> around those updates.
> Annotate master->slaves and the per-slave ->next pointer with __rcu and
> use the appropriate RCU accessors everywhere they are touched:
> rcu_assign_pointer() on the writer side (under slaves_lock),
> rcu_dereference_protected() for the writer-side loads (also under
> slaves_lock), rcu_dereference_bh() for the loads in teql_master_xmit() and
> rtnl_dereference() for the loads in teql_master_open()/teql_master_mtu(),
> which run under RTNL.
> Pair this with rcu_read_lock_bh()/rcu_read_unlock_bh() around the list
> traversal in teql_master_xmit(), so that readers either observe a fully
> linked list or are deferred until the in-flight mutation completes. The two
> early-return paths in teql_master_xmit() are updated to release the RCU-bh
> read-side critical section before returning, since leaving it held would
> disable BH on that CPU for good.
>

sashiko-gemini's complaints:
https://sashiko.dev/#/patchset/20260624224016.24018-1-jhs%40mojatatu.com
seem bogus to me (someone correct me if I am wrong). I am only going
to address the first claim, a "TOCTOU / resurrection race in
teql_master_xmit()".
teql_master_xmit() holds rcu_read_lock_bh() across the entire
traversal. The free in teql_destroy() can only proceed once the qdisc's
RCU grace period has elapsed, so where is this TOCTOU? Even if this
were true, both calls hold the slaves_lock.
The other issues are of a similar nature.

OTOH, sashiko-claude
(https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260624224016.24018-1-jhs%40mojatatu.com)
does make some valid claims, but they are low value, so I am not sure a
resend is worth it.
For example, claim 1 asks "Should the changelog mention this
teql_dequeue() site too?" Sure, I can - but just because I provided
extra information in the commit log, which I could have omitted, now I
have to add more info? ;-> The second claim is "rcu_dereference_bh()
should be rcu_dereference_protected() on writer side". Sparse didn't
complain, and I don't see this as breakage but rather as a consistency
measure.

Unless I am missing something ..

cheers,
jamal

^ permalink raw reply

* Re: [PATCH net-next 6/9] net: sparx5: vcap: add lpm vcap implementation
From: Jens Emil Schulz Ostergaard @ 2026-06-26 10:16 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: horatiu.vultur, UNGLinuxDriver, andrew+netdev, davem, edumazet,
	pabeni, daniel.machon, Steen.Hegelund, kees, gustavoars, netdev,
	linux-kernel, linux-arm-kernel, linux-hardening
In-Reply-To: <20260615233849.1042290-1-kuba@kernel.org>

On Mon, 2026-06-15 at 16:38 -0700, Jakub Kicinski wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> net: sparx5: vcap: add lpm vcap implementation
> 
> Add the implementation for the LPM VCAP for lan969x and sparx5. The LPM
> VCAP has a static keyset configuration, and a single lookup dedicated to
> LPM, so this is straight forward, and many functions can use existing
> functionality.
> 
> > diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c
> > index 95b93e46a41dc..e25e759c24f60 100644
> > --- a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c
> > +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c
> > @@ -1083,6 +1146,46 @@ static void sparx5_vcap_es2_cache_write(struct sparx5 *sparx5,
> >       }
> >  }
> > 
> > +static void sparx5_vcap_lpm_cache_write(struct sparx5 *sparx5,
> > +                                     struct vcap_admin *admin,
> > +                                     enum vcap_selection sel,
> > +                                     u32 start,
> > +                                     u32 count)
> > +{
> > +     u32 *keystr, *mskstr, *actstr;
> > +     int idx;
> > +
> > +     keystr = &admin->cache.keystream[start];
> > +     mskstr = &admin->cache.maskstream[start];
> > +     actstr = &admin->cache.actionstream[start];
> > +
> > +     switch (sel) {
> > +     case VCAP_SEL_ENTRY:
> > +             for (idx = 0; idx < count; ++idx) {
> > +                     /* Avoid 'match-off' by setting value & mask */
> > +                     spx5_wr(keystr[idx] & mskstr[idx], sparx5,
> > +                             VCAP_SUPER_VCAP_ENTRY_DAT(idx));
> 
> [Severity: High]
> Are concurrent writes to the shared VCAP_SUPER_VCAP_ENTRY_DAT registers
> safe here without a global hardware lock across the different VCAP types?
> 
> It looks like IS0 and IS2 rules are updated via vcap_add_rule() under the
> rtnl_lock(). However, the LPM rules are updated under the
> sparx5->router->lock inside a workqueue.
> 
> Since both execution paths share the Super VCAP hardware block, could
> concurrent updates from TC (IS0/IS2) and routing (LPM) race on these shared
> cache registers, and also on VCAP_SUPER_CTRL in sparx5_vcap_super_update(),
> garbling the hardware command and corrupting the cache memory?
> 

Very nice catch, I missed this. This is even reachable before this series
because the debugfs dump also uses these registers, and runs outside of
rtnl.

I will send a fix to net which replaces the per instance locks with a single
vcap_control global mutex.
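
To make the idea concrete, here is a small userspace sketch of one lock
guarding a register window shared by several users. It is not the driver
code: 'struct shared_window' and the helpers are hypothetical, and a C11
atomic_flag spinlock stands in for the mutex only to keep the example
dependency-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define CACHE_WORDS 4

/* Hypothetical model of a staging window shared by several VCAP types. */
struct shared_window {
	atomic_flag lock;			/* one lock for every user */
	uint32_t entry_dat[CACHE_WORDS];	/* staging registers */
	uint32_t ctrl;				/* last command issued */
	uint32_t committed[CACHE_WORDS];	/* what the command applied */
};

static void window_init(struct shared_window *w)
{
	memset(w->entry_dat, 0, sizeof(w->entry_dat));
	memset(w->committed, 0, sizeof(w->committed));
	w->ctrl = 0;
	atomic_flag_clear(&w->lock);
}

static void window_lock(struct shared_window *w)
{
	while (atomic_flag_test_and_set_explicit(&w->lock,
						 memory_order_acquire))
		;	/* spin until the current holder releases the lock */
}

static void window_unlock(struct shared_window *w)
{
	atomic_flag_clear_explicit(&w->lock, memory_order_release);
}

/*
 * Stage the words and issue the command as one critical section, so a
 * second caller cannot overwrite the staging registers in between.
 */
static void window_update(struct shared_window *w, uint32_t cmd,
			  const uint32_t *words)
{
	window_lock(w);
	memcpy(w->entry_dat, words, sizeof(w->entry_dat));
	w->ctrl = cmd;
	memcpy(w->committed, w->entry_dat, sizeof(w->committed));
	window_unlock(w);
}
```

Every path that touches the staging registers, including a dump path,
would need to go through the same lock for this to help.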

> > +                     spx5_wr(~mskstr[idx], sparx5,
> > +                             VCAP_SUPER_VCAP_MASK_DAT(idx));
> > +             }
> > +             break;
> 
> [ ... ]
> 
> > @@ -1379,6 +1523,7 @@ static void sparx5_vcap_update(struct net_device *ndev,
> >       struct sparx5 *sparx5 = port->sparx5;
> > 
> >       switch (admin->vtype) {
> > +     case VCAP_TYPE_LPM:
> >       case VCAP_TYPE_IS0:
> >       case VCAP_TYPE_IS2:
> >               sparx5_vcap_super_update(sparx5, cmd, sel, addr);


^ permalink raw reply

* Re: [PATCH net-next 1/9] net: microchip: vcap: fix rule move for rules of coprime size
From: Jens Emil Schulz Ostergaard @ 2026-06-26 10:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: horatiu.vultur, UNGLinuxDriver, andrew+netdev, davem, edumazet,
	pabeni, daniel.machon, Steen.Hegelund, kees, gustavoars, netdev,
	linux-kernel, linux-arm-kernel, linux-hardening
In-Reply-To: <20260615233845.1042219-1-kuba@kernel.org>

On Mon, 2026-06-15 at 16:38 -0700, Jakub Kicinski wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> Subject: net: microchip: vcap: fix rule move for rules of coprime size
> 
> This patch addresses a discrepancy in how software and hardware move rule
> addresses on insert and delete for the LPM VCAP. It introduces a uniform
> offset computation aligned to the least common multiple of all rule sizes
> in a block. This ensures that hardware and software representations remain
> synchronized and all rules land on correctly aligned addresses.
> 
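The alignment idea in the summary above can be sketched in plain C as
follows. This is not the patch's code: the helper names are hypothetical,
the rule sizes used in the usage note are illustrative only, and rounding
the offset up (rather than down) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
	while (b) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

static uint32_t lcm_u32(uint32_t a, uint32_t b)
{
	/* Divide first to reduce the risk of overflow. */
	return a / gcd_u32(a, b) * b;
}

/* LCM of all rule sizes present in a block; sizes must be non-zero. */
static uint32_t lcm_of_sizes(const uint32_t *sizes, size_t n)
{
	uint32_t l = 1;

	for (size_t i = 0; i < n; i++) {
		assert(sizes[i] != 0);
		l = lcm_u32(l, sizes[i]);
	}
	return l;
}

/* Round a requested move distance up to a multiple of 'align'. */
static uint32_t round_up_to(uint32_t offset, uint32_t align)
{
	return (offset + align - 1) / align * align;
}
```

For a block holding rules of sizes 2 and 3, the LCM is 6, so a move by a
multiple of 6 keeps every rule on an address aligned to its own size,
whereas a move by 3 would misalign the size-2 rules.
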
> > diff --git a/drivers/net/ethernet/microchip/vcap/vcap_api.c b/drivers/net/ethernet/microchip/vcap/vcap_api.c
> > index 0fdb5e363badc..6946fd7384581 100644
> > --- a/drivers/net/ethernet/microchip/vcap/vcap_api.c
> > +++ b/drivers/net/ethernet/microchip/vcap/vcap_api.c
> 
> [ ... ]
> 
> > @@ -2275,8 +2336,7 @@ int vcap_add_rule(struct vcap_rule *rule)
> >                      __func__, __LINE__, ret);
> >               goto out;
> >       }
> > -     if (move.count > 0)
> > -             vcap_move_rules(ri, &move);
> > +     vcap_move_rules(ri, &move);
> > 
> >       /* Set the counter to zero */
> >       ret = vcap_write_counter(ri, &ctr);
> 
> [Severity: High]
> This is a pre-existing issue, but does vcap_add_rule() leak the newly
> duplicated rule if operations like vcap_write_counter() or
> vcap_write_rule() fail later on?
> 
> If a failure occurs, the code takes the goto out path:
> 
> vcap_add_rule() {
>     ...
>     ret = vcap_write_rule(ri);
>     if (ret) {
>         pr_err("%s:%d: rule write error: %d\n", __func__, __LINE__, ret);
>         goto out;
>     }
> out:
>     mutex_unlock(&ri->admin->lock);
>     return ret;
> }
> 
> This seems to exit without reverting the earlier list insertions and
> hardware moves.
> 
> Could this leave the software and hardware states permanently inconsistent
> and leak the newly duplicated rule?
> 

I think this is latent, and not actually reachable by current callers.

1. vcap_write_counter always returns 0
2. vcap_encode_rule can not error if vcap_val_rule is called before vcap_add_rule
3. vcap_write_rule can not error if vcap_val_rule is called before vcap_add_rule

So I propose this is fixed in a separate series.
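
For reference, the kind of unwinding the review asks about could look
roughly like the userspace sketch below. It is not the VCAP API: the types,
'add_rule()' and the 'hw_write_*' stand-ins are hypothetical, and it only
models reverting the list insertion, not any hardware moves.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct rule {
	int id;
	struct rule *next;
};

struct rule_list {
	struct rule *head;
};

/* Stand-in for a hardware write: returns 0 or a negative errno. */
typedef int (*write_fn)(const struct rule *r);

static int hw_write_ok(const struct rule *r)
{
	(void)r;
	return 0;
}

static int hw_write_fail(const struct rule *r)
{
	(void)r;
	return -EIO;
}

static int add_rule(struct rule_list *list, int id, write_fn write_hw)
{
	struct rule *r = malloc(sizeof(*r));
	int ret;

	if (!r)
		return -ENOMEM;

	r->id = id;
	r->next = list->head;
	list->head = r;		/* software state is updated first */

	ret = write_hw(r);
	if (ret)
		goto err_unlink;
	return 0;

err_unlink:
	list->head = r->next;	/* revert the list insertion */
	free(r);		/* and release the new entry */
	return ret;
}
```

On failure the list is left exactly as it was before the call, which is
the property the current 'goto out' path does not provide.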

> [ ... ]
> 
> > @@ -2488,59 +2548,52 @@ int vcap_mod_rule(struct vcap_rule *rule)
> >  }
> >  EXPORT_SYMBOL_GPL(vcap_mod_rule);
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but does vcap_mod_rule() discard
> the active software rule pointer and overwrite hardware with stale addresses?
> 
> Looking at vcap_mod_rule(), the function validates the rule using
> vcap_get_locked_rule() but ignores the returned pointer:
> 
> vcap_mod_rule(struct vcap_rule *rule) {
>     struct vcap_rule_internal *ri = to_intrule(rule);
>     ...
>     if (!vcap_get_locked_rule(ri->vctrl, ri->data.id))
>         return -ENOENT;
>     ...
>     err = vcap_encode_rule(ri);
>     if (err)
>         goto out;
> 
>     err = vcap_write_rule(ri);
> }
> 
> Because the function continues to use the caller provided copy ri, it
> doesn't see updates to the real rule in the admin->rules list.
> 
> If the real rule's hardware address was previously shifted due to insertions
> or deletions, could writing to the stale address in ri->addr corrupt other
> rules in the hardware VCAP?
> 

This is also true, but again latent. To trigger this we would need concurrency
against the same vcap instance. Before this series vcap_mod_rule was only
called under rtnl. This series introduces a call outside rtnl, but the new LPM
VCAP is not usable by tc, only by the routing module, which serializes
everything under one mutex.

So I think this should be fixed in a separate series.

> [ ... ]
> 
> One other pre-existing issue noticed while reviewing this subsystem:
> 
> [Severity: High]
> This is a pre-existing issue, but is it safe to iterate over the admin->rules
> list across different VCAP instances without acquiring admin->lock for
> each instance?
> 
> In vcap_rule_exists():
> 
> static bool vcap_rule_exists(struct vcap_control *vctrl, u32 id) {
>     ...
>     list_for_each_entry(admin, &vctrl->list, list)
>         list_for_each_entry(ri, &admin->rules, list)
>             if (ri->data.id == id)
>                 return true;
>     return false;
> }
> 
> If another thread is concurrently inserting or deleting a rule on another
> VCAP instance while holding that instance's lock, could this unprotected
> concurrent access encounter a data race and dereference a poisoned pointer?

Yes, this becomes reachable with this series, due to calls outside rtnl.
It is a symptom of a wider issue with the per-instance locking in the VCAP
API. Sashiko found another existing bug with the shared SUPER vcap registers,
also caused by this, and that one is reachable in mainline. I will send a
fix to net for the vcap locking, which will also fix this problem, then send
v2 once that is settled.

> --
> pw-bot: cr


^ permalink raw reply

* [ANNOUNCEMENT] LPC 2026: System Monitoring and Observability Microconference
From: Breno Leitao @ 2026-06-26  9:56 UTC (permalink / raw)
  To: linux-acpi, linux-hwmon, netdev, linux-kernel, linux-arm-kernel,
	kernel-team, linux-mm
  Cc: Breno Leitao, kerneljasonxing, iipeace5, gavinguo, linux,
	amscanne, sj, gpiccoli, Daniel Gomez, mfo, platform-driver-x86,
	acpica-devel

We are pleased to announce the Call for Proposals (CFP) for another
edition of the System Monitoring and Observability Microconference, this
time at the 2026 Linux Plumbers Conference (LPC), taking place in
Prague, Czechia, from Oct 5-7, 2026.

  https://lpc.events/event/20/sessions/262/

This microconference provides a valuable forum for key engineering areas
such as:

   - Kernel Health and Runtime Monitoring
   - Hardware Integration and Error Detection
   - Correlation of Issues (crashes, stalls, bugs)
   - Virtualization Stack Monitoring
   - Memory Management Monitoring and Observability
   - Anomaly Detection Algorithms for System Behavior
   - Automated Analysis, Remediation and Post-Mortem Analysis

The purpose of each talk is to share challenges and discuss potential
improvements. Sessions will last 20 to 30 minutes and aim to encourage
brainstorming and open dialogue about ongoing issues rather than
delivering immediate solutions.

The conference acts as both a knowledge-sharing platform and a strategic
venue for guiding the future of kernel technologies to better meet the
demands of large-scale infrastructure.

We invite you to submit your proposals here:
	https://lpc.events/event/20/abstracts/

Please select track "Linux System Monitoring and Observability MC"

^ permalink raw reply
