* Re: [PATCH v4] vhost/vdpa: reject overflowing PA map page counts on 32-bit
From: Michael S. Tsirkin @ 2026-06-24 22:10 UTC (permalink / raw)
To: Yousef Alhouseen
Cc: Jason Wang, Eugenio Pérez, kvm, virtualization, netdev,
linux-kernel
In-Reply-To: <CAMuQ4bX-iDvcUOPPY+NLz95tkRJYwWqvzAr=U48uNaub_HZLGw@mail.gmail.com>
On Wed, Jun 24, 2026 at 03:02:02PM -0700, Yousef Alhouseen wrote:
> vhost_vdpa_pa_map() adds the IOVA page offset to the user-controlled map
> size before computing the number of pages to pin. On 32-bit systems,
> where unsigned long is narrower than u64, that addition can overflow and
> the code can pin and map fewer pages than the requested IOTLB range.
>
> Reject sizes that overflow the unsigned long page-count calculation.
>
> Fixes: 22af48cf91aa ("vdpa: factor out vhost_vdpa_pa_map() and
> vhost_vdpa_pa_unmap()")
still 2 lines?
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Yousef Alhouseen <alhouseenyousef@gmail.com>
> ---
> Changes in v4:
> - Keep the Fixes tag on one line.
> - Add Michael's Acked-by tag.
>
> Changes in v3:
> - Add the Fixes tag.
>
> Changes in v2:
> - Clarify that the overflow is on 32-bit systems.
> - Drop the unrelated memlock check change.
>
> drivers/vhost/vdpa.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ac55275fa..38b28ed3d 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -1102,6 +1102,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
> unsigned int gup_flags = FOLL_LONGTERM;
> unsigned long npages, cur_base, map_pfn, last_pfn = 0;
> unsigned long lock_limit, sz2pin, nchunks, i;
> + unsigned long page_offset;
> u64 start = iova;
> long pinned;
> int ret = 0;
> @@ -1114,7 +1115,13 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
> if (perm & VHOST_ACCESS_WO)
> gup_flags |= FOLL_WRITE;
>
> - npages = PFN_UP(size + (iova & ~PAGE_MASK));
> + page_offset = iova & ~PAGE_MASK;
> + if (size > ULONG_MAX - page_offset) {
> + ret = -EINVAL;
> + goto free;
> + }
> +
> + npages = PFN_UP(size + page_offset);
> if (!npages) {
> ret = -EINVAL;
> goto free;
> --
> 2.54.0
^ permalink raw reply
* Re: [PATCH v29 4/5] sfc: obtain and map cxl range using devm_cxl_probe_mem
From: Dan Williams (nvidia) @ 2026-06-24 22:10 UTC (permalink / raw)
To: alejandro.lucero-palau, linux-cxl, netdev, dan.j.williams,
edward.cree, davem, kuba, pabeni, edumazet, dave.jiang
Cc: Alejandro Lucero, Edward Cree
In-Reply-To: <20260622124010.2192888-5-alejandro.lucero-palau@amd.com>
alejandro.lucero-palau@ wrote:
> From: Alejandro Lucero <alucerop@amd.com>
>
> Use core API for safely obtain the CXL range linked to an HDM committed
> by the BIOS. Map such a range for being used as the ctpio buffer.
>
> A potential user space action through sysfs unbinding or core cxl
> modules remove will trigger sfc driver device detachment, with that case
> not racing with this mapping as this is done during driver probe and
> therefore protected with device lock against those user space actions.
>
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Acked-by: Edward Cree <ecree.xilinx@gmail.com>
> ---
> drivers/net/ethernet/sfc/efx.c | 2 ++
> drivers/net/ethernet/sfc/efx_cxl.c | 23 +++++++++++++++++++++++
> drivers/net/ethernet/sfc/efx_cxl.h | 3 +++
> 3 files changed, 28 insertions(+)
>
> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index 61cbb6cfc360..3806cd3dd7f4 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -984,6 +984,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
> efx_fini_io(efx);
>
> probe_data = container_of(efx, struct efx_probe_data, efx);
> + efx_cxl_exit(probe_data);
>
> pci_dbg(efx->pci_dev, "shutdown successful\n");
>
> @@ -1242,6 +1243,7 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
> return 0;
>
> fail3:
> + efx_cxl_exit(probe_data);
> efx_fini_io(efx);
> fail2:
> efx_fini_struct(efx);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 18b535b3ea40..3e7c950f83e9 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -18,6 +18,7 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
> {
> struct efx_nic *efx = &probe_data->efx;
> struct pci_dev *pci_dev = efx->pci_dev;
> + struct range cxl_pio_range;
> struct efx_cxl *cxl;
> u16 dvsec;
> int rc;
> @@ -73,9 +74,31 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
> return -ENODEV;
> }
>
> + cxl->cxlmd = devm_cxl_probe_mem(&cxl->cxlds, &cxl_pio_range);
> + if (IS_ERR(cxl->cxlmd)) {
> + pci_err(pci_dev, "CXL accel memdev creation failed\n");
> + return PTR_ERR(cxl->cxlmd);
> + }
> +
> + cxl->ctpio_cxl = ioremap_wc(cxl_pio_range.start,
> + range_len(&cxl_pio_range));
> + if (!cxl->ctpio_cxl) {
> + pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
> + &cxl_pio_range);
> + return -ENOMEM;
> + }
> +
> probe_data->cxl = cxl;
>
> return 0;
> }
>
> +void efx_cxl_exit(struct efx_probe_data *probe_data)
> +{
If you are going to have an explicit efx_cxl_exit() then I would also
add an explicit unregistration of the memdev. This would also fix the
Sashiko report about pci_disable_device() running while the cxl_memdev
is still registered. Unfortunately, mixing devm and explicit unwind is
always fraught.
Let me know if this passes your testing, and I can send it out as a
standalone patch. You could also use it to unwind if the ioremap()
fails.
-- 8< --
From 47940f7d6de6dd1039d0e445eb944ca96ad5eb3a Mon Sep 17 00:00:00 2001
From: Dan Williams <djbw@kernel.org>
Date: Wed, 24 Jun 2026 11:47:24 -0700
Subject: [PATCH] cxl: Add devm_cxl_remove_mem()
Undo devm_cxl_probe_mem() without causing the host driver to unload.
The expectation of devm_cxl_probe_mem() is that upon attach the caller
depends on the CXL topology not being torn down. If it is then it is
escalated to a 'remove' event on the caller. However, if the caller wants
to cancel their interest in CXL operation, they should be able to do so
without that side effect.
This is useful for setup scenarios where CXL successfully attaches, but
some other event precludes CXL operation.
Cc: Alejandro Lucero <alucerop@amd.com>
Signed-off-by: Dan Williams <djbw@kernel.org>
---
include/cxl/cxl.h | 1 +
drivers/cxl/core/memdev.c | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+)
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index 016c74fb747c..cff2922b0d4a 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -226,4 +226,5 @@ struct cxl_dev_state *_devm_cxl_dev_state_create(struct device *dev,
struct cxl_memdev *devm_cxl_probe_mem(struct cxl_dev_state *cxlds,
struct range *range);
+void devm_cxl_remove_mem(struct cxl_memdev *cxlmd);
#endif /* __CXL_CXL_H__ */
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 33a3d2e7b13a..aa35876f0067 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -1185,6 +1185,31 @@ struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
}
EXPORT_SYMBOL_FOR_MODULES(__devm_cxl_add_memdev, "cxl_mem");
+static struct cxl_attach_region *cancel_probe_mem_teardown(struct cxl_memdev *cxlmd)
+{
+ struct cxl_attach_region *attach;
+
+ guard(device)(&cxlmd->dev);
+ attach = container_of(cxlmd->attach, typeof(*attach), attach);
+ cxlmd->attach = NULL;
+ return attach;
+}
+
+void devm_cxl_remove_mem(struct cxl_memdev *cxlmd)
+{
+ struct cxl_attach_region *attach;
+ struct cxl_dev_state *cxlds;
+
+ if (IS_ERR(cxlmd))
+ return;
+
+ attach = cancel_probe_mem_teardown(cxlmd);
+ cxlds = cxlmd->cxlds;
+ devm_release_action(cxlds->dev, cxl_memdev_unregister, cxlmd);
+ devm_kfree(cxlds->dev, attach);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_remove_mem, "CXL");
+
static void sanitize_teardown_notifier(void *data)
{
struct cxl_memdev_state *mds = data;
--
2.54.0
^ permalink raw reply related
* Re: [PATCH net] eth: fbnic: fix race between concurrent hwmon sensor reads
From: Alexander Duyck @ 2026-06-24 22:22 UTC (permalink / raw)
To: Andrew Lunn
Cc: Zinc Lim, Alexander Duyck, Jakub Kicinski, Andrew Lunn,
David S. Miller, Eric Dumazet, Paolo Abeni, kernel-team, netdev,
linux-kernel, zinclim
In-Reply-To: <f2ecb668-ec56-4c84-ac97-474a2d761005@lunn.ch>
On Wed, Jun 24, 2026 at 2:56 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Wed, Jun 24, 2026 at 11:51:29PM +0200, Andrew Lunn wrote:
> > On Wed, Jun 24, 2026 at 01:05:14PM -0700, Zinc Lim wrote:
> > > From: Zinc Lim <limzhineng2@gmail.com>
> > >
> > > Reading an hwmon sensor
> >
> > Since this is a hwmon related patch, it is a good idea to Cc: the
> > hwmon Maintainer.
> >
> > I don't know hwmon too well, i've never had to look at locking, but...
> >
> > static int hwmon_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
> > {
> > struct hwmon_thermal_data *tdata = thermal_zone_device_priv(tz);
> > struct hwmon_device *hwdev = to_hwmon_device(tdata->dev);
> > int ret;
> > long t;
> >
> > guard(mutex)(&hwdev->lock);
> >
> > ret = hwdev->chip->ops->read(tdata->dev, hwmon_temp, hwmon_temp_input,
> > tdata->index, &t);
> >
> > Could you explain why this lock in the hwmon core is not sufficient?
> > It could be i'm reading it wrong, but it looks like the lock is shared
> > by all hwmon instanced registered by a
> > hwmon_device_register_with_info() call.
>
> Documentation/hwmon/hwmon-kernel-api.rst
>
> When using ``[devm_]hwmon_device_register_with_info()`` to register the
> hardware monitoring device, accesses using the associated access functions
> are serialised by the hardware monitoring core. If a driver needs locking
> for other functions such as interrupt handlers or for attributes which are
> fully implemented in the driver, hwmon_lock() and hwmon_unlock() can be used
> to ensure that calls to those functions are serialized.
I did some digging and it looks like 3ad2a7b9b15d ("hwmon: Serialize
accesses in hwmon core") landed into 6.18. This was a patch put
together based on issues reported in an earlier kernel version and
explains how this got missed. So what we have is a redundant fix. We
can drop this for upstream and probably look at pulling the upstream
fix into our kernels that were having the issues.
Thanks for the review feedback.
- Alex
^ permalink raw reply
* [PATCH net v2 0/2] Fix MANA RX with bounce buffering
From: Dexuan Cui @ 2026-06-24 22:26 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, kotaranov, horms, ernis, dipayanroy, kees,
jacob.e.keller, ssengar, linux-hyperv, netdev, linux-kernel,
linux-rdma
With swiotlb=force, the MANA NIC fails to work properly due to commit
730ff06d3f5c ("net: mana: Use page pool fragments for RX buffers instead
of full pages to improve memory efficiency.")
Dipayaan tried to fix this by avoiding page pool frags when bounce
buffering is in use [1][2]. However, that is not a clean solution: no
other NIC drivers need to explicitly check whether bounce buffering is
in use. It is also not good for throughput, since
dma_map_single()/dma_unmap_single() are then called for each incoming
packet.
In fact, page pool frags can still be used with the standard MTU of
1500: all we need is to add page_pool_dma_sync_for_cpu() before the CPU
reads the incoming packet, so I implemented that in v1 [3].
As Simon suggested [4], this version splits v1 into two patches:
Patch 1 adds page_pool_dma_sync_for_cpu().
Patch 2 validates the packet length reported by the NIC.
There is no functional difference between v1 and v2, so I am keeping
Haiyang's Reviewed-by tag in v2.
Please review. Thanks!
Note that, with jumbo MTU and XDP, page pool frags are not used, and
dma_map_single()/dma_unmap_single() are still called for each incoming
packet, causing poor throughput with swiotlb=force; see
mana_get_rxbuf_cfg() and mana_refill_rx_oob() -> mana_get_rxfrag().
The jumbo MTU/XDP issue will be addressed later since that needs more
consideration if we want to use page pool with PP_FLAG_DMA_MAP there:
e.g., for XDP, the received packet can be transmitted in place, i.e. the
same RX buffer can be used as a TX buffer:
mana_rx_skb() -> mana_xdp_tx() -> mana_start_xmit() -> mana_map_skb().
In mana_create_page_pool(), we may have to set pprm.dma_dir to
DMA_BIDIRECTIONAL if XDP is in use:
pprm.dma_dir = mana_xdp_get(mpc) ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
In the case of XDP, the next issue is that mana_rx_skb() -> ... ->
mana_map_skb() appears to call dma_map_single() on an RX buffer allocated
from a page pool created with PP_FLAG_DMA_MAP, which seems incorrect.
Any thoughts?
[1] https://lore.kernel.org/all/ae91hyrLf4n23XE6@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/#r
[2] https://lore.kernel.org/all/ae9pxvJfkAZYfKMf@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
[3] https://lore.kernel.org/all/20260618035029.249361-1-decui@microsoft.com/
[4] https://lore.kernel.org/all/20260619090514.GT827683@horms.kernel.org/
Dexuan Cui (2):
net: mana: Sync page pool RX frags for CPU
net: mana: Validate the packet length reported by the NIC
drivers/net/ethernet/microsoft/mana/mana_en.c | 61 +++++++++++++++----
include/net/mana/mana.h | 8 +++
2 files changed, 58 insertions(+), 11 deletions(-)
--
2.34.1
^ permalink raw reply
* [PATCH net v2 2/2] net: mana: Validate the packet length reported by the NIC
From: Dexuan Cui @ 2026-06-24 22:26 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, kotaranov, horms, ernis, dipayanroy, kees,
jacob.e.keller, ssengar, linux-hyperv, netdev, linux-kernel,
linux-rdma
Cc: stable
In-Reply-To: <20260624222605.1794719-1-decui@microsoft.com>
Validate the packet length reported in the RX CQE before using it as a DMA
sync length or passing it to skb processing. The CQE is supplied by the
NIC device and should not be blindly trusted.
Cc: stable@vger.kernel.org
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
Changes since v1:
v1 is split into two patches in the v2.
Add Haiyang's Reviewed-by.
drivers/net/ethernet/microsoft/mana/mana_en.c | 24 +++++++++++++++----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 1875bffd82b7..0b44c51ae6ec 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2190,12 +2190,26 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
rxbuf_oob = &rxq->rx_oobs[curr];
WARN_ON_ONCE(rxbuf_oob->wqe_inf.wqe_size_in_bu != 1);
- mana_refill_rx_oob(dev, rxq, rxbuf_oob, pktlen, &old_buf, &old_fp);
+ if (unlikely(pktlen > rxq->datasize)) {
+ /* Increase it even if mana_rx_skb() isn't called. */
+ rxq->rx_cq.work_done++;
- /* Unsuccessful refill will have old_buf == NULL.
- * In this case, mana_rx_skb() will drop the packet.
- */
- mana_rx_skb(old_buf, old_fp, oob, rxq, i);
+ ++ndev->stats.rx_dropped;
+ netdev_warn_once(ndev,
+ "Dropped oversized RX packet: len=%u, datasize=%u\n",
+ pktlen, rxq->datasize);
+
+ /* Reuse the RX buffer since rxbuf_oob is unchanged. */
+ } else {
+
+ mana_refill_rx_oob(dev, rxq, rxbuf_oob, pktlen,
+ &old_buf, &old_fp);
+
+ /* Unsuccessful refill will have old_buf == NULL.
+ * In this case, mana_rx_skb() will drop the packet.
+ */
+ mana_rx_skb(old_buf, old_fp, oob, rxq, i);
+ }
mana_move_wq_tail(rxq->gdma_rq,
rxbuf_oob->wqe_inf.wqe_size_in_bu);
--
2.34.1
^ permalink raw reply related
* [PATCH net v2 1/2] net: mana: Sync page pool RX frags for CPU
From: Dexuan Cui @ 2026-06-24 22:26 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, kotaranov, horms, ernis, dipayanroy, kees,
jacob.e.keller, ssengar, linux-hyperv, netdev, linux-kernel,
linux-rdma
Cc: stable
In-Reply-To: <20260624222605.1794719-1-decui@microsoft.com>
MANA allocates RX buffers from page pool fragments when frag_count is
greater than 1. In that case the buffers remain DMA mapped by page pool
and the RX completion path does not call dma_unmap_single(). As a result,
the implicit sync-for-CPU normally performed by dma_unmap_single() is
missing before the packet data is passed to the networking stack.
This breaks RX on configurations which require explicit DMA syncing, for
example when booted with swiotlb=force.
Fix this by recording the page pool page and DMA sync offset when the RX
buffer is allocated, and syncing the received packet range for CPU access
before handing the RX buffer to the stack.
Fixes: 730ff06d3f5c ("net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.")
Cc: stable@vger.kernel.org
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
Changes since v1:
v1 is split into two patches in the v2.
Add Haiyang's Reviewed-by.
drivers/net/ethernet/microsoft/mana/mana_en.c | 39 +++++++++++++++----
include/net/mana/mana.h | 8 ++++
2 files changed, 40 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index c9b1df1ed109..1875bffd82b7 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2044,12 +2044,16 @@ static void mana_rx_skb(void *buf_va, bool from_pool,
}
static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev,
- dma_addr_t *da, bool *from_pool)
+ dma_addr_t *da, bool *from_pool,
+ struct page **pp_page, u32 *dma_sync_offset)
{
struct page *page;
u32 offset;
void *va;
+
*from_pool = false;
+ *pp_page = NULL;
+ *dma_sync_offset = 0;
/* Don't use fragments for jumbo frames or XDP where it's 1 fragment
* per page.
@@ -2087,31 +2091,47 @@ static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev,
va = page_to_virt(page) + offset;
*da = page_pool_get_dma_addr(page) + offset + rxq->headroom;
*from_pool = true;
+ *pp_page = page;
+ *dma_sync_offset = offset + rxq->headroom;
return va;
}
/* Allocate frag for rx buffer, and save the old buf */
static void mana_refill_rx_oob(struct device *dev, struct mana_rxq *rxq,
- struct mana_recv_buf_oob *rxoob, void **old_buf,
- bool *old_fp)
+ struct mana_recv_buf_oob *rxoob, u32 pktlen,
+ void **old_buf, bool *old_fp)
{
+ struct page *pp_page;
+ u32 dma_sync_offset;
bool from_pool;
dma_addr_t da;
void *va;
- va = mana_get_rxfrag(rxq, dev, &da, &from_pool);
+ va = mana_get_rxfrag(rxq, dev, &da, &from_pool, &pp_page,
+ &dma_sync_offset);
if (!va)
return;
- if (!rxoob->from_pool || rxq->frag_count == 1)
+ if (!rxoob->from_pool || rxq->frag_count == 1) {
dma_unmap_single(dev, rxoob->sgl[0].address, rxq->datasize,
DMA_FROM_DEVICE);
+ } else {
+ /* The page pool maps the whole page and only syncs for device
+ * automatically (PP_FLAG_DMA_SYNC_DEV). Sync the received bytes
+ * for the CPU before they are read: this is required if DMA
+ * is incoherent or bounce buffers are used.
+ */
+ page_pool_dma_sync_for_cpu(rxq->page_pool, rxoob->pp_page,
+ rxoob->dma_sync_offset, pktlen);
+ }
*old_buf = rxoob->buf_va;
*old_fp = rxoob->from_pool;
rxoob->buf_va = va;
rxoob->sgl[0].address = da;
rxoob->from_pool = from_pool;
+ rxoob->pp_page = pp_page;
+ rxoob->dma_sync_offset = dma_sync_offset;
}
static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
@@ -2170,7 +2190,7 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
rxbuf_oob = &rxq->rx_oobs[curr];
WARN_ON_ONCE(rxbuf_oob->wqe_inf.wqe_size_in_bu != 1);
- mana_refill_rx_oob(dev, rxq, rxbuf_oob, &old_buf, &old_fp);
+ mana_refill_rx_oob(dev, rxq, rxbuf_oob, pktlen, &old_buf, &old_fp);
/* Unsuccessful refill will have old_buf == NULL.
* In this case, mana_rx_skb() will drop the packet.
@@ -2566,6 +2586,8 @@ static int mana_fill_rx_oob(struct mana_recv_buf_oob *rx_oob, u32 mem_key,
struct mana_rxq *rxq, struct device *dev)
{
struct mana_port_context *mpc = netdev_priv(rxq->ndev);
+ struct page *pp_page = NULL;
+ u32 dma_sync_offset = 0;
bool from_pool = false;
dma_addr_t da;
void *va;
@@ -2573,13 +2595,16 @@ static int mana_fill_rx_oob(struct mana_recv_buf_oob *rx_oob, u32 mem_key,
if (mpc->rxbufs_pre)
va = mana_get_rxbuf_pre(rxq, &da);
else
- va = mana_get_rxfrag(rxq, dev, &da, &from_pool);
+ va = mana_get_rxfrag(rxq, dev, &da, &from_pool, &pp_page,
+ &dma_sync_offset);
if (!va)
return -ENOMEM;
rx_oob->buf_va = va;
rx_oob->from_pool = from_pool;
+ rx_oob->pp_page = pp_page;
+ rx_oob->dma_sync_offset = dma_sync_offset;
rx_oob->sgl[0].address = da;
rx_oob->sgl[0].size = rxq->datasize;
diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
index 8f721cd4e4a7..4111b93169d2 100644
--- a/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -305,6 +305,14 @@ struct mana_recv_buf_oob {
void *buf_va;
bool from_pool; /* allocated from a page pool */
+ /* head page of the page_pool fragment; valid only when
+ * from_pool && frag_count > 1.
+ */
+ struct page *pp_page;
+ /* Fragment offset plus rxq->headroom, passed to
+ * page_pool_dma_sync_for_cpu().
+ */
+ u32 dma_sync_offset;
/* SGL of the buffer going to be sent as part of the work request. */
u32 num_sge;
--
2.34.1
^ permalink raw reply related
* Re: [PATCH net] net: udp_tunnel: fix use-after-free by refcounting udp_tunnel_nic
From: Jakub Kicinski @ 2026-06-24 22:31 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Simon Horman, Ido Schimmel,
David Ahern, netdev, eric.dumazet, Yue Sun
In-Reply-To: <20260624145722.083632b6@kernel.org>
On Wed, 24 Jun 2026 14:57:22 -0700 Jakub Kicinski wrote:
> so we just need:
>
> diff --git a/net/ipv4/udp_tunnel_nic.c b/net/ipv4/udp_tunnel_nic.c
> index 9944ed923ddf..d7db89a222f8 100644
> --- a/net/ipv4/udp_tunnel_nic.c
> +++ b/net/ipv4/udp_tunnel_nic.c
> @@ -863,6 +863,7 @@ static void
> udp_tunnel_nic_unregister(struct net_device *dev, struct udp_tunnel_nic *utn)
> {
> const struct udp_tunnel_nic_info *info = dev->udp_tunnel_nic_info;
> + bool pending;
>
> udp_tunnel_nic_lock(dev);
>
> @@ -899,12 +900,14 @@ udp_tunnel_nic_unregister(struct net_device *dev, struct udp_tunnel_nic *utn)
> * from the work which we will boot immediately.
> */
> udp_tunnel_nic_flush(dev, utn);
> +
> + pending = utn->work_pending;
> udp_tunnel_nic_unlock(dev);
>
> /* Wait for the work to be done using the state, netdev core will
> * retry unregister until we give up our reference on this device.
> */
> - if (utn->work_pending)
> + if (pending)
> return;
>
> udp_tunnel_nic_free(utn);
Maybe not even.. by definition the driver should not race with its own
netdev's unregister? I don't get what the race is..
^ permalink raw reply
* [PATCH net v2 1/1] net/sched: sch_teql: Introduce slaves_lock to avoid race condition and UAF
From: Jamal Hadi Salim @ 2026-06-24 22:40 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, horms, jiri, victor, security,
zdi-disclosures, stable, Jamal Hadi Salim, kernel test robot
The teql master->slaves singly linked list is not protected against
multiple writes. It can be mod'ed concurently from teql_master_xmit(),
teql_dequeue(), teql_init() and teql_destroy() without holding any list
lock or RCU protection.
zdi-disclosures@trendmicro.com has demonstrated that the qdisc is freed
after an RCU grace period, but teql_master_xmit() running on another
CPU can still hold a stale pointer into the list, resulting in a
slab-use-after-free:
BUG: KASAN: slab-use-after-free in teql_destroy+0x3ca/0x440 linux/net/sched/sch_teql.c:142
Read of size 8 at addr ffff88802923aa80 by task ip/10024
The zdi-disclosures@trendmicro.com repro created concurrent AF_PACKET
senders on a teql device against a thread that repeatedly adds/deletes the
slave qdisc, together with a SLUB spray that reclaims the freed slot; the
resulting UAF is controllable enough to be turned into a read/write
primitive against the freed qdisc object.
The fix?
Add a per-master slaves_lock spinlock that serializes all mutations of
master->slaves and the NEXT_SLAVE() links in teql_destroy() and
teql_qdisc_init(). teql_master_xmit() also takes the same slaves_lock
around those updates.
Annotate master->slaves and the per-slave ->next pointer with __rcu and
use the appropriate RCU accessors everywhere they are touched:
rcu_assign_pointer() on the writer side (under slaves_lock),
rcu_dereference_protected() for the writer-side loads (also under
slaves_lock), rcu_dereference_bh() for the loads in teql_master_xmit() and
rtnl_dereference() for the loads in teql_master_open()/teql_master_mtu(),
which run under RTNL.
Pair this with rcu_read_lock_bh()/rcu_read_unlock_bh() around the list
traversal in teql_master_xmit(), so that readers either observe a fully
linked list or are deferred until the in-flight mutation completes. The two
early-return paths in teql_master_xmit() are updated to release the RCU-bh
read-side critical section before returning, since leaving it held would
disable BH on that CPU for good.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: zdi-disclosures@trendmicro.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202606241501.XQBMu4b8-lkp@intel.com/
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
Changes in v2:
Address sparse issues found by kernel test robot <lkp@intel.com>
- rcu annotated struct teql_master->slaves and struct teql_sched_data->next
- teql_destroy/teql_qdisc_init: replace all READ/WRITE_ONCE() with rcu_dereference_protected()/rcu_assign_pointer()
---
net/sched/sch_teql.c | 119 ++++++++++++++++++++++++++++++++++-----------------
1 file changed, 80 insertions(+), 39 deletions(-)
diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c
index e7bbc9e5174d..f4654f1b1200 100644
--- a/net/sched/sch_teql.c
+++ b/net/sched/sch_teql.c
@@ -52,7 +52,8 @@
struct teql_master {
struct Qdisc_ops qops;
struct net_device *dev;
- struct Qdisc *slaves;
+ struct Qdisc __rcu *slaves;
+ spinlock_t slaves_lock; /* serializes writes to ->slaves */
struct list_head master_list;
unsigned long tx_bytes;
unsigned long tx_packets;
@@ -61,7 +62,7 @@ struct teql_master {
};
struct teql_sched_data {
- struct Qdisc *next;
+ struct Qdisc __rcu *next;
struct teql_master *m;
struct sk_buff_head q;
};
@@ -101,7 +102,9 @@ teql_dequeue(struct Qdisc *sch)
if (skb == NULL) {
struct net_device *m = qdisc_dev(q);
if (m) {
- dat->m->slaves = sch;
+ spin_lock_bh(&dat->m->slaves_lock);
+ rcu_assign_pointer(dat->m->slaves, sch);
+ spin_unlock_bh(&dat->m->slaves_lock);
netif_wake_queue(m);
}
} else {
@@ -132,34 +135,49 @@ teql_destroy(struct Qdisc *sch)
struct Qdisc *q, *prev;
struct teql_sched_data *dat = qdisc_priv(sch);
struct teql_master *master = dat->m;
+ struct netdev_queue *txq = NULL;
+ bool reset_master_queue = false;
if (!master)
return;
- prev = master->slaves;
+ spin_lock_bh(&master->slaves_lock);
+ prev = rcu_dereference_protected(master->slaves,
+ lockdep_is_held(&master->slaves_lock));
if (prev) {
do {
- q = NEXT_SLAVE(prev);
- if (q == sch) {
- NEXT_SLAVE(prev) = NEXT_SLAVE(q);
- if (q == master->slaves) {
- master->slaves = NEXT_SLAVE(q);
- if (q == master->slaves) {
- struct netdev_queue *txq;
-
- txq = netdev_get_tx_queue(master->dev, 0);
- master->slaves = NULL;
-
- dev_reset_queue(master->dev,
- txq, NULL);
- }
- }
- skb_queue_purge(&dat->q);
- break;
+ struct Qdisc *head, *next;
+
+ q = rcu_dereference_protected(NEXT_SLAVE(prev),
+ lockdep_is_held(&master->slaves_lock));
+ if (q != sch) {
+ prev = q;
+ continue;
}
- } while ((prev = q) != master->slaves);
+ next = rcu_dereference_protected(NEXT_SLAVE(q),
+ lockdep_is_held(&master->slaves_lock));
+ rcu_assign_pointer(NEXT_SLAVE(prev), next);
+
+ head = rcu_dereference_protected(master->slaves,
+ lockdep_is_held(&master->slaves_lock));
+ if (q == head) {
+ rcu_assign_pointer(master->slaves, next);
+ if (q == next) {
+ txq = netdev_get_tx_queue(master->dev, 0);
+ rcu_assign_pointer(master->slaves, NULL);
+ reset_master_queue = true;
+ }
+ }
+ skb_queue_purge(&dat->q);
+ break;
+ } while (prev != rcu_dereference_protected(master->slaves,
+ lockdep_is_held(&master->slaves_lock)));
}
+ spin_unlock_bh(&master->slaves_lock);
+
+ if (reset_master_queue)
+ dev_reset_queue(master->dev, txq, NULL);
}
static int teql_qdisc_init(struct Qdisc *sch, struct nlattr *opt,
@@ -168,6 +186,7 @@ static int teql_qdisc_init(struct Qdisc *sch, struct nlattr *opt,
struct net_device *dev = qdisc_dev(sch);
struct teql_master *m = (struct teql_master *)sch->ops;
struct teql_sched_data *q = qdisc_priv(sch);
+ struct Qdisc *first;
if (dev->hard_header_len > m->dev->hard_header_len)
return -EINVAL;
@@ -184,7 +203,9 @@ static int teql_qdisc_init(struct Qdisc *sch, struct nlattr *opt,
skb_queue_head_init(&q->q);
- if (m->slaves) {
+ spin_lock_bh(&m->slaves_lock);
+ first = rcu_dereference_protected(m->slaves, lockdep_is_held(&m->slaves_lock));
+ if (first) {
if (m->dev->flags & IFF_UP) {
if ((m->dev->flags & IFF_POINTOPOINT &&
!(dev->flags & IFF_POINTOPOINT)) ||
@@ -192,8 +213,10 @@ static int teql_qdisc_init(struct Qdisc *sch, struct nlattr *opt,
!(dev->flags & IFF_BROADCAST)) ||
(m->dev->flags & IFF_MULTICAST &&
!(dev->flags & IFF_MULTICAST)) ||
- dev->mtu < m->dev->mtu)
+ dev->mtu < m->dev->mtu) {
+ spin_unlock_bh(&m->slaves_lock);
return -EINVAL;
+ }
} else {
if (!(dev->flags&IFF_POINTOPOINT))
m->dev->flags &= ~IFF_POINTOPOINT;
@@ -204,14 +227,17 @@ static int teql_qdisc_init(struct Qdisc *sch, struct nlattr *opt,
if (dev->mtu < m->dev->mtu)
m->dev->mtu = dev->mtu;
}
- q->next = NEXT_SLAVE(m->slaves);
- NEXT_SLAVE(m->slaves) = sch;
+ rcu_assign_pointer(q->next,
+ rcu_dereference_protected(NEXT_SLAVE(first),
+ lockdep_is_held(&m->slaves_lock)));
+ rcu_assign_pointer(NEXT_SLAVE(first), sch);
} else {
- q->next = sch;
- m->slaves = sch;
+ rcu_assign_pointer(q->next, sch);
+ rcu_assign_pointer(m->slaves, sch);
m->dev->mtu = dev->mtu;
m->dev->flags = (m->dev->flags&~FMASK)|(dev->flags&FMASK);
}
+ spin_unlock_bh(&m->slaves_lock);
return 0;
}
@@ -285,7 +311,9 @@ static netdev_tx_t teql_master_xmit(struct sk_buff *skb, struct net_device *dev)
int subq = skb_get_queue_mapping(skb);
struct sk_buff *skb_res = NULL;
- start = master->slaves;
+ rcu_read_lock_bh();
+
+ start = rcu_dereference_bh(master->slaves);
restart:
nores = 0;
@@ -317,10 +345,14 @@ static netdev_tx_t teql_master_xmit(struct sk_buff *skb, struct net_device *dev)
netdev_start_xmit(skb, slave, slave_txq, false) ==
NETDEV_TX_OK) {
__netif_tx_unlock(slave_txq);
- master->slaves = NEXT_SLAVE(q);
+ spin_lock_bh(&master->slaves_lock);
+ rcu_assign_pointer(master->slaves,
+ rcu_dereference_bh(NEXT_SLAVE(q)));
+ spin_unlock_bh(&master->slaves_lock);
netif_wake_queue(dev);
master->tx_packets++;
master->tx_bytes += length;
+ rcu_read_unlock_bh();
return NETDEV_TX_OK;
}
__netif_tx_unlock(slave_txq);
@@ -329,14 +361,18 @@ static netdev_tx_t teql_master_xmit(struct sk_buff *skb, struct net_device *dev)
busy = 1;
break;
case 1:
- master->slaves = NEXT_SLAVE(q);
+ spin_lock_bh(&master->slaves_lock);
+ rcu_assign_pointer(master->slaves,
+ rcu_dereference_bh(NEXT_SLAVE(q)));
+ spin_unlock_bh(&master->slaves_lock);
+ rcu_read_unlock_bh();
return NETDEV_TX_OK;
default:
nores = 1;
break;
}
__skb_pull(skb, skb_network_offset(skb));
- } while ((q = NEXT_SLAVE(q)) != start);
+ } while ((q = rcu_dereference_bh(NEXT_SLAVE(q))) != start);
if (nores && skb_res == NULL) {
skb_res = skb;
@@ -345,29 +381,32 @@ static netdev_tx_t teql_master_xmit(struct sk_buff *skb, struct net_device *dev)
if (busy) {
netif_stop_queue(dev);
+ rcu_read_unlock_bh();
return NETDEV_TX_BUSY;
}
master->tx_errors++;
drop:
master->tx_dropped++;
+ rcu_read_unlock_bh();
dev_kfree_skb(skb);
return NETDEV_TX_OK;
}
static int teql_master_open(struct net_device *dev)
{
- struct Qdisc *q;
+ struct Qdisc *q, *first;
struct teql_master *m = netdev_priv(dev);
int mtu = 0xFFFE;
unsigned int flags = IFF_NOARP | IFF_MULTICAST;
- if (m->slaves == NULL)
+ first = rtnl_dereference(m->slaves);
+ if (!first)
return -EUNATCH;
flags = FMASK;
- q = m->slaves;
+ q = first;
do {
struct net_device *slave = qdisc_dev(q);
@@ -389,7 +428,7 @@ static int teql_master_open(struct net_device *dev)
flags &= ~IFF_BROADCAST;
if (!(slave->flags&IFF_MULTICAST))
flags &= ~IFF_MULTICAST;
- } while ((q = NEXT_SLAVE(q)) != m->slaves);
+ } while ((q = rtnl_dereference(NEXT_SLAVE(q))) != first);
m->dev->mtu = mtu;
m->dev->flags = (m->dev->flags&~FMASK) | flags;
@@ -417,14 +456,15 @@ static void teql_master_stats64(struct net_device *dev,
static int teql_master_mtu(struct net_device *dev, int new_mtu)
{
struct teql_master *m = netdev_priv(dev);
- struct Qdisc *q;
+ struct Qdisc *q, *first;
- q = m->slaves;
+ first = rtnl_dereference(m->slaves);
+ q = first;
if (q) {
do {
if (new_mtu > qdisc_dev(q)->mtu)
return -EINVAL;
- } while ((q = NEXT_SLAVE(q)) != m->slaves);
+ } while ((q = rtnl_dereference(NEXT_SLAVE(q))) != first);
}
WRITE_ONCE(dev->mtu, new_mtu);
@@ -444,6 +484,7 @@ static __init void teql_master_setup(struct net_device *dev)
struct teql_master *master = netdev_priv(dev);
struct Qdisc_ops *ops = &master->qops;
+ spin_lock_init(&master->slaves_lock);
master->dev = dev;
ops->priv_size = sizeof(struct teql_sched_data);
^ permalink raw reply related
* [PATCH net v2] nfc: nci: fix uninit-value in nci_core_init_rsp_packet()
From: Samuel Page @ 2026-06-24 22:44 UTC (permalink / raw)
To: David Heidelberg
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, oe-linux-nfc, netdev, linux-kernel, stable
The CORE_INIT_RSP handlers walk the response using length fields taken
from the packet itself, without checking they stay within skb->len:
- v1 computes
rsp_2 = skb->data + 6 + rsp_1->num_supported_rf_interfaces;
from the on-wire (unclamped) interface count and then dereferences
rsp_2, and memcpy()s the advertised interfaces - both can run past the
received data;
- v2 walks supported_rf_interfaces[], advancing the cursor by an
in-packet rf_extension_cnt with no bound.
A short CORE_INIT_RSP therefore makes the parser read past the packet
(into the uninitialised tail of the RX skb); the values are stored into
struct nci_dev and consumed while bringing the device up:
BUG: KMSAN: uninit-value in nci_dev_up+0x10f3/0x1720
nci_dev_up+0x10f3/0x1720
nfc_dev_up+0x187/0x380
nfc_genl_dev_up+0xdc/0x1a0
genl_rcv_msg+0x5d4/0x9e0
netlink_rcv_skb+0x28f/0x530
Uninit was stored to memory at:
nci_rsp_packet+0x68f/0x2310
nci_rx_work+0x25f/0x5d0
Uninit was created at:
__alloc_skb+0x540/0xd40
virtual_ncidev_write+0x65/0x210
Validate the response length before parsing or storing the
variable-length parts, rejecting truncated responses with
NCI_STATUS_SYNTAX_ERROR. In v1 the check is done before
num_supported_rf_interfaces is stored into ndev, so a truncated response
cannot leave ndev->num_supported_rf_interfaces holding the unclamped
on-wire count, which nci_init_complete_req() would otherwise use as a
bound for the fixed-size supported_rf_interfaces[] array.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Fixes: bcd684aace34 ("net/nfc/nci: Support NCI 2.x initial sequence")
Cc: stable@vger.kernel.org
Tested-by: syzbot@syzkaller.appspotmail.com
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
---
v2: validate the response length before storing num_supported_rf_interfaces
into @ndev. In v1 the unclamped on-wire count was stored first and the
length check returned early on a truncated response, leaving
ndev->num_supported_rf_interfaces > NCI_MAX_SUPPORTED_RF_INTERFACES; a
subsequent CORE_INIT completion then walked it in nci_init_complete_req(),
which the syzbot CI run on v1 flagged as a UBSAN array-index-out-of-bounds.
https://ci.syzbot.org/series/2a9a8657-37a3-4dce-8cb5-2035027791dd
v1: https://lore.kernel.org/all/20260623222402.175798-1-sam@bynar.io
net/nfc/nci/rsp.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/net/nfc/nci/rsp.c b/net/nfc/nci/rsp.c
index 9eeb862825c5..6b2fa6bdbd14 100644
--- a/net/nfc/nci/rsp.c
+++ b/net/nfc/nci/rsp.c
@@ -50,11 +50,25 @@ static u8 nci_core_init_rsp_packet_v1(struct nci_dev *ndev,
const struct nci_core_init_rsp_1 *rsp_1 = (void *)skb->data;
const struct nci_core_init_rsp_2 *rsp_2;
+ if (skb->len < sizeof(*rsp_1))
+ return NCI_STATUS_SYNTAX_ERROR;
+
pr_debug("status 0x%x\n", rsp_1->status);
if (rsp_1->status != NCI_STATUS_OK)
return rsp_1->status;
+ /*
+ * supported_rf_interfaces[] and the trailing nci_core_init_rsp_2 are
+ * addressed using the on-wire (unclamped) interface count, so the
+ * response must be long enough for both before any of it is parsed or
+ * stored into @ndev - otherwise a truncated response would leave
+ * ndev->num_supported_rf_interfaces holding the unclamped count.
+ */
+ if (skb->len < sizeof(*rsp_1) +
+ rsp_1->num_supported_rf_interfaces + sizeof(*rsp_2))
+ return NCI_STATUS_SYNTAX_ERROR;
+
ndev->nfcc_features = __le32_to_cpu(rsp_1->nfcc_features);
ndev->num_supported_rf_interfaces = rsp_1->num_supported_rf_interfaces;
@@ -88,9 +102,13 @@ static u8 nci_core_init_rsp_packet_v2(struct nci_dev *ndev,
{
const struct nci_core_init_rsp_nci_ver2 *rsp = (void *)skb->data;
const u8 *supported_rf_interface = rsp->supported_rf_interfaces;
+ const u8 *end = skb->data + skb->len;
u8 rf_interface_idx = 0;
u8 rf_extension_cnt = 0;
+ if (skb->len < sizeof(*rsp))
+ return NCI_STATUS_SYNTAX_ERROR;
+
pr_debug("status %x\n", rsp->status);
if (rsp->status != NCI_STATUS_OK)
@@ -104,10 +122,16 @@ static u8 nci_core_init_rsp_packet_v2(struct nci_dev *ndev,
NCI_MAX_SUPPORTED_RF_INTERFACES);
while (rf_interface_idx < ndev->num_supported_rf_interfaces) {
- ndev->supported_rf_interfaces[rf_interface_idx++] = *supported_rf_interface++;
+ /* one interface byte + one extension-count byte must be present */
+ if (end - supported_rf_interface < 2)
+ return NCI_STATUS_SYNTAX_ERROR;
+ ndev->supported_rf_interfaces[rf_interface_idx++] =
+ *supported_rf_interface++;
- /* skip rf extension parameters */
+ /* skip rf extension parameters, bounded by the packet */
rf_extension_cnt = *supported_rf_interface++;
+ if (rf_extension_cnt > end - supported_rf_interface)
+ return NCI_STATUS_SYNTAX_ERROR;
supported_rf_interface += rf_extension_cnt;
}
base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
--
2.54.0
^ permalink raw reply related
* RE: [EXTERNAL] Re: [PATCH net] net: mana: Sync page pool RX frags for CPU
From: Dexuan Cui @ 2026-06-24 22:50 UTC (permalink / raw)
To: Simon Horman
Cc: KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org, Long Li,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Konstantin Taranov,
ernis@linux.microsoft.com, dipayanroy@linux.microsoft.com,
kees@kernel.org, jacob.e.keller@intel.com,
ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, stable@vger.kernel.org
In-Reply-To: <20260619090514.GT827683@horms.kernel.org>
> From: Simon Horman <horms@kernel.org>
> Sent: Friday, June 19, 2026 2:05 AM
> > ...
> > Also validate the packet length reported in the RX CQE before using it as
> > a DMA sync length or passing it to skb processing. The CQE is supplied
> > by the device and should not be blindly trusted by Confidential VMs.
>
> I think this last part warrants being split out into a separate patch.
Sorry for the late reply. I split v1 into 2 patches of v2, which I just posted:
https://lwn.net/ml/linux-kernel/20260624222605.1794719-1-decui@microsoft.com/
Thanks,
Dexuan
^ permalink raw reply
* [PATCH net v3] sctp: add INIT verification after cookie unpacking
From: Xin Long @ 2026-06-24 22:53 UTC (permalink / raw)
To: network dev, linux-sctp
Cc: davem, kuba, Eric Dumazet, Paolo Abeni, Simon Horman,
Marcelo Ricardo Leitner
In SCTP handshake, the INIT chunk is initially processed by the server
and embedded into the cookie carried in INIT-ACK. The client then
returns this cookie via COOKIE-ECHO, where the server unpacks it and
reconstructs the original INIT chunk.
When cookie authentication is enabled, the cookie contents are protected
against tampering, so reusing the unpacked INIT without re-verification
is safe.
However, when cookie authentication is disabled, the reconstructed INIT
can no longer be trusted. In this case, the INIT must be explicitly
validated after unpacking to avoid processing potentially tampered data.
Add sctp_verify_init() checks after cookie unpacking in COOKIE-ECHO
processing paths (sctp_sf_do_5_1D_ce() and sctp_sf_do_5_2_4_dupcook())
when cookie_auth_enable is disabled. On failure, the new association is
freed and the packet is discarded.
Also tighten cookie validation in sctp_unpack_cookie() by verifying the
embedded chunk type is SCTP_CID_INIT before treating it as an INIT
chunk.
Finally, update sctp_verify_init() to validate parameter bounds using
the actual embedded INIT length instead of chunk->chunk_end, since the
INIT stored in COOKIE-ECHO may not span the entire chunk buffer.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
v2:
- Because of sctp_abort_on_init_err() param change in patch 1/2,
pass cid and not chunk.
- Use SCTP_PAD4() around ntohs(peer_init->chunk_hdr.length) when
checking param.v in sctp_verify_init() to make Sashiko happy.
v3:
- Validate the embedded INIT chunk type in sctp_unpack_cookie(), as
noted by Sashiko.
- Discard the packet if embedded INIT chunk validation fails,
consistent with malformed cookie handling.
---
net/sctp/sm_make_chunk.c | 5 ++++-
net/sctp/sm_statefuns.c | 36 +++++++++++++++++++++++++++++++++---
2 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 41958b8e59fd..8adac9e0cd66 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1761,6 +1761,8 @@ struct sctp_association *sctp_unpack_cookie(
bear_cookie = &cookie->c;
ch = (struct sctp_chunkhdr *)(bear_cookie + 1);
+ if (ch->type != SCTP_CID_INIT)
+ goto malformed;
chlen = ntohs(ch->length);
if (chlen < sizeof(struct sctp_init_chunk))
goto malformed;
@@ -2298,7 +2300,8 @@ int sctp_verify_init(struct net *net, const struct sctp_endpoint *ep,
* VIOLATION error. We build the ERROR chunk here and let the normal
* error handling code build and send the packet.
*/
- if (param.v != (void *)chunk->chunk_end)
+ if (param.v != (void *)peer_init +
+ SCTP_PAD4(ntohs(peer_init->chunk_hdr.length)))
return sctp_process_inv_paramlength(asoc, param.p, chunk, errp);
/* The only missing mandatory param possible today is
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 8e920cef0858..d23d935e128e 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -707,11 +707,12 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net,
struct sctp_cmd_seq *commands)
{
struct sctp_ulpevent *ev, *ai_ev = NULL, *auth_ev = NULL;
+ struct sctp_chunk *err_chk_p = NULL;
struct sctp_association *new_asoc;
struct sctp_init_chunk *peer_init;
struct sctp_chunk *chunk = arg;
- struct sctp_chunk *err_chk_p;
struct sctp_chunk *repl;
+ enum sctp_cid cid;
struct sock *sk;
int error = 0;
@@ -785,6 +786,19 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net,
}
}
+ peer_init = (struct sctp_init_chunk *)(chunk->subh.cookie_hdr + 1);
+ cid = peer_init->chunk_hdr.type;
+ if (!sctp_sk(sk)->cookie_auth_enable &&
+ !sctp_verify_init(net, ep, asoc, cid, peer_init, chunk,
+ &err_chk_p)) {
+ sctp_association_free(new_asoc);
+ if (err_chk_p)
+ sctp_chunk_free(err_chk_p);
+ return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
+ }
+ if (err_chk_p)
+ sctp_chunk_free(err_chk_p);
+
if (security_sctp_assoc_request(new_asoc, chunk->head_skb ?: chunk->skb)) {
sctp_association_free(new_asoc);
return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
@@ -798,7 +812,6 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net,
/* This is a brand-new association, so these are not yet side
* effects--it is safe to run them here.
*/
- peer_init = (struct sctp_init_chunk *)(chunk->subh.cookie_hdr + 1);
if (!sctp_process_init(new_asoc, chunk,
&chunk->subh.cookie_hdr->c.peer_addr,
peer_init, GFP_ATOMIC))
@@ -2215,10 +2228,12 @@ enum sctp_disposition sctp_sf_do_5_2_4_dupcook(
void *arg,
struct sctp_cmd_seq *commands)
{
+ struct sctp_chunk *err_chk_p = NULL;
struct sctp_association *new_asoc;
+ struct sctp_init_chunk *peer_init;
struct sctp_chunk *chunk = arg;
enum sctp_disposition retval;
- struct sctp_chunk *err_chk_p;
+ enum sctp_cid cid;
int error = 0;
char action;
@@ -2287,6 +2302,21 @@ enum sctp_disposition sctp_sf_do_5_2_4_dupcook(
switch (action) {
case 'A': /* Association restart. */
case 'B': /* Collision case B. */
+ peer_init = (struct sctp_init_chunk *)
+ (chunk->subh.cookie_hdr + 1);
+ cid = peer_init->chunk_hdr.type;
+ if (!sctp_sk(ep->base.sk)->cookie_auth_enable &&
+ !sctp_verify_init(net, ep, asoc, cid, peer_init, chunk,
+ &err_chk_p)) {
+ sctp_association_free(new_asoc);
+ if (err_chk_p)
+ sctp_chunk_free(err_chk_p);
+ return sctp_sf_pdiscard(net, ep, asoc, type, arg,
+ commands);
+ }
+ if (err_chk_p)
+ sctp_chunk_free(err_chk_p);
+ fallthrough;
case 'D': /* Collision case D. */
/* Update socket peer label if first association. */
if (security_sctp_assoc_request((struct sctp_association *)asoc,
--
2.47.1
^ permalink raw reply related
* Re: [External Mail] [PATCH v2 1/7] net: wwan: t9xx: Add PCIe core
From: Jakub Kicinski @ 2026-06-24 23:35 UTC (permalink / raw)
To: Wu. JackBB (GSM)
Cc: Loic Poulain, Sergey Ryazanov, Johannes Berg, Andrew Lunn,
David S. Miller, Eric Dumazet, Paolo Abeni, Wen-Zhi Huang,
Shi-Wei Yeh, Minano Tseng, Matthias Brugger,
AngeloGioacchino Del Regno, Simon Horman, Jonathan Corbet,
Shuah Khan, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-mediatek@lists.infradead.org, linux-doc@vger.kernel.org
In-Reply-To: <b02c0e1e9f0449f2b819197e4329373b@compal.com>
On Wed, 24 Jun 2026 09:15:17 +0000 Wu. JackBB (GSM) wrote:
> ================================================================================================================================================================
> This message may contain information which is private, privileged or confidential of Compal Electronics, Inc. If you are not the intended recipient of this message, please notify the sender and destroy/delete the message. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information, by persons or entities other than the intended recipient is prohibited.
> ================================================================================================================================================================
If you want to do anything upstream you have to get rid of this first.
^ permalink raw reply
* Re: [PATCH v3 0/7] net: wwan: t9xx: Add MediaTek T9XX WWAN driver
From: Jakub Kicinski @ 2026-06-25 0:09 UTC (permalink / raw)
To: Jack Wu via B4 Relay
Cc: jackbb_wu, Loic Poulain, Sergey Ryazanov, Johannes Berg,
Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
Wen-Zhi Huang, Shi-Wei Yeh, Minano Tseng, Matthias Brugger,
AngeloGioacchino Del Regno, Simon Horman, Jonathan Corbet,
Shuah Khan, linux-kernel, netdev, linux-arm-kernel,
linux-mediatek, linux-doc
In-Reply-To: <20260624-t9xx_driver_v1-v3-0-73ff03f60c48@compal.com>
On Wed, 24 Jun 2026 18:04:06 +0800 Jack Wu via B4 Relay wrote:
> T9XX is the PCIe host device driver for MediaTek's
> t900 modem. The driver uses the WWAN framework
> infrastructure to create the following control ports
> and network interfaces for data transactions.
Replying after a long delay and then immediately posting a new version
of patches is very bad. Don't bother replying and just put the comments
you had in the changelog of the new posting. Otherwise the discussion
may get split.
^ permalink raw reply
* Re:Re: [PATCH] octeontx2-af: Free BPID bitmap on setup failure
From: haoxiang_li2024 @ 2026-06-25 0:34 UTC (permalink / raw)
To: Simon Horman
Cc: sgoutham, lcherian, gakula, hkelam, sbhatta, andrew+netdev, davem,
edumazet, kuba, pabeni, netdev, linux-kernel, stable
In-Reply-To: <20260624170930.GB1131256@horms.kernel.org>
At 2026-06-25 01:09:30, "Simon Horman" <horms@kernel.org> wrote:
>On Tue, Jun 23, 2026 at 07:43:16PM +0800, Haoxiang Li wrote:
>> nix_setup_bpids() allocates bp->bpids with rvu_alloc_bitmap(), which uses
>> a plain kcalloc(). If any of the following devm_kcalloc() allocations for
>> the BPID mapping arrays fails, the function returns without freeing the
>> bitmap. Free the BPID bitmap before returning from those error paths.
>>
>> Fixes: d6212d2e41a0 ("octeontx2-af: Create BPIDs free pool")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
>
>Reviewed-by: Simon Horman <horms@kernel.org>
>
>I am wondering if you did a pass for any other similar problems
>with users of rvu_alloc_bitmap.
Thanks for your review! Yes, I did. I found similar issues in
nix_setup_ipolicers() and rvu_setup_msix_resources(), and
I will address them in follow-up patches.
Thanks,
Haoxiang
^ permalink raw reply
* [PATCH net v3] fsl/fman: Free init resources on KeyGen failure in fman_init()
From: Haoxiang Li @ 2026-06-25 0:48 UTC (permalink / raw)
To: madalin.bucur, sean.anderson, andrew+netdev, davem, edumazet,
kuba, pabeni, florinel.iordache
Cc: netdev, linux-kernel, Haoxiang Li, stable, Pavan Chebbi,
Breno Leitao
fman_muram_alloc() allocates initialization resources before
initializing the KeyGen block. If keygen_init() fails, the
function returns -EINVAL directly and leaves those resources
allocated. Free the initialization resources before returning
from the KeyGen failure path.
Fixes: 7472f4f281d0 ("fsl/fman: enable FMan Keygen")
Cc: stable@kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Add "net" to patch title. Thanks, Pavan!
Changes in v3:
- Drop the unrelated enable() cleanup from this fix. Thanks, Breno!
---
drivers/net/ethernet/freescale/fman/fman.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c
index 013273a2de32..299bab043175 100644
--- a/drivers/net/ethernet/freescale/fman/fman.c
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -1995,8 +1995,10 @@ static int fman_init(struct fman *fman)
/* Init KeyGen */
fman->keygen = keygen_init(fman->kg_regs);
- if (!fman->keygen)
+ if (!fman->keygen) {
+ free_init_resources(fman);
return -EINVAL;
+ }
err = enable(fman, cfg);
if (err != 0)
--
2.25.1
^ permalink raw reply related
* Re: [PATCH net 1/2] net: dsa: mxl862xx: avoid unaligned 16-bit access in api_wrap
From: Jakub Kicinski @ 2026-06-25 0:52 UTC (permalink / raw)
To: David Laight
Cc: Daniel Golle, Andrew Lunn, Vladimir Oltean, David S. Miller,
Eric Dumazet, Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260619100154.794168e5@pumpkin>
On Fri, 19 Jun 2026 10:01:54 +0100 David Laight wrote:
> > The MXL862XX_API_* macros pass the address of a stack-allocated, __packed
> > firmware-ABI struct to mxl862xx_api_wrap() as a void *. The struct has an
> > alignment of 1, so the compiler is free to place it at an odd address.
> >
> > mxl862xx_api_wrap() reinterprets that buffer as a __le16 * and accesses it
> > with data[i], for which the compiler assumes the natural 2-byte alignment
> > of __le16 and emits aligned 16-bit loads/stores (e.g. lhu/sh on MIPS).
> > When the buffer lands on an odd address these fault on architectures that
> > do not support unaligned access, such as MIPS32.
>
> Isn't the correct fix to not pack the structure?
> (or probably any of the associated structures??)
Agreed, this is very silly:
struct mxl862xx_register_mod {
__le16 addr;
__le16 data;
__le16 mask;
} __packed;
But some structs won't get aligned:
struct mxl862xx_mac_table_clear {
u8 type;
u8 port_id;
} __packed;
So I guess the "just don't pack" will have some corner cases, too.
^ permalink raw reply
* [PATCH net-next] fsl/fman: make enable() return void
From: Haoxiang Li @ 2026-06-25 0:56 UTC (permalink / raw)
To: madalin.bucur, sean.anderson, andrew+netdev, davem, edumazet,
kuba, pabeni
Cc: netdev, linux-kernel, Haoxiang Li
The enable() helper always returns 0 and has no failure path.
Make it return void and update the only caller accordingly.
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
---
drivers/net/ethernet/freescale/fman/fman.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fman/fman.c b/drivers/net/ethernet/freescale/fman/fman.c
index 013273a2de32..aa35373413d4 100644
--- a/drivers/net/ethernet/freescale/fman/fman.c
+++ b/drivers/net/ethernet/freescale/fman/fman.c
@@ -924,7 +924,7 @@ static void hwp_init(struct fman_hwp_regs __iomem *hwp_rg)
iowrite32be(HWP_RPIMAC_PEN, &hwp_rg->fmprrpimac);
}
-static int enable(struct fman *fman, struct fman_cfg *cfg)
+static void enable(struct fman *fman, struct fman_cfg *cfg)
{
u32 cfg_reg = 0;
@@ -941,8 +941,6 @@ static int enable(struct fman *fman, struct fman_cfg *cfg)
iowrite32be(BMI_INIT_START, &fman->bmi_regs->fmbm_init);
iowrite32be(cfg_reg | QMI_CFG_ENQ_EN | QMI_CFG_DEQ_EN,
&fman->qmi_regs->fmqm_gc);
-
- return 0;
}
static int set_exception(struct fman *fman,
@@ -1998,9 +1996,7 @@ static int fman_init(struct fman *fman)
if (!fman->keygen)
return -EINVAL;
- err = enable(fman, cfg);
- if (err != 0)
- return err;
+ enable(fman, cfg);
enable_time_stamp(fman);
--
2.25.1
^ permalink raw reply related
* [PATCH nf] netfilter: ipset: fix race between dump and ip_set_list resize
From: Xiang Mei @ 2026-06-25 1:00 UTC (permalink / raw)
To: Florian Westphal, Pablo Neira Ayuso, Jozsef Kadlecsik,
Phil Sutter, netfilter-devel
Cc: kees, horms, Weiming Shi, coreteam, netdev, linux-kernel,
Xiang Mei
The release path of ip_set_dump_do() and ip_set_dump_done() read
inst->ip_set_list via ip_set_ref_netlink(), a plain rcu_dereference_raw()
of the array pointer. These run from netlink_recvmsg() without the nfnl
mutex and without an RCU read-side critical section.
A concurrent ip_set_create() can grow the array: it publishes the new
array, calls synchronize_net() and then kvfree()s the old one. Since the
dump paths read the array outside any RCU reader, synchronize_net() does
not wait for them and the old array can be freed while they still index
into it, causing a use-after-free.
The dumped set itself stays pinned via set->ref_netlink, so only the
array load needs protecting. Take rcu_read_lock() around it, matching
ip_set_get_byname() and __ip_set_put_byindex().
BUG: KASAN: slab-use-after-free in ip_set_dump_do (net/netfilter/ipset/ip_set_core.c:1697)
Read of size 8 at addr ffff88800b5c4018 by task exploit/150
Call Trace:
...
kasan_report (mm/kasan/report.c:595)
ip_set_dump_do (net/netfilter/ipset/ip_set_core.c:1697)
netlink_dump (net/netlink/af_netlink.c:2325)
netlink_recvmsg (net/netlink/af_netlink.c:1976)
sock_recvmsg (net/socket.c:1159)
__sys_recvfrom (net/socket.c:2315)
...
Oops: general protection fault, probably for non-canonical address ... KASAN NOPTI
KASAN: maybe wild-memory-access in range [0x02d6...d0-0x02d6...d7]
RIP: 0010:ip_set_dump_do (net/netfilter/ipset/ip_set_core.c:1698)
Kernel panic - not syncing: Fatal exception
Fixes: 8a02bdd50b2e ("netfilter: ipset: Fix calling ip_set() macro at dumping")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Xiang Mei <xmei5@asu.edu>
---
net/netfilter/ipset/ip_set_core.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index a531b654b8d9..6cfad152d7d1 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1480,7 +1480,11 @@ ip_set_dump_done(struct netlink_callback *cb)
struct ip_set_net *inst =
(struct ip_set_net *)cb->args[IPSET_CB_NET];
ip_set_id_t index = (ip_set_id_t)cb->args[IPSET_CB_INDEX];
- struct ip_set *set = ip_set_ref_netlink(inst, index);
+ struct ip_set *set;
+
+ rcu_read_lock();
+ set = ip_set_ref_netlink(inst, index);
+ rcu_read_unlock();
if (set->variant->uref)
set->variant->uref(set, cb, false);
@@ -1686,7 +1690,9 @@ ip_set_dump_do(struct sk_buff *skb, struct netlink_callback *cb)
release_refcount:
/* If there was an error or set is done, release set */
if (ret || !cb->args[IPSET_CB_ARG0]) {
+ rcu_read_lock();
set = ip_set_ref_netlink(inst, index);
+ rcu_read_unlock();
if (set->variant->uref)
set->variant->uref(set, cb, false);
pr_debug("release set %s\n", set->name);
--
2.43.0
^ permalink raw reply related
* Re: [PATCH net 1/2] net: dsa: mxl862xx: avoid unaligned 16-bit access in api_wrap
From: patchwork-bot+netdevbpf @ 2026-06-25 1:08 UTC (permalink / raw)
To: Daniel Golle
Cc: andrew, olteanv, davem, edumazet, kuba, pabeni, netdev,
linux-kernel
In-Reply-To: <599327521db465a534d277de53ab9b6cac01928b.1781702256.git.daniel@makrotopia.org>
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 19 Jun 2026 04:39:25 +0100 you wrote:
> The MXL862XX_API_* macros pass the address of a stack-allocated, __packed
> firmware-ABI struct to mxl862xx_api_wrap() as a void *. The struct has an
> alignment of 1, so the compiler is free to place it at an odd address.
>
> mxl862xx_api_wrap() reinterprets that buffer as a __le16 * and accesses it
> with data[i], for which the compiler assumes the natural 2-byte alignment
> of __le16 and emits aligned 16-bit loads/stores (e.g. lhu/sh on MIPS).
> When the buffer lands on an odd address these fault on architectures that
> do not support unaligned access, such as MIPS32.
>
> [...]
Here is the summary with links:
- [net,1/2] net: dsa: mxl862xx: avoid unaligned 16-bit access in api_wrap
https://git.kernel.org/netdev/net/c/6b3f7af57881
- [net,2/2] net: dsa: mxl862xx: fix use-after-free of DSA ports in crc_err_work
https://git.kernel.org/netdev/net/c/bcb3b8314611
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH net] net: mana: Fall back to standard MTU when PF reports adapter_mtu of 0
From: patchwork-bot+netdevbpf @ 2026-06-25 1:08 UTC (permalink / raw)
To: Erni Sri Satya Vennela
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, dipayanroy, ssengar, jacob.e.keller,
horms, gargaditya, kees, linux-hyperv, netdev, linux-kernel, bpf
In-Reply-To: <20260619055348.467224-1-ernis@linux.microsoft.com>
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 18 Jun 2026 22:53:38 -0700 you wrote:
> Commit d7709812e13d ("net: mana: hardening: Validate adapter_mtu from
> MANA_QUERY_DEV_CONFIG") rejected any adapter_mtu value smaller than
> ETH_MIN_MTU + ETH_HLEN, including 0, returning -EPROTO and failing
> mana_probe().
>
> Some older PF firmware versions still in the field report
> adapter_mtu as 0 in the MANA_QUERY_DEV_CONFIG response. With the
> hardening check in place, the MANA VF driver now fails to load on
> those hosts, breaking networking entirely for guests.
>
> [...]
Here is the summary with links:
- [net] net: mana: Fall back to standard MTU when PF reports adapter_mtu of 0
https://git.kernel.org/netdev/net/c/6bd81a5b4e0d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH net v2 0/2] airoha: fixes for sched HTB offload support
From: patchwork-bot+netdevbpf @ 2026-06-25 1:08 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms, win847,
linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260619-airoha-qos-fixes-v2-0-5c43485038f9@kernel.org>
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 19 Jun 2026 13:37:12 +0200 you wrote:
> ---
> Changes in v2:
> - cosmetics
> - Link to v1: https://lore.kernel.org/r/20260618-airoha-qos-fixes-v1-0-37192652157f@kernel.org
>
> ---
> Lorenzo Bianconi (2):
> net: airoha: Fix off-by-one in airoha_tc_remove_htb_queue()
> net: airoha: fix netif_set_real_num_tx_queues for sparse QoS channels
>
> [...]
Here is the summary with links:
- [net,v2,1/2] net: airoha: Fix off-by-one in airoha_tc_remove_htb_queue()
https://git.kernel.org/netdev/net/c/bfcce49c4aaa
- [net,v2,2/2] net: airoha: fix netif_set_real_num_tx_queues for sparse QoS channels
https://git.kernel.org/netdev/net/c/788663dd28e4
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH net] net: dsa: realtek: fix memory leak in rtl8366rb_setup_led()
From: patchwork-bot+netdevbpf @ 2026-06-25 1:09 UTC (permalink / raw)
To: David Yang
Cc: netdev, linusw, alsi, andrew, olteanv, davem, edumazet, kuba,
pabeni, luizluca, linux-kernel
In-Reply-To: <20260618140200.1888707-1-mmyangfl@gmail.com>
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Thu, 18 Jun 2026 22:01:55 +0800 you wrote:
> led_classdev_register_ext() only reads init_data.devicename - it never
> stores the pointer. However, the caller allocated devicename with
> kasprintf() but never freed it, leaking the string memory.
>
> Fix it with a stack buffer to avoid dynamic buffers completely.
>
> Fixes: 32d617005475 ("net: dsa: realtek: add LED drivers for rtl8366rb")
> Signed-off-by: David Yang <mmyangfl@gmail.com>
>
> [...]
Here is the summary with links:
- [net] net: dsa: realtek: fix memory leak in rtl8366rb_setup_led()
https://git.kernel.org/netdev/net/c/056a5087d87e
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH] net: ixp4xx_hss: fix duplicate HDLC netdev allocation
From: patchwork-bot+netdevbpf @ 2026-06-25 1:09 UTC (permalink / raw)
To: Haoxiang Li
Cc: linusw, kaloz, andrew+netdev, davem, edumazet, kuba, pabeni,
huangguangbin2, lipeng321, linux-arm-kernel, netdev, linux-kernel,
stable
In-Reply-To: <20260622043015.643637-1-haoxiang_li2024@163.com>
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 22 Jun 2026 12:30:15 +0800 you wrote:
> ixp4xx_hss_probe() allocates two HDLC netdevs. The first one is stored
> in ndev, initialized, and registered with register_hdlc_device(). The
> second one is stored in port->netdev and later used by the remove path
> for unregister_hdlc_device() and free_netdev().
>
> This means that the registered netdev is not the same object that is
> unregistered and freed on remove. It also leaks the first allocation if
> the second alloc_hdlcdev() call fails, and the first allocation is not
> checked before ndev is used.
>
> [...]
Here is the summary with links:
- net: ixp4xx_hss: fix duplicate HDLC netdev allocation
https://git.kernel.org/netdev/net/c/db818b0e8af7
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* [PATCH] ipv6: fib6: fix NULL deref in fib6_walk_continue() on multi-batch dump
From: Pengfei Zhang @ 2026-06-25 1:23 UTC (permalink / raw)
To: dsahern, idosch
Cc: davem, edumazet, kuba, pabeni, horms, netdev, linux-kernel,
chenzhangqi, baohua, Pengfei Zhang, Pengfei Zhang
In-Reply-To: <20260624171156.822055-1-zhangfeionline@gmail.com>
From: Pengfei Zhang <zhangpengfei16@xiaomi.com>
inet6_dump_fib() saves its progress in cb->args[1] as a positional
index within the current hash chain. Between batches the RTNL lock
is released, so a concurrent fib6_new_table() can insert a new table
at the chain head, shifting all existing entries. The saved index
then lands on a different table, causing fib6_dump_table() to set
w->root to the wrong table while w->node still points into the
previous one. fib6_walk_continue() dereferences w->node->parent
(NULL) and panics:
BUG: kernel NULL pointer dereference, address: 0000000000000008
RIP: 0010:fib6_walk_continue+0x6e/0x170
Call Trace:
<TASK>
fib6_dump_table.isra.0+0xc5/0x240
inet6_dump_fib+0xf6/0x420
rtnl_dumpit+0x30/0xa0
netlink_dump+0x15b/0x460
netlink_recvmsg+0x1d6/0x2a0
____sys_recvmsg+0x17a/0x190
Fix by storing tb->tb6_id in cb->args[1] instead of a positional
index. On resume, skip entries until the id matches; a concurrent
head-insert can never match the saved id, so the walker always
resumes on the correct table.
Fixes: 1b43af5480c3 ("[IPV6]: Increase number of possible routing tables to 2^32")
Signed-off-by: Pengfei Zhang <zhangfeionline@gmail.com>
---
net/ipv6/ip6_fib.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index fc95738de..bda492634 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -636,11 +636,11 @@ static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
};
const struct nlmsghdr *nlh = cb->nlh;
struct net *net = sock_net(skb->sk);
- unsigned int e = 0, s_e;
struct hlist_head *head;
struct fib6_walker *w;
struct fib6_table *tb;
unsigned int h, s_h;
+ u32 s_id;
int err = 0;
rcu_read_lock();
@@ -701,23 +701,22 @@ static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
}
s_h = cb->args[0];
- s_e = cb->args[1];
+ s_id = cb->args[1];
- for (h = s_h; h < FIB6_TABLE_HASHSZ; h++, s_e = 0) {
- e = 0;
+ for (h = s_h; h < FIB6_TABLE_HASHSZ; h++, s_id = 0) {
head = &net->ipv6.fib_table_hash[h];
hlist_for_each_entry_rcu(tb, head, tb6_hlist) {
- if (e < s_e)
- goto next;
+ if (s_id && tb->tb6_id != s_id)
+ continue;
+ s_id = 0;
+
+ cb->args[1] = tb->tb6_id;
err = fib6_dump_table(tb, skb, cb);
if (err != 0)
goto out;
-next:
- e++;
}
}
out:
- cb->args[1] = e;
cb->args[0] = h;
unlock:
--
2.34.1
^ permalink raw reply related
* [PATCH v2] ipv6: fib6: fix NULL deref in fib6_walk_continue() on multi-batch dump
From: Pengfei Zhang @ 2026-06-25 1:23 UTC (permalink / raw)
To: dsahern, idosch
Cc: davem, edumazet, kuba, pabeni, horms, netdev, linux-kernel,
chenzhangqi, baohua, Pengfei Zhang, Pengfei Zhang
In-Reply-To: <20260624171156.822055-1-zhangfeionline@gmail.com>
From: Pengfei Zhang <zhangpengfei16@xiaomi.com>
inet6_dump_fib() saves its progress in cb->args[1] as a positional
index within the current hash chain. Between batches the RTNL lock
is released, so a concurrent fib6_new_table() can insert a new table
at the chain head, shifting all existing entries. The saved index
then lands on a different table, causing fib6_dump_table() to set
w->root to the wrong table while w->node still points into the
previous one. fib6_walk_continue() dereferences w->node->parent
(NULL) and panics:
BUG: kernel NULL pointer dereference, address: 0000000000000008
RIP: 0010:fib6_walk_continue+0x6e/0x170
Call Trace:
<TASK>
fib6_dump_table.isra.0+0xc5/0x240
inet6_dump_fib+0xf6/0x420
rtnl_dumpit+0x30/0xa0
netlink_dump+0x15b/0x460
netlink_recvmsg+0x1d6/0x2a0
____sys_recvmsg+0x17a/0x190
Fix by storing tb->tb6_id in cb->args[1] instead of a positional
index. On resume, skip entries until the id matches; a concurrent
head-insert can never match the saved id, so the walker always
resumes on the correct table.
Fixes: 1b43af5480c3 ("[IPV6]: Increase number of possible routing tables to 2^32")
Signed-off-by: Pengfei Zhang <zhangfeionline@gmail.com>
---
net/ipv6/ip6_fib.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index fc95738de..bda492634 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -636,11 +636,11 @@ static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
};
const struct nlmsghdr *nlh = cb->nlh;
struct net *net = sock_net(skb->sk);
- unsigned int e = 0, s_e;
struct hlist_head *head;
struct fib6_walker *w;
struct fib6_table *tb;
unsigned int h, s_h;
+ u32 s_id;
int err = 0;
rcu_read_lock();
@@ -701,23 +701,22 @@ static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
}
s_h = cb->args[0];
- s_e = cb->args[1];
+ s_id = cb->args[1];
- for (h = s_h; h < FIB6_TABLE_HASHSZ; h++, s_e = 0) {
- e = 0;
+ for (h = s_h; h < FIB6_TABLE_HASHSZ; h++, s_id = 0) {
head = &net->ipv6.fib_table_hash[h];
hlist_for_each_entry_rcu(tb, head, tb6_hlist) {
- if (e < s_e)
- goto next;
+ if (s_id && tb->tb6_id != s_id)
+ continue;
+ s_id = 0;
+
+ cb->args[1] = tb->tb6_id;
err = fib6_dump_table(tb, skb, cb);
if (err != 0)
goto out;
-next:
- e++;
}
}
out:
- cb->args[1] = e;
cb->args[0] = h;
unlock:
--
2.34.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox