From: Paolo Abeni <pabeni@redhat.com>
To: rosenp@gmail.com
Cc: netdev@vger.kernel.org, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 07/10] net: emac: fix DMA API mapping and unmapping correctness
Date: Thu, 2 Jul 2026 12:39:31 +0200 [thread overview]
Message-ID: <20260702103931.147571-1-pabeni@redhat.com> (raw)
In-Reply-To: <20260630041634.284127-8-rosenp@gmail.com>
From: AI Reviewer <ai@example.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: emac: fix DMA API mapping and unmapping correctness
This patch adds dma_mapping_error() checks after every dma_map_single()
and skb_frag_dma_map() call in the ibm emac driver, introduces a shadow
rx_dma[] array in struct emac_instance, and replaces the map-and-leak
pattern in emac_recycle_rx_skb() with dma_sync_single_for_device().
A few questions follow inline.
> net: emac: fix DMA API mapping and unmapping correctness
[Low]
The subject says the fix is for DMA API mapping and unmapping
correctness, but the in-tree comment added by this patch states:
+ * TX buffers still lack dma_unmap calls for the reasons explained in the
+ * original note below (a single skb may be split across multiple BDs on
+ * TAH-equipped EMACs, making per-fragment tracking complex).
Should the subject be scoped to the RX path (for example, "fix DMA API
correctness on the RX path") since the TX-side DMA API violation is
intentionally preserved?
[Low]
The commit message bullet says:
- emac_recycle_rx_skb: map first, check error, then unmap old;
use dma_sync_single_for_device() with SKB_DATA_ALIGN(len + NET_IP_ALIGN)
But emac_recycle_rx_skb() only calls dma_sync_single_for_device() and
performs no map/check/unmap sequence; the "map first, check, unmap old"
sequence is in the emac_poll_rx() MAL_RX_CTRL_FIRST branch. Would it
help to split this bullet across the two functions?
[Low]
The changelog ends with:
For whatever reason after this patch, ath9k stopped throwing DMA
errors with CONFIG_DMA_API_DEBUG.
Is this perhaps an artifact of CONFIG_DMA_API_DEBUG hash exhaustion
caused by the leaked mappings in the pre-patch driver? It might be
worth investigating and stating the mechanism rather than leaving it
as an unexplained side effect.
[Low]
The emac_recycle_rx_skb() change (discarding the returned DMA address
from dma_map_single()) looks like a long-standing functional bug on
systems with IOMMU/SWIOTLB. Would a Fixes: tag and stable consideration
be appropriate for that hunk?
> diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
> index ced9690cddc3..aed1ad21e2ea 100644
> --- a/drivers/net/ethernet/ibm/emac/core.c
> +++ b/drivers/net/ethernet/ibm/emac/core.c
[ ... ]
> @@ -1058,6 +1066,7 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
> /* Second pass, allocate new skbs */
> for (i = 0; i < NUM_RX_BUFF; ++i) {
> struct sk_buff *skb;
> + dma_addr_t dma;
>
> skb = netdev_alloc_skb_ip_align(dev->ndev, rx_skb_size);
> if (!skb) {
> @@ -1066,12 +1075,24 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
> }
>
> BUG_ON(!dev->rx_skb[i]);
> + dma_unmap_single(&dev->ofdev->dev,
> + dev->rx_dma[i],
> + dev->rx_sync_size, DMA_FROM_DEVICE);
> dev_kfree_skb(dev->rx_skb[i]);
>
> - dev->rx_desc[i].data_ptr =
> - dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
> - rx_sync_size, DMA_FROM_DEVICE)
> - + NET_IP_ALIGN;
> + dma = dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
> + rx_sync_size, DMA_FROM_DEVICE);
> + if (dma_mapping_error(&dev->ofdev->dev, dma)) {
> + dev_kfree_skb(skb);
> + dev->rx_skb[i] = NULL;
> + dev->rx_dma[i] = 0;
> + dev->rx_desc[i].data_ptr = 0;
> + dev->rx_desc[i].ctrl = 0;
> + ret = -ENOMEM;
> + goto oom;
> + }
[High]
Can this error path leave the descriptor ring in a state that crashes
the next NAPI poll?
The first pass earlier in emac_resize_rx_ring() initialises every
slot's ctrl with:
dev->rx_desc[i].ctrl = MAL_RX_CTRL_EMPTY |
(i == (NUM_RX_BUFF - 1) ? MAL_RX_CTRL_WRAP : 0);
so software (which polls with "if (ctrl & MAL_RX_CTRL_EMPTY) break;")
treats those slots as owned by hardware and skips them. The new error
path here sets ctrl = 0 instead, which clears MAL_RX_CTRL_EMPTY (and
MAL_RX_CTRL_WRAP for the last slot).
The oom: label below unconditionally re-enables RX:
oom:
clear_bit(MAL_COMMAC_RX_STOPPED, &dev->commac.flags);
dev->rx_slot = 0;
mal_enable_rx_channel(dev->mal, dev->mal_rx_chan);
emac_rx_enable(dev);
emac_netif_start(dev);
On the next emac_poll_rx() call, the loop reads ctrl = 0, finds the
EMPTY bit clear, and proceeds with skb = dev->rx_skb[slot] = NULL,
followed by memcpy()/skb_put()/eth_type_trans() on that NULL skb.
Additionally, the slot's rx_skb[i] is now NULL but the loop above
starts each iteration with BUG_ON(!dev->rx_skb[i]). Would a subsequent
MTU change deterministically panic on that BUG_ON()?
Should this error path mirror the first pass and use MAL_RX_CTRL_EMPTY
(plus MAL_RX_CTRL_WRAP for the last slot) rather than zero?
[ ... ]
> @@ -1176,15 +1201,23 @@ static void emac_clear_mal_desc(struct mal_descriptor *desc, int count)
> static int
> __emac_prepare_rx_skb(struct sk_buff *skb, struct emac_instance *dev, int slot)
> {
> + dma_addr_t dma;
> +
> if (unlikely(!skb))
> return -ENOMEM;
>
> dev->rx_skb[slot] = skb;
> dev->rx_desc[slot].data_len = 0;
>
> - dev->rx_desc[slot].data_ptr =
> - dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
> - dev->rx_sync_size, DMA_FROM_DEVICE) + NET_IP_ALIGN;
> + dma = dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
> + dev->rx_sync_size, DMA_FROM_DEVICE);
> + if (dma_mapping_error(&dev->ofdev->dev, dma)) {
> + dev->rx_skb[slot] = NULL;
> + dev_kfree_skb(skb);
> + return -ENOMEM;
> + }
> + dev->rx_desc[slot].data_ptr = dma + NET_IP_ALIGN;
> + dev->rx_dma[slot] = dma;
[Critical]
Can this corrupt the slot state on a DMA mapping failure?
The new skb is installed into the slot before the mapping is performed:
dev->rx_skb[slot] = skb;
dev->rx_desc[slot].data_len = 0;
dma = dma_map_single(...);
if (dma_mapping_error(...)) {
dev->rx_skb[slot] = NULL;
dev_kfree_skb(skb);
return -ENOMEM;
}
On failure, the new skb is freed and rx_skb[slot] is set to NULL, but
dev->rx_desc[slot].data_ptr and dev->rx_dma[slot] still hold the old
mapping, and the previous skb pointer that the caller saved (the
emac_poll_rx() local "skb") is also lost.
The emac_poll_rx() caller paths handle the failure as:
} else if (unlikely(emac_alloc_rx_skb_napi(dev, slot)))
goto oom;
...
oom:
++dev->estats.rx_dropped_oom;
emac_recycle_rx_skb(dev, slot, 0);
emac_recycle_rx_skb() then re-arms the descriptor with
MAL_RX_CTRL_EMPTY, handing the still-mapped old buffer back to
hardware. On the next received packet, dev->rx_skb[slot] is NULL and:
skb = dev->rx_skb[slot];
...
memcpy(copy_skb->data - NET_IP_ALIGN,
skb->data - NET_IP_ALIGN,
len + NET_IP_ALIGN);
would dereference NULL in NAPI context. The MAL_RX_CTRL_FIRST branch
that calls emac_alloc_rx_skb_napi() has the same shape.
Would it be safer to perform the dma_map_single() and error check
before installing the new skb into rx_skb[slot], so the slot still
points at a valid skb and mapping on failure?
[ ... ]
> @@ -1808,12 +1863,17 @@ static int emac_poll_rx(void *param, int budget)
> continue;
> sg:
> if (ctrl & MAL_RX_CTRL_FIRST) {
> + dma_addr_t old_dma = dev->rx_dma[slot];
> +
> BUG_ON(dev->rx_sg_skb);
> if (unlikely(emac_alloc_rx_skb_napi(dev, slot))) {
> DBG(dev, "rx OOM %d" NL, slot);
> ++dev->estats.rx_dropped_oom;
> emac_recycle_rx_skb(dev, slot, 0);
> } else {
> + dma_unmap_single(&dev->ofdev->dev, old_dma,
> + dev->rx_sync_size,
> + DMA_FROM_DEVICE);
> dev->rx_sg_skb = skb;
> skb_put(skb, len);
> }
[High]
The commit message bullet says:
- emac_poll_rx: use rx_dma[] shadow array, unmap old skb before
passing to napi_gro_receive
but the new dma_unmap_single() above only covers the SG
MAL_RX_CTRL_FIRST branch. The non-SG large-packet path
(len >= EMAC_RX_COPY_THRESH) is in emac_poll_rx():
skb = dev->rx_skb[slot];
...
if (len && len < EMAC_RX_COPY_THRESH) {
...
} else if (unlikely(emac_alloc_rx_skb_napi(dev, slot)))
goto oom;
skb_put(skb, len);
push_packet:
skb->protocol = eth_type_trans(skb, dev->ndev);
emac_rx_csum(dev, skb, ctrl);
napi_gro_receive(&dev->mal->napi, skb);
emac_alloc_rx_skb_napi() calls __emac_prepare_rx_skb() which
overwrites dev->rx_dma[slot] with the new mapping, so the previous
mapping address is lost. The original "skb" is then handed to
napi_gro_receive() while still mapped DMA_FROM_DEVICE.
Does this leak one DMA mapping per large-packet RX, and on the
non-coherent PowerPC platforms this patch targets, hand the stack a
buffer that has not been dma_unmap_single()'d?
Should the non-SG large-packet branch also save old_dma before
emac_alloc_rx_skb_napi() and call dma_unmap_single() before
napi_gro_receive(), mirroring the SG-first branch?
[ ... ]
--
This is an AI-generated review.
next prev parent reply other threads:[~2026-07-02 10:39 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-30 4:16 [PATCH 00/10] net: emac: various cleanups, fixes, and feature additions Rosen Penev
2026-06-30 4:16 ` [PATCH 01/10] net: emac: remove emac_xaht_base() Rosen Penev
2026-06-30 4:16 ` [PATCH 02/10] net: emac: fix sparse __iomem warnings in IAHT register access Rosen Penev
2026-07-02 10:37 ` Paolo Abeni
2026-06-30 4:16 ` [PATCH 03/10] net: emac: use DMA-specific and SMP memory barriers Rosen Penev
2026-07-02 10:42 ` Paolo Abeni
2026-06-30 4:16 ` [PATCH 04/10] net: emac: mal: replace of_get_property with of_property_read_u32 Rosen Penev
2026-06-30 4:16 ` [PATCH 05/10] net: emac: mal: replace busy-wait in mal_poll_disable with wait_event Rosen Penev
2026-07-02 10:46 ` Paolo Abeni
2026-06-30 4:16 ` [PATCH 06/10] net: emac: batch stats, eliminate modulo, tighten barrier in RX poll Rosen Penev
2026-06-30 4:16 ` [PATCH 07/10] net: emac: fix DMA API mapping and unmapping correctness Rosen Penev
2026-07-02 10:39 ` Paolo Abeni [this message]
2026-06-30 4:16 ` [PATCH 08/10] net: emac: replace #ifdef CONFIG_PPC_DCR_NATIVE with IS_ENABLED() Rosen Penev
2026-06-30 4:16 ` [PATCH 09/10] net: emac: add Byte Queue Limits (BQL) support Rosen Penev
2026-06-30 4:16 ` [PATCH 10/10] net: emac: use ndo_get_stats64 instead of ndo_get_stats Rosen Penev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260702103931.147571-1-pabeni@redhat.com \
--to=pabeni@redhat.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=rosenp@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox