From: Rosen Penev <rosenp@gmail.com>
To: netdev@vger.kernel.org
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
linux-kernel@vger.kernel.org (open list)
Subject: [PATCH 07/10] net: emac: fix DMA API mapping and unmapping correctness
Date: Mon, 29 Jun 2026 21:16:31 -0700 [thread overview]
Message-ID: <20260630041634.284127-8-rosenp@gmail.com> (raw)
In-Reply-To: <20260630041634.284127-1-rosenp@gmail.com>
Add missing dma_mapping_error() checks after every dma_map_single()
and skb_frag_dma_map() call. Without these, CONFIG_DMA_API_DEBUG
emits:
DMA-API: emac ... device driver failed to check map error
Also fix emac_recycle_rx_skb() which called dma_map_single() but
discarded the returned DMA address -- the descriptor kept a stale
(unmapped) address after the old mapping was freed.
The RX descriptors are allocated in coherent (uncached) DMA memory.
Add a shadow rx_dma[] array in regular cached memory to store the
raw DMA address of each RX slot, avoiding a slow uncached read of
dev->rx_desc[slot].data_ptr on the per-packet hot path. This
prevents a measurable throughput regression on non-coherent PowerPC
platforms where the original fix added such a read.
In emac_recycle_rx_skb(), use dma_sync_single_for_device() with the
actual received length (cache-line-aligned) instead of destroying
and recreating the mapping. The mapping is long-lived; only the
bytes touched by skb_copy_from_linear_data_offset() need
synchronization.
- __emac_prepare_rx_skb: check map error, free skb on failure
- emac_resize_rx_ring: check map error, invalidate slot on failure
- emac_recycle_rx_skb: map first, check error, then unmap old;
use dma_sync_single_for_device() with SKB_DATA_ALIGN(len + NET_IP_ALIGN)
- emac_start_xmit: check map error, free skb on failure
- emac_start_xmit_sg: check map error on both data and frag maps,
undo partial descriptor setup on frag failure
- emac_poll_rx: use rx_dma[] shadow array, unmap old skb before
passing to napi_gro_receive
- emac_clean_rx_ring: use rx_dma[] shadow array for consistency
There's a small performance decrease tested with iperf3
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 574 MBytes 481 Mbits/sec 1 sender
[ 5] 0.00-10.00 sec 572 MBytes 479 Mbits/sec receiver
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 558 MBytes 468 Mbits/sec 0 sender
[ 5] 0.00-10.00 sec 556 MBytes 466 Mbits/sec receiver
but probably worth it. For whatever reason after this patch, ath9k
stopped throwing DMA errors with CONFIG_DMA_API_DEBUG.
Assisted-by: opencode:big-pickle
Signed-off-by: Rosen Penev <rosenp@gmail.com>
---
drivers/net/ethernet/ibm/emac/core.c | 92 +++++++++++++++++++++++-----
drivers/net/ethernet/ibm/emac/core.h | 1 +
2 files changed, 77 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index ced9690cddc3..aed1ad21e2ea 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -52,7 +52,15 @@
#include "core.h"
/*
- * Lack of dma_unmap_???? calls is intentional.
+ * Note on dma_unmap calls:
+ *
+ * RX buffers are properly unmapped before being remapped or passed to the
+ * network stack. See emac_recycle_rx_skb() and emac_poll_rx().
+ *
+ * TX buffers still lack dma_unmap calls for the reasons explained in the
+ * original note below (a single skb may be split across multiple BDs on
+ * TAH-equipped EMACs, making per-fragment tracking complex).
+ * The original rationale is kept for the TX path only:
*
* API-correct usage requires additional support state information to be
* maintained for every RX and TX buffer descriptor (BD). Unfortunately, due to
@@ -1058,6 +1066,7 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
/* Second pass, allocate new skbs */
for (i = 0; i < NUM_RX_BUFF; ++i) {
struct sk_buff *skb;
+ dma_addr_t dma;
skb = netdev_alloc_skb_ip_align(dev->ndev, rx_skb_size);
if (!skb) {
@@ -1066,12 +1075,24 @@ static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
}
BUG_ON(!dev->rx_skb[i]);
+ dma_unmap_single(&dev->ofdev->dev,
+ dev->rx_dma[i],
+ dev->rx_sync_size, DMA_FROM_DEVICE);
dev_kfree_skb(dev->rx_skb[i]);
- dev->rx_desc[i].data_ptr =
- dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
- rx_sync_size, DMA_FROM_DEVICE)
- + NET_IP_ALIGN;
+ dma = dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
+ rx_sync_size, DMA_FROM_DEVICE);
+ if (dma_mapping_error(&dev->ofdev->dev, dma)) {
+ dev_kfree_skb(skb);
+ dev->rx_skb[i] = NULL;
+ dev->rx_dma[i] = 0;
+ dev->rx_desc[i].data_ptr = 0;
+ dev->rx_desc[i].ctrl = 0;
+ ret = -ENOMEM;
+ goto oom;
+ }
+ dev->rx_desc[i].data_ptr = dma + NET_IP_ALIGN;
+ dev->rx_dma[i] = dma;
dev->rx_skb[i] = skb;
}
skip:
@@ -1150,9 +1171,13 @@ static void emac_clean_rx_ring(struct emac_instance *dev)
for (i = 0; i < NUM_RX_BUFF; ++i)
if (dev->rx_skb[i]) {
+ dma_unmap_single(&dev->ofdev->dev,
+ dev->rx_dma[i],
+ dev->rx_sync_size, DMA_FROM_DEVICE);
dev->rx_desc[i].ctrl = 0;
dev_kfree_skb(dev->rx_skb[i]);
dev->rx_skb[i] = NULL;
+ dev->rx_dma[i] = 0;
dev->rx_desc[i].data_ptr = 0;
}
@@ -1176,15 +1201,23 @@ static void emac_clear_mal_desc(struct mal_descriptor *desc, int count)
static int
__emac_prepare_rx_skb(struct sk_buff *skb, struct emac_instance *dev, int slot)
{
+ dma_addr_t dma;
+
if (unlikely(!skb))
return -ENOMEM;
dev->rx_skb[slot] = skb;
dev->rx_desc[slot].data_len = 0;
- dev->rx_desc[slot].data_ptr =
- dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
- dev->rx_sync_size, DMA_FROM_DEVICE) + NET_IP_ALIGN;
+ dma = dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
+ dev->rx_sync_size, DMA_FROM_DEVICE);
+ if (dma_mapping_error(&dev->ofdev->dev, dma)) {
+ dev->rx_skb[slot] = NULL;
+ dev_kfree_skb(skb);
+ return -ENOMEM;
+ }
+ dev->rx_desc[slot].data_ptr = dma + NET_IP_ALIGN;
+ dev->rx_dma[slot] = dma;
dma_wmb();
dev->rx_desc[slot].ctrl = MAL_RX_CTRL_EMPTY |
(slot == (NUM_RX_BUFF - 1) ? MAL_RX_CTRL_WRAP : 0);
@@ -1463,6 +1496,12 @@ static netdev_tx_t emac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
dev->tx_desc[slot].data_ptr = dma_map_single(&dev->ofdev->dev,
skb->data, len,
DMA_TO_DEVICE);
+ if (dma_mapping_error(&dev->ofdev->dev,
+ dev->tx_desc[slot].data_ptr)) {
+ dev->tx_skb[slot] = NULL;
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
dev->tx_desc[slot].data_len = (u16) len;
dma_wmb();
dev->tx_desc[slot].ctrl = ctrl;
@@ -1530,8 +1569,12 @@ emac_start_xmit_sg(struct sk_buff *skb, struct net_device *ndev)
/* skb data */
dev->tx_skb[slot] = NULL;
chunk = min(len, MAL_MAX_TX_SIZE);
- dev->tx_desc[slot].data_ptr = pd =
- dma_map_single(&dev->ofdev->dev, skb->data, len, DMA_TO_DEVICE);
+ pd = dma_map_single(&dev->ofdev->dev, skb->data, len, DMA_TO_DEVICE);
+ if (dma_mapping_error(&dev->ofdev->dev, pd)) {
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
+ dev->tx_desc[slot].data_ptr = pd;
dev->tx_desc[slot].data_len = (u16) chunk;
len -= chunk;
if (unlikely(len))
@@ -1547,6 +1590,18 @@ emac_start_xmit_sg(struct sk_buff *skb, struct net_device *ndev)
pd = skb_frag_dma_map(&dev->ofdev->dev, frag, 0, len,
DMA_TO_DEVICE);
+ if (dma_mapping_error(&dev->ofdev->dev, pd)) {
+ /* Undo partial descriptor setup and drop packet */
+ while (slot != dev->tx_slot) {
+ dev->tx_desc[slot].ctrl = 0;
+ --dev->tx_cnt;
+ if (--slot < 0)
+ slot = NUM_TX_BUFF - 1;
+ }
+ ++dev->estats.tx_undo;
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
slot = emac_xmit_split(dev, slot, pd, len, i == nr_frags - 1,
ctrl);
@@ -1661,14 +1716,14 @@ static void emac_poll_tx(void *param)
static inline void emac_recycle_rx_skb(struct emac_instance *dev, int slot,
int len)
{
- struct sk_buff *skb = dev->rx_skb[slot];
-
DBG2(dev, "recycle %d %d" NL, slot, len);
- if (len)
- dma_map_single(&dev->ofdev->dev, skb->data - NET_IP_ALIGN,
- SKB_DATA_ALIGN(len + NET_IP_ALIGN),
- DMA_FROM_DEVICE);
+ if (len) {
+ dma_sync_single_for_device(&dev->ofdev->dev,
+ dev->rx_dma[slot],
+ SKB_DATA_ALIGN(len + NET_IP_ALIGN),
+ DMA_FROM_DEVICE);
+ }
dev->rx_desc[slot].data_len = 0;
dma_wmb();
@@ -1808,12 +1863,17 @@ static int emac_poll_rx(void *param, int budget)
continue;
sg:
if (ctrl & MAL_RX_CTRL_FIRST) {
+ dma_addr_t old_dma = dev->rx_dma[slot];
+
BUG_ON(dev->rx_sg_skb);
if (unlikely(emac_alloc_rx_skb_napi(dev, slot))) {
DBG(dev, "rx OOM %d" NL, slot);
++dev->estats.rx_dropped_oom;
emac_recycle_rx_skb(dev, slot, 0);
} else {
+ dma_unmap_single(&dev->ofdev->dev, old_dma,
+ dev->rx_sync_size,
+ DMA_FROM_DEVICE);
dev->rx_sg_skb = skb;
skb_put(skb, len);
}
diff --git a/drivers/net/ethernet/ibm/emac/core.h b/drivers/net/ethernet/ibm/emac/core.h
index 296da4bf3781..0719f98f3325 100644
--- a/drivers/net/ethernet/ibm/emac/core.h
+++ b/drivers/net/ethernet/ibm/emac/core.h
@@ -246,6 +246,7 @@ struct emac_instance {
struct sk_buff *tx_skb[NUM_TX_BUFF];
struct sk_buff *rx_skb[NUM_RX_BUFF];
+ dma_addr_t rx_dma[NUM_RX_BUFF];
/* Stats
*/
--
2.54.0
next prev parent reply other threads:[~2026-06-30 4:16 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-30 4:16 [PATCH 00/10] net: emac: various cleanups, fixes, and feature additions Rosen Penev
2026-06-30 4:16 ` [PATCH 01/10] net: emac: remove emac_xaht_base() Rosen Penev
2026-06-30 4:16 ` [PATCH 02/10] net: emac: fix sparse __iomem warnings in IAHT register access Rosen Penev
2026-06-30 4:16 ` [PATCH 03/10] net: emac: use DMA-specific and SMP memory barriers Rosen Penev
2026-06-30 4:16 ` [PATCH 04/10] net: emac: mal: replace of_get_property with of_property_read_u32 Rosen Penev
2026-06-30 4:16 ` [PATCH 05/10] net: emac: mal: replace busy-wait in mal_poll_disable with wait_event Rosen Penev
2026-06-30 4:16 ` [PATCH 06/10] net: emac: batch stats, eliminate modulo, tighten barrier in RX poll Rosen Penev
2026-06-30 4:16 ` Rosen Penev [this message]
2026-06-30 4:16 ` [PATCH 08/10] net: emac: replace #ifdef CONFIG_PPC_DCR_NATIVE with IS_ENABLED() Rosen Penev
2026-06-30 4:16 ` [PATCH 09/10] net: emac: add Byte Queue Limits (BQL) support Rosen Penev
2026-06-30 4:16 ` [PATCH 10/10] net: emac: use ndo_get_stats64 instead of ndo_get_stats Rosen Penev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260630041634.284127-8-rosenp@gmail.com \
--to=rosenp@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox