* [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support
@ 2026-03-10 21:21 Joe Damato
2026-03-10 21:21 ` [RFC net-next 01/10] net: tso: Introduce tso_dma_map Joe Damato
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev
Cc: michael.chan, pavan.chebbi, linux-kernel, Joe Damato,
Alexei Starovoitov, Andrew Lunn, bpf, Daniel Borkmann,
David S. Miller, Eric Dumazet, Jakub Kicinski,
Jesper Dangaard Brouer, John Fastabend, Paolo Abeni,
Richard Cochran, Simon Horman, Stanislav Fomichev
Greetings:
This series extends net/tso with a data structure and helpers that allow
drivers to DMA-map headers and packet payloads a single time. The helpers
can then be used to reference slices of the shared mapping for each
segment. This avoids the cost of repeated DMA mapping, which is especially
high on systems using an IOMMU: N per-packet DMA maps are replaced with a
single map for the entire GSO skb.
The added helpers are then used in bnxt to add support for software UDP
Segmentation Offload (SW USO) on older bnxt devices that lack hardware
USO support. Since the helpers are generic, other drivers can be extended
similarly.
Early testing shows a ~4x reduction in DMA mapping calls at the same wire
packet rate.
Special care is taken to keep bnxt ethtool operations working correctly:
the ring size cannot be reduced below a minimum threshold while USO is
enabled, and growing the ring automatically re-enables USO if it was
previously blocked.
Thanks,
Joe
Joe Damato (10):
net: tso: Introduce tso_dma_map
net: tso: Add tso_dma_map helpers
net: bnxt: Export bnxt_xmit_get_cfa_action
net: bnxt: Add a helper for tx_bd_ext
net: bnxt: Use dma_unmap_len for TX completion unmapping
net: bnxt: Add TX inline buffer infrastructure
net: bnxt: Add boilerplate GSO code
net: bnxt: Implement software USO
net: bnxt: Add SW GSO completion and teardown support
net: bnxt: Dispatch to SW USO
drivers/net/ethernet/broadcom/bnxt/Makefile | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 159 ++++++++++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 27 +++
.../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 19 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c | 188 ++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h | 31 +++
include/net/tso.h | 45 +++++
net/core/tso.c | 165 +++++++++++++++
8 files changed, 601 insertions(+), 35 deletions(-)
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h
--
2.52.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC net-next 01/10] net: tso: Introduce tso_dma_map
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 02/10] net: tso: Add tso_dma_map helpers Joe Damato
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: michael.chan, pavan.chebbi, linux-kernel, Joe Damato
Add struct tso_dma_map to tso.h for tracking DMA addresses of mapped
GSO payload data.
The struct combines DMA mapping storage (linear_dma, frags[]) with
iterator state (frag_idx, offset), allowing drivers to walk pre-mapped
DMA regions linearly. Helpers to initialize and operate on this struct
will be added in the next commit.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
include/net/tso.h | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/include/net/tso.h b/include/net/tso.h
index e7e157ae0526..9a508e60ee19 100644
--- a/include/net/tso.h
+++ b/include/net/tso.h
@@ -3,6 +3,7 @@
#define _TSO_H
#include <linux/skbuff.h>
+#include <linux/dma-mapping.h>
#include <net/ip.h>
#define TSO_HEADER_SIZE 256
@@ -28,4 +29,40 @@ void tso_build_hdr(const struct sk_buff *skb, char *hdr, struct tso_t *tso,
void tso_build_data(const struct sk_buff *skb, struct tso_t *tso, int size);
int tso_start(struct sk_buff *skb, struct tso_t *tso);
+/**
+ * struct tso_dma_map - DMA mapping state for GSO payload
+ * @dev: device used for DMA mapping
+ * @skb: the GSO skb being mapped
+ * @hdr_len: per-segment header length
+ * @frag_idx: current region (-1 = linear, 0..nr_frags-1 = frag)
+ * @offset: byte offset within current region
+ * @linear_dma: DMA address of the linear payload (after headers)
+ * @linear_len: length of the linear payload
+ * @nr_frags: number of frags successfully DMA-mapped
+ * @frags: per-frag DMA address and length
+ *
+ * Struct that DMA-maps the payload regions of a GSO skb
+ * (linear data + frags) upfront, then provides iteration to yield
+ * (dma_addr, chunk_len) pairs bounded by region boundaries.
+ *
+ * Drivers set dma_unmap_len on the first descriptor touching each DMA
+ * mapping; the completion path unmaps via per-descriptor dma_unmap_len.
+ */
+struct tso_dma_map {
+ struct device *dev;
+ const struct sk_buff *skb;
+ unsigned int hdr_len;
+ /* Iterator state */
+ int frag_idx;
+ unsigned int offset;
+ /* Pre-mapped regions */
+ dma_addr_t linear_dma;
+ unsigned int linear_len;
+ unsigned int nr_frags;
+ struct {
+ dma_addr_t dma;
+ unsigned int len;
+ } frags[MAX_SKB_FRAGS];
+};
+
#endif /* _TSO_H */
--
2.52.0
* [RFC net-next 02/10] net: tso: Add tso_dma_map helpers
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
2026-03-10 21:21 ` [RFC net-next 01/10] net: tso: Introduce tso_dma_map Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 03/10] net: bnxt: Export bnxt_xmit_get_cfa_action Joe Damato
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: michael.chan, pavan.chebbi, linux-kernel, Joe Damato
Add helpers to initialize, iterate, and clean up a tso_dma_map:
tso_dma_map_init(): DMA-maps the linear payload region and all frags
upfront into the tso_dma_map struct. Returns 0 on success; on failure it
cleans up any partial mappings.
tso_dma_map_cleanup(): unmaps all DMA regions. Used on error paths.
tso_dma_map_count(): counts how many descriptors the next N bytes of
payload will need, without advancing the iterator.
tso_dma_map_next(): yields the next (dma_addr, chunk_len) pair.
Indicates when a chunk starts a new DMA mapping so the driver can set
dma_unmap_len on that BD for completion-time unmapping.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
include/net/tso.h | 8 +++
net/core/tso.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 173 insertions(+)
diff --git a/include/net/tso.h b/include/net/tso.h
index 9a508e60ee19..dcb93c7fb917 100644
--- a/include/net/tso.h
+++ b/include/net/tso.h
@@ -65,4 +65,12 @@ struct tso_dma_map {
} frags[MAX_SKB_FRAGS];
};
+int tso_dma_map_init(struct tso_dma_map *map, struct device *dev,
+ const struct sk_buff *skb, unsigned int hdr_len);
+void tso_dma_map_cleanup(struct tso_dma_map *map);
+unsigned int tso_dma_map_count(const struct tso_dma_map *map, unsigned int len);
+bool tso_dma_map_next(struct tso_dma_map *map, dma_addr_t *addr,
+ unsigned int *chunk_len, unsigned int *mapping_len,
+ unsigned int seg_remaining);
+
#endif /* _TSO_H */
diff --git a/net/core/tso.c b/net/core/tso.c
index 6df997b9076e..48348ad94501 100644
--- a/net/core/tso.c
+++ b/net/core/tso.c
@@ -3,6 +3,7 @@
#include <linux/if_vlan.h>
#include <net/ip.h>
#include <net/tso.h>
+#include <linux/dma-mapping.h>
#include <linux/unaligned.h>
void tso_build_hdr(const struct sk_buff *skb, char *hdr, struct tso_t *tso,
@@ -87,3 +88,167 @@ int tso_start(struct sk_buff *skb, struct tso_t *tso)
return hdr_len;
}
EXPORT_SYMBOL(tso_start);
+
+/**
+ * tso_dma_map_init - DMA-map GSO payload regions
+ * @map: map struct to initialize
+ * @dev: device for DMA mapping
+ * @skb: the GSO skb
+ * @hdr_len: per-segment header length in bytes
+ *
+ * DMA-maps the linear payload (after headers) and all frags.
+ * Positions the iterator at byte 0 of the payload.
+ *
+ * Returns 0 on success, -ENOMEM on DMA mapping failure (partial mappings
+ * are cleaned up internally).
+ */
+int tso_dma_map_init(struct tso_dma_map *map, struct device *dev,
+ const struct sk_buff *skb, unsigned int hdr_len)
+{
+ unsigned int linear_len = skb_headlen(skb) - hdr_len;
+ unsigned int nr_frags = skb_shinfo(skb)->nr_frags;
+ int i;
+
+ map->dev = dev;
+ map->skb = skb;
+ map->hdr_len = hdr_len;
+ map->frag_idx = -1;
+ map->offset = 0;
+ map->linear_len = 0;
+ map->nr_frags = 0;
+
+ if (linear_len > 0) {
+ map->linear_dma = dma_map_single(dev, skb->data + hdr_len,
+ linear_len, DMA_TO_DEVICE);
+ if (dma_mapping_error(dev, map->linear_dma))
+ return -ENOMEM;
+ map->linear_len = linear_len;
+ }
+
+ for (i = 0; i < nr_frags; i++) {
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+ map->frags[i].len = skb_frag_size(frag);
+ map->frags[i].dma = skb_frag_dma_map(dev, frag, 0,
+ map->frags[i].len,
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(dev, map->frags[i].dma)) {
+ tso_dma_map_cleanup(map);
+ return -ENOMEM;
+ }
+ map->nr_frags = i + 1;
+ }
+
+ if (linear_len == 0 && nr_frags > 0)
+ map->frag_idx = 0;
+
+ return 0;
+}
+EXPORT_SYMBOL(tso_dma_map_init);
+
+/**
+ * tso_dma_map_cleanup - unmap all DMA regions in a tso_dma_map
+ * @map: the map to clean up
+ *
+ * Unmaps linear payload and all mapped frags. Used on error paths.
+ * Success paths use the driver's completion path to handle unmapping.
+ */
+void tso_dma_map_cleanup(struct tso_dma_map *map)
+{
+ int i;
+
+ if (map->linear_len)
+ dma_unmap_single(map->dev, map->linear_dma, map->linear_len,
+ DMA_TO_DEVICE);
+
+ for (i = 0; i < map->nr_frags; i++)
+ dma_unmap_page(map->dev, map->frags[i].dma, map->frags[i].len,
+ DMA_TO_DEVICE);
+
+ map->linear_len = 0;
+ map->nr_frags = 0;
+}
+EXPORT_SYMBOL(tso_dma_map_cleanup);
+
+/**
+ * tso_dma_map_count - count descriptors for a payload range
+ * @map: the payload map
+ * @len: number of payload bytes in this segment
+ *
+ * Counts how many contiguous DMA region chunks the next @len bytes
+ * will span, without advancing the iterator. Uses frag sizes from
+ * the current position.
+ *
+ * Returns the number of descriptors needed for @len bytes of payload.
+ */
+unsigned int tso_dma_map_count(const struct tso_dma_map *map, unsigned int len)
+{
+ unsigned int offset = map->offset;
+ int idx = map->frag_idx;
+ unsigned int count = 0;
+
+ while (len > 0) {
+ unsigned int region_len, chunk;
+
+ if (idx == -1)
+ region_len = map->linear_len;
+ else
+ region_len = map->frags[idx].len;
+
+ chunk = min(len, region_len - offset);
+ len -= chunk;
+ count++;
+ offset = 0;
+ idx++;
+ }
+
+ return count;
+}
+EXPORT_SYMBOL(tso_dma_map_count);
+
+/**
+ * tso_dma_map_next - yield the next DMA address range
+ * @map: the payload map
+ * @addr: output DMA address
+ * @chunk_len: output chunk length
+ * @mapping_len: output full mapping length when this is the first BD of
+ * a DMA mapping (driver should set dma_unmap_len to this),
+ * or 0 when continuing a previous mapping
+ * @seg_remaining: bytes left in current segment
+ *
+ * Yields the next (dma_addr, chunk_len) pair and advances the iterator.
+ *
+ * Returns true if a chunk was yielded, false when @seg_remaining is 0.
+ */
+bool tso_dma_map_next(struct tso_dma_map *map, dma_addr_t *addr,
+ unsigned int *chunk_len, unsigned int *mapping_len,
+ unsigned int seg_remaining)
+{
+ unsigned int region_len, chunk;
+
+ if (!seg_remaining)
+ return false;
+
+ if (map->frag_idx == -1) {
+ region_len = map->linear_len;
+ chunk = min(seg_remaining, region_len - map->offset);
+ *addr = map->linear_dma + map->offset;
+ *mapping_len = (map->offset == 0) ? region_len : 0;
+ } else {
+ region_len = map->frags[map->frag_idx].len;
+ chunk = min(seg_remaining, region_len - map->offset);
+ *addr = map->frags[map->frag_idx].dma + map->offset;
+ *mapping_len = (map->offset == 0) ? region_len : 0;
+ }
+
+ *chunk_len = chunk;
+ map->offset += chunk;
+
+ if (map->offset >= region_len) {
+ map->frag_idx++;
+ map->offset = 0;
+ }
+
+ return true;
+}
+EXPORT_SYMBOL(tso_dma_map_next);
--
2.52.0
* [RFC net-next 03/10] net: bnxt: Export bnxt_xmit_get_cfa_action
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
2026-03-10 21:21 ` [RFC net-next 01/10] net: tso: Introduce tso_dma_map Joe Damato
2026-03-10 21:21 ` [RFC net-next 02/10] net: tso: Add tso_dma_map helpers Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 04/10] net: bnxt: Add a helper for tx_bd_ext Joe Damato
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Export bnxt_xmit_get_cfa_action so that it can be used in future commits
which add software USO support to bnxt.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 84095d09ecf8..5033aab2d3a6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -445,7 +445,7 @@ const u16 bnxt_lhint_arr[] = {
TX_BD_FLAGS_LHINT_2048_AND_LARGER,
};
-static u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb)
+u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb)
{
struct metadata_dst *md_dst = skb_metadata_dst(skb);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 663708fd3cbc..48c91cdf1975 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2936,6 +2936,7 @@ unsigned int bnxt_get_avail_cp_rings_for_en(struct bnxt *bp);
int bnxt_reserve_rings(struct bnxt *bp, bool irq_re_init);
void bnxt_tx_disable(struct bnxt *bp);
void bnxt_tx_enable(struct bnxt *bp);
+u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb);
void bnxt_sched_reset_txr(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
u16 curr);
void bnxt_report_link(struct bnxt *bp);
--
2.52.0
* [RFC net-next 04/10] net: bnxt: Add a helper for tx_bd_ext
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (2 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 03/10] net: bnxt: Export bnxt_xmit_get_cfa_action Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 05/10] net: bnxt: Use dma_unmap_len for TX completion unmapping Joe Damato
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Factor out the code that sets up tx_bd_ext into a helper function. This
helper will be used by the SW USO implementation in the following
commits.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 ++-------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 18 ++++++++++++++++++
2 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5033aab2d3a6..c52db7135ded 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -664,10 +664,9 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
txbd->tx_bd_opaque = SET_TX_OPAQUE(bp, txr, prod, 2 + last_frag);
prod = NEXT_TX(prod);
- txbd1 = (struct tx_bd_ext *)
- &txr->tx_desc_ring[TX_RING(bp, prod)][TX_IDX(prod)];
+ txbd1 = bnxt_init_ext_bd(bp, txr, prod, lflags, vlan_tag_flags,
+ cfa_action);
- txbd1->tx_bd_hsize_lflags = lflags;
if (skb_is_gso(skb)) {
bool udp_gso = !!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4);
u32 hdr_len;
@@ -694,7 +693,6 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
txbd1->tx_bd_hsize_lflags |=
cpu_to_le32(TX_BD_FLAGS_TCP_UDP_CHKSUM);
- txbd1->tx_bd_mss = 0;
}
length >>= 9;
@@ -707,9 +705,6 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
flags |= bnxt_lhint_arr[length];
txbd->tx_bd_len_flags_type = cpu_to_le32(flags);
- txbd1->tx_bd_cfa_meta = cpu_to_le32(vlan_tag_flags);
- txbd1->tx_bd_cfa_action =
- cpu_to_le32(cfa_action << TX_BD_CFA_ACTION_SHIFT);
txbd0 = txbd;
for (i = 0; i < last_frag; i++) {
frag = &skb_shinfo(skb)->frags[i];
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 48c91cdf1975..c5dd341e7d95 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2820,6 +2820,24 @@ static inline u32 bnxt_tx_avail(struct bnxt *bp,
return bp->tx_ring_size - (used & bp->tx_ring_mask);
}
+static inline struct tx_bd_ext *
+bnxt_init_ext_bd(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
+ u16 prod, __le32 lflags, u32 vlan_tag_flags,
+ u32 cfa_action)
+{
+ struct tx_bd_ext *txbd1;
+
+ txbd1 = (struct tx_bd_ext *)
+ &txr->tx_desc_ring[TX_RING(bp, prod)][TX_IDX(prod)];
+ txbd1->tx_bd_hsize_lflags = lflags;
+ txbd1->tx_bd_mss = 0;
+ txbd1->tx_bd_cfa_meta = cpu_to_le32(vlan_tag_flags);
+ txbd1->tx_bd_cfa_action =
+ cpu_to_le32(cfa_action << TX_BD_CFA_ACTION_SHIFT);
+
+ return txbd1;
+}
+
static inline void bnxt_writeq(struct bnxt *bp, u64 val,
volatile void __iomem *addr)
{
--
2.52.0
* [RFC net-next 05/10] net: bnxt: Use dma_unmap_len for TX completion unmapping
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (3 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 04/10] net: bnxt: Add a helper for tx_bd_ext Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 06/10] net: bnxt: Add TX inline buffer infrastructure Joe Damato
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Store the DMA mapping length in each TX buffer descriptor via
dma_unmap_len_set at submit time, and use dma_unmap_len at completion
time.
This is a no-op for normal packets but prepares for software USO,
where header BDs set dma_unmap_len to 0 because the header buffer
is unmapped collectively rather than per-segment.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 51 ++++++++++++++---------
1 file changed, 32 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c52db7135ded..b801daf3f328 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -657,6 +657,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
goto tx_free;
dma_unmap_addr_set(tx_buf, mapping, mapping);
+ dma_unmap_len_set(tx_buf, len, len);
flags = (len << TX_BD_LEN_SHIFT) | TX_BD_TYPE_LONG_TX_BD |
TX_BD_CNT(last_frag + 2);
@@ -721,6 +722,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
tx_buf = &txr->tx_buf_ring[RING_TX(bp, prod)];
netmem_dma_unmap_addr_set(skb_frag_netmem(frag), tx_buf,
mapping, mapping);
+ dma_unmap_len_set(tx_buf, len, len);
txbd->tx_bd_haddr = cpu_to_le64(mapping);
@@ -810,7 +812,8 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
u16 hw_cons = txr->tx_hw_cons;
unsigned int tx_bytes = 0;
u16 cons = txr->tx_cons;
- skb_frag_t *frag;
+ unsigned int dma_len;
+ dma_addr_t dma_addr;
int tx_pkts = 0;
bool rc = false;
@@ -845,19 +848,27 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
goto next_tx_int;
}
- dma_unmap_single(&pdev->dev, dma_unmap_addr(tx_buf, mapping),
- skb_headlen(skb), DMA_TO_DEVICE);
+ if (dma_unmap_len(tx_buf, len)) {
+ dma_addr = dma_unmap_addr(tx_buf, mapping);
+ dma_len = dma_unmap_len(tx_buf, len);
+
+ dma_unmap_single(&pdev->dev, dma_addr, dma_len,
+ DMA_TO_DEVICE);
+ }
+
last = tx_buf->nr_frags;
for (j = 0; j < last; j++) {
- frag = &skb_shinfo(skb)->frags[j];
cons = NEXT_TX(cons);
tx_buf = &txr->tx_buf_ring[RING_TX(bp, cons)];
- netmem_dma_unmap_page_attrs(&pdev->dev,
- dma_unmap_addr(tx_buf,
- mapping),
- skb_frag_size(frag),
- DMA_TO_DEVICE, 0);
+ if (dma_unmap_len(tx_buf, len)) {
+ dma_addr = dma_unmap_addr(tx_buf, mapping);
+ dma_len = dma_unmap_len(tx_buf, len);
+
+ netmem_dma_unmap_page_attrs(&pdev->dev,
+ dma_addr, dma_len,
+ DMA_TO_DEVICE, 0);
+ }
}
if (unlikely(is_ts_pkt)) {
if (BNXT_CHIP_P5(bp)) {
@@ -3436,23 +3447,25 @@ static void bnxt_free_one_tx_ring_skbs(struct bnxt *bp,
continue;
}
- dma_unmap_single(&pdev->dev,
- dma_unmap_addr(tx_buf, mapping),
- skb_headlen(skb),
- DMA_TO_DEVICE);
+ if (dma_unmap_len(tx_buf, len))
+ dma_unmap_single(&pdev->dev,
+ dma_unmap_addr(tx_buf, mapping),
+ dma_unmap_len(tx_buf, len),
+ DMA_TO_DEVICE);
last = tx_buf->nr_frags;
i += 2;
for (j = 0; j < last; j++, i++) {
int ring_idx = i & bp->tx_ring_mask;
- skb_frag_t *frag = &skb_shinfo(skb)->frags[j];
tx_buf = &txr->tx_buf_ring[ring_idx];
- netmem_dma_unmap_page_attrs(&pdev->dev,
- dma_unmap_addr(tx_buf,
- mapping),
- skb_frag_size(frag),
- DMA_TO_DEVICE, 0);
+ if (dma_unmap_len(tx_buf, len))
+ netmem_dma_unmap_page_attrs(&pdev->dev,
+ dma_unmap_addr(tx_buf,
+ mapping),
+ dma_unmap_len(tx_buf,
+ len),
+ DMA_TO_DEVICE, 0);
}
dev_kfree_skb(skb);
}
--
2.52.0
* [RFC net-next 06/10] net: bnxt: Add TX inline buffer infrastructure
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (4 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 05/10] net: bnxt: Use dma_unmap_len for TX completion unmapping Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 07/10] net: bnxt: Add boilerplate GSO code Joe Damato
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Add per-ring pre-allocated inline buffer fields (tx_inline_buf,
tx_inline_dma, tx_inline_size) to bnxt_tx_ring_info and helpers to
allocate and free them.
The inline buffer will be used by the SW USO path for pre-allocated,
pre-DMA-mapped per-segment header copies. In the future, this
could be extended to support TX copybreak.
The allocation helper is marked __maybe_unused in this commit because it
will be wired up later in the series.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 35 +++++++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 +++
2 files changed, 39 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b801daf3f328..906e842d9c53 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3962,6 +3962,39 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
return rc;
}
+static void bnxt_free_tx_inline_buf(struct bnxt_tx_ring_info *txr,
+ struct pci_dev *pdev)
+{
+ if (!txr->tx_inline_buf)
+ return;
+
+ dma_unmap_single(&pdev->dev, txr->tx_inline_dma,
+ txr->tx_inline_size, DMA_TO_DEVICE);
+ kfree(txr->tx_inline_buf);
+ txr->tx_inline_buf = NULL;
+ txr->tx_inline_size = 0;
+}
+
+static int __maybe_unused bnxt_alloc_tx_inline_buf(struct bnxt_tx_ring_info *txr,
+ struct pci_dev *pdev,
+ unsigned int size)
+{
+ txr->tx_inline_buf = kmalloc(size, GFP_KERNEL);
+ if (!txr->tx_inline_buf)
+ return -ENOMEM;
+
+ txr->tx_inline_dma = dma_map_single(&pdev->dev, txr->tx_inline_buf,
+ size, DMA_TO_DEVICE);
+ if (dma_mapping_error(&pdev->dev, txr->tx_inline_dma)) {
+ kfree(txr->tx_inline_buf);
+ txr->tx_inline_buf = NULL;
+ return -ENOMEM;
+ }
+ txr->tx_inline_size = size;
+
+ return 0;
+}
+
static void bnxt_free_tx_rings(struct bnxt *bp)
{
int i;
@@ -3980,6 +4013,8 @@ static void bnxt_free_tx_rings(struct bnxt *bp)
txr->tx_push = NULL;
}
+ bnxt_free_tx_inline_buf(txr, pdev);
+
ring = &txr->tx_ring_struct;
bnxt_free_ring(bp, &ring->ring_mem);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index c5dd341e7d95..41d933a4c282 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -992,6 +992,10 @@ struct bnxt_tx_ring_info {
dma_addr_t tx_push_mapping;
__le64 data_mapping;
+ void *tx_inline_buf;
+ dma_addr_t tx_inline_dma;
+ unsigned int tx_inline_size;
+
#define BNXT_DEV_STATE_CLOSING 0x1
u32 dev_state;
--
2.52.0
* [RFC net-next 07/10] net: bnxt: Add boilerplate GSO code
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (5 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 06/10] net: bnxt: Add TX inline buffer infrastructure Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 08/10] net: bnxt: Implement software USO Joe Damato
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Richard Cochran,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend, Stanislav Fomichev
Cc: linux-kernel, Joe Damato, bpf
Add bnxt_gso.c and bnxt_gso.h with a stub bnxt_sw_udp_gso_xmit()
function, SW USO constants (BNXT_SW_USO_MAX_SEGS,
BNXT_SW_USO_MAX_DESCS), and the is_sw_gso field in bnxt_sw_tx_bd
with BNXT_SW_GSO_MID/LAST markers.
The full SW USO implementation will be added in a future commit.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/Makefile | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 +++
drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c | 30 ++++++++++++++++++
drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h | 31 +++++++++++++++++++
4 files changed, 66 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h
diff --git a/drivers/net/ethernet/broadcom/bnxt/Makefile b/drivers/net/ethernet/broadcom/bnxt/Makefile
index ba6c239d52fa..debef78c8b6d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/Makefile
+++ b/drivers/net/ethernet/broadcom/bnxt/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_BNXT) += bnxt_en.o
-bnxt_en-y := bnxt.o bnxt_hwrm.o bnxt_sriov.o bnxt_ethtool.o bnxt_dcb.o bnxt_ulp.o bnxt_xdp.o bnxt_ptp.o bnxt_vfr.o bnxt_devlink.o bnxt_dim.o bnxt_coredump.o
+bnxt_en-y := bnxt.o bnxt_hwrm.o bnxt_sriov.o bnxt_ethtool.o bnxt_dcb.o bnxt_ulp.o bnxt_xdp.o bnxt_ptp.o bnxt_vfr.o bnxt_devlink.o bnxt_dim.o bnxt_coredump.o bnxt_gso.o
bnxt_en-$(CONFIG_BNXT_FLOWER_OFFLOAD) += bnxt_tc.o
bnxt_en-$(CONFIG_DEBUG_FS) += bnxt_debugfs.o
bnxt_en-$(CONFIG_BNXT_HWMON) += bnxt_hwmon.o
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 41d933a4c282..99c625b954d5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -889,6 +889,7 @@ struct bnxt_sw_tx_bd {
u8 is_ts_pkt;
u8 is_push;
u8 action;
+ u8 is_sw_gso;
unsigned short nr_frags;
union {
u16 rx_prod;
@@ -896,6 +897,9 @@ struct bnxt_sw_tx_bd {
};
};
+#define BNXT_SW_GSO_MID 1
+#define BNXT_SW_GSO_LAST 2
+
struct bnxt_sw_rx_bd {
void *data;
u8 *data_ptr;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
new file mode 100644
index 000000000000..b296769ee4fe
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/pci.h>
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+#include <net/netdev_queues.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+#include <net/udp.h>
+#include <net/tso.h>
+#include <linux/bnxt/hsi.h>
+
+#include "bnxt.h"
+#include "bnxt_gso.h"
+
+netdev_tx_t bnxt_sw_udp_gso_xmit(struct bnxt *bp,
+ struct bnxt_tx_ring_info *txr,
+ struct netdev_queue *txq,
+ struct sk_buff *skb)
+{
+ dev_kfree_skb_any(skb);
+ dev_core_stats_tx_dropped_inc(bp->dev);
+ return NETDEV_TX_OK;
+}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h
new file mode 100644
index 000000000000..f01e8102dcd7
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Broadcom NetXtreme-C/E network driver.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef BNXT_GSO_H
+#define BNXT_GSO_H
+
+/* Maximum segments the stack may send in a single SW USO skb.
+ * This caps gso_max_segs for NICs without HW USO support.
+ */
+#define BNXT_SW_USO_MAX_SEGS 64
+
+/* Worst-case TX descriptors consumed by one SW USO packet:
+ * Each segment: 1 long BD + 1 ext BD + payload BDs.
+ * Total payload BDs across all segs <= num_segs + nr_frags (each frag
+ * boundary crossing adds at most 1 extra BD).
+ * So: 3 * max_segs + MAX_SKB_FRAGS + 1 = 3 * 64 + 17 + 1 = 210.
+ */
+#define BNXT_SW_USO_MAX_DESCS (3 * BNXT_SW_USO_MAX_SEGS + MAX_SKB_FRAGS + 1)
+
+netdev_tx_t bnxt_sw_udp_gso_xmit(struct bnxt *bp,
+ struct bnxt_tx_ring_info *txr,
+ struct netdev_queue *txq,
+ struct sk_buff *skb);
+
+#endif
--
2.52.0
* [RFC net-next 08/10] net: bnxt: Implement software USO
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (6 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 07/10] net: bnxt: Add boilerplate GSO code Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 09/10] net: bnxt: Add SW GSO completion and teardown support Joe Damato
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Implement bnxt_sw_udp_gso_xmit() using the core tso_dma_map API and
the pre-allocated TX inline buffer for per-segment headers.
The xmit path:
1. Calls tso_start() to initialize the TSO state.
2. Stack-allocates a tso_dma_map and calls tso_dma_map_init() to
   DMA-map the linear payload and all frags upfront.
3. For each segment:
   - Copies and patches headers via tso_build_hdr() into the
     pre-allocated tx_inline_buf (DMA-synced per segment).
   - Counts payload BDs via tso_dma_map_count().
   - Emits a long BD (header) + an ext BD + payload BDs.
   - Payload BDs use tso_dma_map_next(), which yields (dma_addr,
     chunk_len, mapping_len) tuples; mapping_len is set as the
     dma_unmap_len on the first BD of each DMA mapping so the
     completion path can unmap per-BD.
Header BDs set dma_unmap_len=0 since the inline buffer is pre-allocated
and unmapped only at ring teardown.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c | 158 ++++++++++++++++++
1 file changed, 158 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
index b296769ee4fe..fe1f791681e1 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
@@ -19,11 +19,169 @@
#include "bnxt.h"
#include "bnxt_gso.h"
+static u32 bnxt_sw_gso_lhint(unsigned int len)
+{
+ if (len <= 512)
+ return TX_BD_FLAGS_LHINT_512_AND_SMALLER;
+ else if (len <= 1023)
+ return TX_BD_FLAGS_LHINT_512_TO_1023;
+ else if (len <= 2047)
+ return TX_BD_FLAGS_LHINT_1024_TO_2047;
+ else
+ return TX_BD_FLAGS_LHINT_2048_AND_LARGER;
+}
+
netdev_tx_t bnxt_sw_udp_gso_xmit(struct bnxt *bp,
struct bnxt_tx_ring_info *txr,
struct netdev_queue *txq,
struct sk_buff *skb)
{
+ unsigned int hdr_len, mss, num_segs;
+ struct pci_dev *pdev = bp->pdev;
+ unsigned int total_payload;
+ struct tso_dma_map map;
+ u32 vlan_tag_flags = 0;
+ int i, bds_needed;
+ struct tso_t tso;
+ u16 cfa_action;
+ u16 prod;
+
+ hdr_len = tso_start(skb, &tso);
+ mss = skb_shinfo(skb)->gso_size;
+ total_payload = skb->len - hdr_len;
+ num_segs = DIV_ROUND_UP(total_payload, mss);
+
+ /* Zero the csum fields so tso_build_hdr will propagate zeroes into
+ * every segment header. HW csum offload will recompute from scratch.
+ */
+ udp_hdr(skb)->check = 0;
+ if (!tso.ipv6)
+ ip_hdr(skb)->check = 0;
+
+	if (unlikely(num_segs <= 1))
+		goto drop;
+
+	/* Upper bound on the number of descriptors needed.
+	 *
+	 * Each segment uses 1 long BD + 1 ext BD + payload BDs; the total
+	 * payload BD count across all segments is at most num_segs +
+	 * nr_frags + 1 (each frag boundary crossing adds at most 1 BD).
+	 */
+ bds_needed = 3 * num_segs + skb_shinfo(skb)->nr_frags + 1;
+
+ if (unlikely(bnxt_tx_avail(bp, txr) < bds_needed)) {
+ netif_txq_try_stop(txq, bnxt_tx_avail(bp, txr),
+ bp->tx_wake_thresh);
+ return NETDEV_TX_BUSY;
+ }
+
+ if (unlikely(tso_dma_map_init(&map, &pdev->dev, skb, hdr_len)))
+ goto drop;
+
+ cfa_action = bnxt_xmit_get_cfa_action(skb);
+ if (skb_vlan_tag_present(skb)) {
+ vlan_tag_flags = TX_BD_CFA_META_KEY_VLAN |
+ skb_vlan_tag_get(skb);
+ if (skb->vlan_proto == htons(ETH_P_8021Q))
+ vlan_tag_flags |= 1 << TX_BD_CFA_META_TPID_SHIFT;
+ }
+
+ prod = txr->tx_prod;
+
+ for (i = 0; i < num_segs; i++) {
+ unsigned int seg_payload = min_t(unsigned int, mss,
+ total_payload - i * mss);
+ dma_addr_t this_hdr_dma = txr->tx_inline_dma + i * hdr_len;
+ void *this_hdr = txr->tx_inline_buf + i * hdr_len;
+ struct bnxt_sw_tx_bd *tx_buf;
+ unsigned int mapping_len;
+ unsigned int chunk_len;
+ dma_addr_t dma_addr;
+ struct tx_bd *txbd;
+ int bd_count;
+ __le32 csum;
+ bool last;
+ u32 flags;
+
+ last = (i == num_segs - 1);
+
+ tso_build_hdr(skb, this_hdr, &tso, seg_payload, last);
+
+ dma_sync_single_for_device(&pdev->dev, this_hdr_dma,
+ hdr_len, DMA_TO_DEVICE);
+
+ bd_count = tso_dma_map_count(&map, seg_payload);
+
+ tx_buf = &txr->tx_buf_ring[RING_TX(bp, prod)];
+ txbd = &txr->tx_desc_ring[TX_RING(bp, prod)][TX_IDX(prod)];
+
+ tx_buf->skb = skb;
+ tx_buf->nr_frags = bd_count;
+ tx_buf->is_push = 0;
+ tx_buf->is_ts_pkt = 0;
+
+ dma_unmap_addr_set(tx_buf, mapping, this_hdr_dma);
+ dma_unmap_len_set(tx_buf, len, 0);
+
+ tx_buf->is_sw_gso = last ? BNXT_SW_GSO_LAST : BNXT_SW_GSO_MID;
+
+ flags = (hdr_len << TX_BD_LEN_SHIFT) |
+ TX_BD_TYPE_LONG_TX_BD |
+ TX_BD_CNT(2 + bd_count);
+
+ flags |= bnxt_sw_gso_lhint(hdr_len + seg_payload);
+
+ txbd->tx_bd_len_flags_type = cpu_to_le32(flags);
+ txbd->tx_bd_haddr = cpu_to_le64(this_hdr_dma);
+ txbd->tx_bd_opaque = SET_TX_OPAQUE(bp, txr, prod,
+ 2 + bd_count);
+
+ csum = cpu_to_le32(TX_BD_FLAGS_TCP_UDP_CHKSUM |
+ TX_BD_FLAGS_IP_CKSUM);
+
+ prod = NEXT_TX(prod);
+ bnxt_init_ext_bd(bp, txr, prod, csum,
+ vlan_tag_flags, cfa_action);
+
+ while (tso_dma_map_next(&map, &dma_addr, &chunk_len,
+ &mapping_len, seg_payload)) {
+ prod = NEXT_TX(prod);
+ txbd = &txr->tx_desc_ring[TX_RING(bp, prod)][TX_IDX(prod)];
+ tx_buf = &txr->tx_buf_ring[RING_TX(bp, prod)];
+
+ txbd->tx_bd_haddr = cpu_to_le64(dma_addr);
+ dma_unmap_addr_set(tx_buf, mapping, dma_addr);
+ dma_unmap_len_set(tx_buf, len, mapping_len);
+ tx_buf->skb = NULL;
+ tx_buf->is_sw_gso = 0;
+
+ flags = chunk_len << TX_BD_LEN_SHIFT;
+ txbd->tx_bd_len_flags_type = cpu_to_le32(flags);
+ txbd->tx_bd_opaque = 0;
+
+ seg_payload -= chunk_len;
+ }
+
+ txbd->tx_bd_len_flags_type |=
+ cpu_to_le32(TX_BD_FLAGS_PACKET_END);
+
+ prod = NEXT_TX(prod);
+ }
+
+ netdev_tx_sent_queue(txq, skb->len);
+
+ WRITE_ONCE(txr->tx_prod, prod);
+ /* Sync BDs before doorbell */
+ wmb();
+ bnxt_db_write(bp, &txr->tx_db, prod);
+
+ if (unlikely(bnxt_tx_avail(bp, txr) <= bp->tx_wake_thresh))
+ netif_txq_try_stop(txq, bnxt_tx_avail(bp, txr),
+ bp->tx_wake_thresh);
+
+ return NETDEV_TX_OK;
+
+drop:
dev_kfree_skb_any(skb);
dev_core_stats_tx_dropped_inc(bp->dev);
return NETDEV_TX_OK;
--
2.52.0
* [RFC net-next 09/10] net: bnxt: Add SW GSO completion and teardown support
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (7 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 08/10] net: bnxt: Implement software USO Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 21:21 ` [RFC net-next 10/10] net: bnxt: Dispatch to SW USO Joe Damato
2026-03-10 22:04 ` [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Update __bnxt_tx_int and bnxt_free_one_tx_ring_skbs to handle SW GSO
segments:
- MID segments: adjust tx_pkts/tx_bytes accounting and skip skb free
(the skb is shared across all segments and freed only once)
- LAST segments: no special cleanup needed -- payload DMA unmapping is
handled by the existing per-BD dma_unmap_len walk, and the header
inline buffer is pre-allocated per-ring (freed at ring teardown)
is_sw_gso is initialized to zero, so the new code paths are not taken
for packets that did not go through the SW GSO xmit path.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 63 ++++++++++++++++---
.../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 19 +++++-
2 files changed, 72 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 906e842d9c53..47dc98479066 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -74,6 +74,8 @@
#include "bnxt_debugfs.h"
#include "bnxt_coredump.h"
#include "bnxt_hwmon.h"
+#include "bnxt_gso.h"
+#include <net/tso.h>
#define BNXT_TX_TIMEOUT (5 * HZ)
#define BNXT_DEF_MSG_ENABLE (NETIF_MSG_DRV | NETIF_MSG_HW | \
@@ -818,12 +820,13 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
bool rc = false;
while (RING_TX(bp, cons) != hw_cons) {
- struct bnxt_sw_tx_bd *tx_buf;
+ struct bnxt_sw_tx_bd *tx_buf, *head_buf;
struct sk_buff *skb;
bool is_ts_pkt;
int j, last;
tx_buf = &txr->tx_buf_ring[RING_TX(bp, cons)];
+ head_buf = tx_buf;
skb = tx_buf->skb;
if (unlikely(!skb)) {
@@ -870,6 +873,14 @@ static bool __bnxt_tx_int(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
DMA_TO_DEVICE, 0);
}
}
+ if (unlikely(head_buf->is_sw_gso)) {
+ if (head_buf->is_sw_gso == BNXT_SW_GSO_MID) {
+ tx_pkts--;
+ tx_bytes -= skb->len;
+ skb = NULL;
+ }
+ head_buf->is_sw_gso = 0;
+ }
if (unlikely(is_ts_pkt)) {
if (BNXT_CHIP_P5(bp)) {
/* PTP worker takes ownership of the skb */
@@ -3417,6 +3428,7 @@ static void bnxt_free_one_tx_ring_skbs(struct bnxt *bp,
for (i = 0; i < max_idx;) {
struct bnxt_sw_tx_bd *tx_buf = &txr->tx_buf_ring[i];
+ struct bnxt_sw_tx_bd *head_buf = tx_buf;
struct sk_buff *skb;
int j, last;
@@ -3467,7 +3479,10 @@ static void bnxt_free_one_tx_ring_skbs(struct bnxt *bp,
len),
DMA_TO_DEVICE, 0);
}
- dev_kfree_skb(skb);
+ if (head_buf->is_sw_gso == BNXT_SW_GSO_MID)
+ skb = NULL;
+ if (skb)
+ dev_kfree_skb(skb);
}
netdev_tx_reset_queue(netdev_get_tx_queue(bp->dev, idx));
}
@@ -3975,9 +3990,9 @@ static void bnxt_free_tx_inline_buf(struct bnxt_tx_ring_info *txr,
txr->tx_inline_size = 0;
}
-static int __maybe_unused bnxt_alloc_tx_inline_buf(struct bnxt_tx_ring_info *txr,
- struct pci_dev *pdev,
- unsigned int size)
+static int bnxt_alloc_tx_inline_buf(struct bnxt_tx_ring_info *txr,
+ struct pci_dev *pdev,
+ unsigned int size)
{
txr->tx_inline_buf = kmalloc(size, GFP_KERNEL);
if (!txr->tx_inline_buf)
@@ -4080,6 +4095,14 @@ static int bnxt_alloc_tx_rings(struct bnxt *bp)
sizeof(struct tx_push_bd);
txr->data_mapping = cpu_to_le64(mapping);
}
+ if (!(bp->flags & BNXT_FLAG_UDP_GSO_CAP) &&
+ (bp->dev->features & NETIF_F_GSO_UDP_L4)) {
+ rc = bnxt_alloc_tx_inline_buf(txr, pdev,
+ BNXT_SW_USO_MAX_SEGS *
+ TSO_HEADER_SIZE);
+ if (rc)
+ return rc;
+ }
qidx = bp->tc_to_qidx[j];
ring->queue_id = bp->q_info[qidx].queue_id;
spin_lock_init(&txr->xdp_tx_lock);
@@ -4611,6 +4634,10 @@ static int bnxt_init_tx_rings(struct bnxt *bp)
bp->tx_wake_thresh = max_t(int, bp->tx_ring_size / 2,
BNXT_MIN_TX_DESC_CNT);
+ if (!(bp->flags & BNXT_FLAG_UDP_GSO_CAP) &&
+ (bp->dev->features & NETIF_F_GSO_UDP_L4))
+ bp->tx_wake_thresh = max_t(int, bp->tx_wake_thresh,
+ BNXT_SW_USO_MAX_DESCS);
for (i = 0; i < bp->tx_nr_rings; i++) {
struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
@@ -13778,6 +13805,11 @@ static netdev_features_t bnxt_fix_features(struct net_device *dev,
if ((features & NETIF_F_NTUPLE) && !bnxt_rfs_capable(bp, false))
features &= ~NETIF_F_NTUPLE;
+ if ((features & NETIF_F_GSO_UDP_L4) &&
+ !(bp->flags & BNXT_FLAG_UDP_GSO_CAP) &&
+ bp->tx_ring_size < 2 * BNXT_SW_USO_MAX_DESCS)
+ features &= ~NETIF_F_GSO_UDP_L4;
+
if ((bp->flags & BNXT_FLAG_NO_AGG_RINGS) || bp->xdp_prog)
features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
@@ -13823,6 +13855,15 @@ static int bnxt_set_features(struct net_device *dev, netdev_features_t features)
int rc = 0;
bool re_init = false;
+ if (!(bp->flags & BNXT_FLAG_UDP_GSO_CAP)) {
+ if (features & NETIF_F_GSO_UDP_L4)
+ bp->tx_wake_thresh = max_t(int, bp->tx_wake_thresh,
+ BNXT_SW_USO_MAX_DESCS);
+ else
+ bp->tx_wake_thresh = max_t(int, bp->tx_ring_size / 2,
+ BNXT_MIN_TX_DESC_CNT);
+ }
+
flags &= ~BNXT_FLAG_ALL_CONFIG_FEATS;
if (features & NETIF_F_GRO_HW)
flags |= BNXT_FLAG_GRO;
@@ -16803,8 +16844,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
NETIF_F_RXCSUM | NETIF_F_GRO;
- if (bp->flags & BNXT_FLAG_UDP_GSO_CAP)
- dev->hw_features |= NETIF_F_GSO_UDP_L4;
+ dev->hw_features |= NETIF_F_GSO_UDP_L4;
if (BNXT_SUPPORTS_TPA(bp))
dev->hw_features |= NETIF_F_LRO;
@@ -16837,8 +16877,15 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dev->priv_flags |= IFF_UNICAST_FLT;
netif_set_tso_max_size(dev, GSO_MAX_SIZE);
- if (bp->tso_max_segs)
+ if (!(bp->flags & BNXT_FLAG_UDP_GSO_CAP)) {
+ u16 max_segs = BNXT_SW_USO_MAX_SEGS;
+
+ if (bp->tso_max_segs)
+ max_segs = min_t(u16, max_segs, bp->tso_max_segs);
+ netif_set_tso_max_segs(dev, max_segs);
+ } else if (bp->tso_max_segs) {
netif_set_tso_max_segs(dev, bp->tso_max_segs);
+ }
dev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
NETDEV_XDP_ACT_RX_SG;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 3ce092bc8bba..50a4736f512a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -33,6 +33,7 @@
#include "bnxt_xdp.h"
#include "bnxt_ptp.h"
#include "bnxt_ethtool.h"
+#include "bnxt_gso.h"
#include "bnxt_nvm_defs.h" /* NVRAM content constant and structure defs */
#include "bnxt_fw_hdr.h" /* Firmware hdr constant and structure defs */
#include "bnxt_coredump.h"
@@ -846,12 +847,18 @@ static int bnxt_set_ringparam(struct net_device *dev,
u8 tcp_data_split = kernel_ering->tcp_data_split;
struct bnxt *bp = netdev_priv(dev);
u8 hds_config_mod;
+ int rc;
if ((ering->rx_pending > BNXT_MAX_RX_DESC_CNT) ||
(ering->tx_pending > BNXT_MAX_TX_DESC_CNT) ||
(ering->tx_pending < BNXT_MIN_TX_DESC_CNT))
return -EINVAL;
+ if ((dev->features & NETIF_F_GSO_UDP_L4) &&
+ !(bp->flags & BNXT_FLAG_UDP_GSO_CAP) &&
+ ering->tx_pending < 2 * BNXT_SW_USO_MAX_DESCS)
+ return -EINVAL;
+
hds_config_mod = tcp_data_split != dev->cfg->hds_config;
if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_DISABLED && hds_config_mod)
return -EINVAL;
@@ -876,9 +883,17 @@ static int bnxt_set_ringparam(struct net_device *dev,
bp->tx_ring_size = ering->tx_pending;
bnxt_set_ring_params(bp);
- if (netif_running(dev))
- return bnxt_open_nic(bp, false, false);
+ if (netif_running(dev)) {
+ rc = bnxt_open_nic(bp, false, false);
+ if (rc)
+ return rc;
+ }
+ /* ring size changes may affect features (SW USO requires a minimum
+ * ring size), so recalculate features to ensure the correct features
+ * are blocked/available.
+ */
+ netdev_update_features(dev);
return 0;
}
--
2.52.0
* [RFC net-next 10/10] net: bnxt: Dispatch to SW USO
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (8 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 09/10] net: bnxt: Add SW GSO completion and teardown support Joe Damato
@ 2026-03-10 21:21 ` Joe Damato
2026-03-10 22:04 ` [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 21:21 UTC (permalink / raw)
To: netdev, Michael Chan, Pavan Chebbi, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: linux-kernel, Joe Damato
Wire in the SW USO path added in the preceding commits for cases where
hardware USO is not available.
When a GSO skb with SKB_GSO_UDP_L4 arrives and the NIC lacks HW USO
capability, redirect to bnxt_sw_udp_gso_xmit() which handles software
segmentation into individual UDP frames submitted directly to the TX
ring.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 47dc98479066..72d66043096a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -506,6 +506,11 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
}
}
#endif
+ if (skb_is_gso(skb) &&
+ (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) &&
+ !(bp->flags & BNXT_FLAG_UDP_GSO_CAP))
+ return bnxt_sw_udp_gso_xmit(bp, txr, txq, skb);
+
free_size = bnxt_tx_avail(bp, txr);
if (unlikely(free_size < skb_shinfo(skb)->nr_frags + 2)) {
/* We must have raced with NAPI cleanup */
--
2.52.0
* Re: [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support
2026-03-10 21:21 [RFC net-next 00/10] Add TSO map-once DMA helpers and bnxt SW USO support Joe Damato
` (9 preceding siblings ...)
2026-03-10 21:21 ` [RFC net-next 10/10] net: bnxt: Dispatch to SW USO Joe Damato
@ 2026-03-10 22:04 ` Joe Damato
10 siblings, 0 replies; 12+ messages in thread
From: Joe Damato @ 2026-03-10 22:04 UTC (permalink / raw)
To: netdev
Cc: michael.chan, pavan.chebbi, linux-kernel, Alexei Starovoitov,
Andrew Lunn, bpf, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend,
Paolo Abeni, Richard Cochran, Simon Horman, Stanislav Fomichev
On Tue, Mar 10, 2026 at 02:21:48PM -0700, Joe Damato wrote:
> Greetings:
>
> This series extends net/tso to add a data structure and some helpers allowing
> drivers to DMA map headers and packet payloads a single time. The helpers can
> then be used to reference slices of shared mapping for each segment. This
> helps to avoid the cost of repeated DMA mappings, especially on systems which
> use an IOMMU. N per-packet DMA maps are replaced with a single map for the
> entire GSO skb.
>
> The added helpers are then used in bnxt to add support for software UDP
> Segmentation Offloading (SW USO) for older bnxt devices which do not have
> support for USO in hardware. Since the helpers are generic, other drivers
> can be extended similarly.
Sorry for the noise; just realized this implementation is buggy.
Will fix and send an RFC v2, but this series is the general idea of what I'll be
posting, for anyone interested.