* [PATCH net-next 0/9] eth: fbnic: Add XDP support for fbnic
@ 2025-07-23 14:59 Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 1/9] eth: fbnic: Add support for HDS configuration Mohsin Bashir
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
This patch series introduces basic XDP support for fbnic. To enable this,
it includes preparatory changes such as making the HDS threshold
configurable via ethtool, updating headroom for fbnic, tracking
frag state in shinfo, and prefetching the first cacheline of data.
Mohsin Bashir (9):
eth: fbnic: Add support for HDS configuration
eth: fbnic: Update Headroom
eth: fbnic: Use shinfo to track frags state on Rx
eth: fbnic: Prefetch packet headers on Rx
eth: fbnic: Add XDP pass, drop, abort support
eth: fbnic: Add support for XDP queues
eth: fbnic: Add support for XDP_TX action
eth: fbnic: Collect packet statistics for XDP
eth: fbnic: Report XDP stats via ethtool
.../device_drivers/ethernet/meta/fbnic.rst | 10 +
.../net/ethernet/meta/fbnic/fbnic_ethtool.c | 82 +++-
.../net/ethernet/meta/fbnic/fbnic_netdev.c | 69 +++
.../net/ethernet/meta/fbnic/fbnic_netdev.h | 9 +-
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 441 +++++++++++++++---
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 22 +-
6 files changed, 555 insertions(+), 78 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH net-next 1/9] eth: fbnic: Add support for HDS configuration
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 2/9] eth: fbnic: Update Headroom Mohsin Bashir
` (7 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Add support for configuring the header data split (HDS) threshold.
For fbnic, TCP data split support is always enabled.
Fbnic supports a maximum buffer size of 4KB. However, the reservations
for headroom, tailroom, and padding reduce the maximum header size
accordingly.
ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
...
HDS thresh: 3584
Current hardware settings:
...
HDS thresh: 1536
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
.../net/ethernet/meta/fbnic/fbnic_ethtool.c | 21 ++++++++++++++++---
.../net/ethernet/meta/fbnic/fbnic_netdev.c | 4 ++++
.../net/ethernet/meta/fbnic/fbnic_netdev.h | 2 ++
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 2 +-
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 6 +++++-
5 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
index 588da02d6e22..84a0db9f1be0 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
@@ -2,6 +2,7 @@
/* Copyright (c) Meta Platforms, Inc. and affiliates. */
#include <linux/ethtool.h>
+#include <linux/ethtool_netlink.h>
#include <linux/netdevice.h>
#include <linux/pci.h>
#include <net/ipv6.h>
@@ -160,6 +161,7 @@ static void fbnic_clone_swap_cfg(struct fbnic_net *orig,
swap(clone->num_rx_queues, orig->num_rx_queues);
swap(clone->num_tx_queues, orig->num_tx_queues);
swap(clone->num_napi, orig->num_napi);
+ swap(clone->hds_thresh, orig->hds_thresh);
}
static void fbnic_aggregate_vector_counters(struct fbnic_net *fbn,
@@ -277,15 +279,21 @@ fbnic_get_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
ring->rx_mini_pending = fbn->hpq_size;
ring->rx_jumbo_pending = fbn->ppq_size;
ring->tx_pending = fbn->txq_size;
+
+ kernel_ring->tcp_data_split = ETHTOOL_TCP_DATA_SPLIT_ENABLED;
+ kernel_ring->hds_thresh_max = FBNIC_HDS_THRESH_MAX;
+ kernel_ring->hds_thresh = fbn->hds_thresh;
}
static void fbnic_set_rings(struct fbnic_net *fbn,
- struct ethtool_ringparam *ring)
+ struct ethtool_ringparam *ring,
+ struct kernel_ethtool_ringparam *kernel_ring)
{
fbn->rcq_size = ring->rx_pending;
fbn->hpq_size = ring->rx_mini_pending;
fbn->ppq_size = ring->rx_jumbo_pending;
fbn->txq_size = ring->tx_pending;
+ fbn->hds_thresh = kernel_ring->hds_thresh;
}
static int
@@ -316,8 +324,13 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
return -EINVAL;
}
+ if (kernel_ring->tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_DISABLED) {
+ NL_SET_ERR_MSG_MOD(extack, "Cannot disable TCP data split");
+ return -EINVAL;
+ }
+
if (!netif_running(netdev)) {
- fbnic_set_rings(fbn, ring);
+ fbnic_set_rings(fbn, ring, kernel_ring);
return 0;
}
@@ -325,7 +338,7 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
if (!clone)
return -ENOMEM;
- fbnic_set_rings(clone, ring);
+ fbnic_set_rings(clone, ring, kernel_ring);
err = fbnic_alloc_napi_vectors(clone);
if (err)
@@ -1678,6 +1691,8 @@ fbnic_get_rmon_stats(struct net_device *netdev,
static const struct ethtool_ops fbnic_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
ETHTOOL_COALESCE_RX_MAX_FRAMES,
+ .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT |
+ ETHTOOL_RING_USE_HDS_THRS,
.rxfh_max_num_contexts = FBNIC_RPC_RSS_TBL_COUNT,
.get_drvinfo = fbnic_get_drvinfo,
.get_regs_len = fbnic_get_regs_len,
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
index 7bd7812d9c06..d039e1c7a0d5 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
@@ -691,6 +691,10 @@ struct net_device *fbnic_netdev_alloc(struct fbnic_dev *fbd)
fbn->rx_usecs = FBNIC_RX_USECS_DEFAULT;
fbn->rx_max_frames = FBNIC_RX_FRAMES_DEFAULT;
+ /* Initialize the hds_thresh */
+ netdev->cfg->hds_thresh = FBNIC_HDS_THRESH_DEFAULT;
+ fbn->hds_thresh = FBNIC_HDS_THRESH_DEFAULT;
+
default_queues = netif_get_num_default_rss_queues();
if (default_queues > fbd->max_num_queues)
default_queues = fbd->max_num_queues;
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
index 86576ae04262..04c5c7ed6c3a 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
@@ -31,6 +31,8 @@ struct fbnic_net {
u32 ppq_size;
u32 rcq_size;
+ u32 hds_thresh;
+
u16 rx_usecs;
u16 tx_usecs;
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index ac11389a764c..1fc7bd19e7a1 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -2238,7 +2238,7 @@ static void fbnic_enable_rcq(struct fbnic_napi_vector *nv,
rcq_ctl = FIELD_PREP(FBNIC_QUEUE_RDE_CTL1_PADLEN_MASK, FBNIC_RX_PAD) |
FIELD_PREP(FBNIC_QUEUE_RDE_CTL1_MAX_HDR_MASK,
- FBNIC_RX_MAX_HDR) |
+ fbn->hds_thresh) |
FIELD_PREP(FBNIC_QUEUE_RDE_CTL1_PAYLD_OFF_MASK,
FBNIC_RX_PAYLD_OFFSET) |
FIELD_PREP(FBNIC_QUEUE_RDE_CTL1_PAYLD_PG_CL_MASK,
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index 2e361d6f03ff..4d248f94ec9b 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -53,7 +53,6 @@ struct fbnic_net;
#define FBNIC_RX_HROOM \
(ALIGN(FBNIC_RX_TROOM + NET_SKB_PAD, 128) - FBNIC_RX_TROOM)
#define FBNIC_RX_PAD 0
-#define FBNIC_RX_MAX_HDR (1536 - FBNIC_RX_PAD)
#define FBNIC_RX_PAYLD_OFFSET 0
#define FBNIC_RX_PAYLD_PG_CL 0
@@ -61,6 +60,11 @@ struct fbnic_net;
#define FBNIC_RING_F_CTX BIT(1)
#define FBNIC_RING_F_STATS BIT(2) /* Ring's stats may be used */
+#define FBNIC_HDS_THRESH_MAX \
+ (4096 - FBNIC_RX_HROOM - FBNIC_RX_TROOM - FBNIC_RX_PAD)
+#define FBNIC_HDS_THRESH_DEFAULT \
+ (1536 - FBNIC_RX_PAD)
+
struct fbnic_pkt_buff {
struct xdp_buff buff;
ktime_t hwtstamp;
--
2.47.1
* [PATCH net-next 2/9] eth: fbnic: Update Headroom
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 1/9] eth: fbnic: Add support for HDS configuration Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 3/9] eth: fbnic: Use shinfo to track frags state on Rx Mohsin Bashir
` (6 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Fbnic currently reserves a minimum of 64B of headroom, which is
insufficient for inserting additional headers (e.g., IPv6) via XDP, as
only 24 bytes are available for adjustment. To address this limitation,
increase the headroom to a larger value while preserving efficient page
use.
Although the resulting headroom (192B) is smaller than the recommended
value (256B), forcing the headroom to 256B would require aligning to
256B (as opposed to the current 128B), which can push the max headroom
to 511B.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index 4d248f94ec9b..398310be592e 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -50,8 +50,9 @@ struct fbnic_net;
#define FBNIC_RX_TROOM \
SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
+#define FBNIC_RX_HROOM_PAD 128
#define FBNIC_RX_HROOM \
- (ALIGN(FBNIC_RX_TROOM + NET_SKB_PAD, 128) - FBNIC_RX_TROOM)
+ (ALIGN(FBNIC_RX_TROOM + FBNIC_RX_HROOM_PAD, 128) - FBNIC_RX_TROOM)
#define FBNIC_RX_PAD 0
#define FBNIC_RX_PAYLD_OFFSET 0
#define FBNIC_RX_PAYLD_PG_CL 0
--
2.47.1
* [PATCH net-next 3/9] eth: fbnic: Use shinfo to track frags state on Rx
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 1/9] eth: fbnic: Add support for HDS configuration Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 2/9] eth: fbnic: Update Headroom Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 4/9] eth: fbnic: Prefetch packet headers " Mohsin Bashir
` (5 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Remove local fields that track frags state and instead store this
information directly in the shinfo struct. This change is necessary
because the current implementation can lead to inaccuracies in certain
scenarios, such as when using XDP multi-buff support. Specifically, the
XDP program may update nr_frags without updating the local variables,
resulting in an inconsistent state.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 79 ++++++--------------
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 4 +-
2 files changed, 25 insertions(+), 58 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index 1fc7bd19e7a1..f5725c0972a5 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -892,9 +892,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
xdp_prepare_buff(&pkt->buff, hdr_start, headroom,
len - FBNIC_RX_PAD, true);
- pkt->data_truesize = 0;
- pkt->data_len = 0;
- pkt->nr_frags = 0;
+ pkt->add_frag_failed = false;
}
static void fbnic_add_rx_frag(struct fbnic_napi_vector *nv, u64 rcd,
@@ -905,8 +903,8 @@ static void fbnic_add_rx_frag(struct fbnic_napi_vector *nv, u64 rcd,
unsigned int pg_off = FIELD_GET(FBNIC_RCD_AL_BUFF_OFF_MASK, rcd);
unsigned int len = FIELD_GET(FBNIC_RCD_AL_BUFF_LEN_MASK, rcd);
struct page *page = fbnic_page_pool_get(&qt->sub1, pg_idx);
- struct skb_shared_info *shinfo;
unsigned int truesize;
+ bool added;
truesize = FIELD_GET(FBNIC_RCD_AL_PAGE_FIN, rcd) ?
FBNIC_BD_FRAG_SIZE - pg_off : ALIGN(len, 128);
@@ -918,34 +916,34 @@ static void fbnic_add_rx_frag(struct fbnic_napi_vector *nv, u64 rcd,
dma_sync_single_range_for_cpu(nv->dev, page_pool_get_dma_addr(page),
pg_off, truesize, DMA_BIDIRECTIONAL);
- /* Add page to xdp shared info */
- shinfo = xdp_get_shared_info_from_buff(&pkt->buff);
-
- /* We use gso_segs to store truesize */
- pkt->data_truesize += truesize;
-
- __skb_fill_page_desc_noacc(shinfo, pkt->nr_frags++, page, pg_off, len);
-
- /* Store data_len in gso_size */
- pkt->data_len += len;
+ added = xdp_buff_add_frag(&pkt->buff, page_to_netmem(page), pg_off, len,
+ truesize);
+ if (unlikely(!added)) {
+ pkt->add_frag_failed = true;
+ netdev_err_once(nv->napi.dev,
+ "Failed to add fragment to xdp_buff\n");
+ }
}
static void fbnic_put_pkt_buff(struct fbnic_napi_vector *nv,
struct fbnic_pkt_buff *pkt, int budget)
{
- struct skb_shared_info *shinfo;
struct page *page;
- int nr_frags;
if (!pkt->buff.data_hard_start)
return;
- shinfo = xdp_get_shared_info_from_buff(&pkt->buff);
- nr_frags = pkt->nr_frags;
+ if (xdp_buff_has_frags(&pkt->buff)) {
+ struct skb_shared_info *shinfo;
+ int nr_frags;
- while (nr_frags--) {
- page = skb_frag_page(&shinfo->frags[nr_frags]);
- page_pool_put_full_page(nv->page_pool, page, !!budget);
+ shinfo = xdp_get_shared_info_from_buff(&pkt->buff);
+ nr_frags = shinfo->nr_frags;
+
+ while (nr_frags--) {
+ page = skb_frag_page(&shinfo->frags[nr_frags]);
+ page_pool_put_full_page(nv->page_pool, page, !!budget);
+ }
}
page = virt_to_page(pkt->buff.data_hard_start);
@@ -955,43 +953,12 @@ static void fbnic_put_pkt_buff(struct fbnic_napi_vector *nv,
static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
struct fbnic_pkt_buff *pkt)
{
- unsigned int nr_frags = pkt->nr_frags;
- struct skb_shared_info *shinfo;
- unsigned int truesize;
struct sk_buff *skb;
- truesize = xdp_data_hard_end(&pkt->buff) + FBNIC_RX_TROOM -
- pkt->buff.data_hard_start;
-
- /* Build frame around buffer */
- skb = napi_build_skb(pkt->buff.data_hard_start, truesize);
- if (unlikely(!skb))
+ skb = xdp_build_skb_from_buff(&pkt->buff);
+ if (!skb)
return NULL;
- /* Push data pointer to start of data, put tail to end of data */
- skb_reserve(skb, pkt->buff.data - pkt->buff.data_hard_start);
- __skb_put(skb, pkt->buff.data_end - pkt->buff.data);
-
- /* Add tracking for metadata at the start of the frame */
- skb_metadata_set(skb, pkt->buff.data - pkt->buff.data_meta);
-
- /* Add Rx frags */
- if (nr_frags) {
- /* Verify that shared info didn't move */
- shinfo = xdp_get_shared_info_from_buff(&pkt->buff);
- WARN_ON(skb_shinfo(skb) != shinfo);
-
- skb->truesize += pkt->data_truesize;
- skb->data_len += pkt->data_len;
- shinfo->nr_frags = nr_frags;
- skb->len += pkt->data_len;
- }
-
- skb_mark_for_recycle(skb);
-
- /* Set MAC header specific fields */
- skb->protocol = eth_type_trans(skb, nv->napi.dev);
-
/* Add timestamp if present */
if (pkt->hwtstamp)
skb_hwtstamps(skb)->hwtstamp = pkt->hwtstamp;
@@ -1094,7 +1061,9 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
/* We currently ignore the action table index */
break;
case FBNIC_RCD_TYPE_META:
- if (likely(!fbnic_rcd_metadata_err(rcd)))
+ if (unlikely(pkt->add_frag_failed))
+ skb = NULL;
+ else if (likely(!fbnic_rcd_metadata_err(rcd)))
skb = fbnic_build_skb(nv, pkt);
/* Populate skb and invalidate XDP */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index 398310be592e..be34962c465e 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -69,9 +69,7 @@ struct fbnic_net;
struct fbnic_pkt_buff {
struct xdp_buff buff;
ktime_t hwtstamp;
- u32 data_truesize;
- u16 data_len;
- u16 nr_frags;
+ bool add_frag_failed;
};
struct fbnic_queue_stats {
--
2.47.1
* [PATCH net-next 4/9] eth: fbnic: Prefetch packet headers on Rx
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
` (2 preceding siblings ...)
2025-07-23 14:59 ` [PATCH net-next 3/9] eth: fbnic: Use shinfo to track frags state on Rx Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support Mohsin Bashir
` (4 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Issue a prefetch for the start of the buffer on Rx to try to avoid cache
miss on packet headers.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index f5725c0972a5..71af7b9d5bcd 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -888,7 +888,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
/* Build frame around buffer */
hdr_start = page_address(page) + hdr_pg_start;
-
+ net_prefetch(pkt->buff.data);
xdp_prepare_buff(&pkt->buff, hdr_start, headroom,
len - FBNIC_RX_PAD, true);
--
2.47.1
* [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
` (3 preceding siblings ...)
2025-07-23 14:59 ` [PATCH net-next 4/9] eth: fbnic: Prefetch packet headers " Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 17:35 ` Maciej Fijalkowski
2025-07-23 14:59 ` [PATCH net-next 6/9] eth: fbnic: Add support for XDP queues Mohsin Bashir
` (3 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Add basic support for attaching an XDP program to the device and support
for PASS/DROP/ABORT actions.
In fbnic, buffers are always mapped as DMA_BIDIRECTIONAL.
Testing:
Hook a simple XDP program that passes all the packets destined for a
specific port
iperf3 -c 192.168.1.10 -P 5 -p 12345
Connecting to host 192.168.1.10, port 12345
[ 5] local 192.168.1.9 port 46702 connected to 192.168.1.10 port 12345
[ ID] Interval Transfer Bitrate Retr Cwnd
- - - - - - - - - - - - - - - - - - - - - - - - -
[SUM] 1.00-2.00 sec 3.86 GBytes 33.2 Gbits/sec 0
XDP_DROP:
Hook an XDP program that drops packets destined for a specific port
iperf3 -c 192.168.1.10 -P 5 -p 12345
^C- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec 0 sender
[SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
XDP with HDS:
- Validate XDP attachment failure when HDS is low
~] ethtool -G eth0 hds-thresh 512
~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
~] Error: fbnic: MTU too high, or HDS threshold is too low for single
buffer XDP.
- Validate successful XDP attachment when HDS threshold is appropriate
~] ethtool -G eth0 hds-thresh 1536
~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
- Validate when the XDP program is attached, changing HDS thresh to a
lower value fails
~] ethtool -G eth0 hds-thresh 512
~] netlink error: fbnic: Use higher HDS threshold or multi-buf capable
program
- Validate HDS thresh does not matter when xdp frags support is
available
~] ethtool -G eth0 hds-thresh 512
~] sudo ip link set eth0 xdpdrv obj xdp_pass_mb_12345.o sec xdp.frags
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
.../net/ethernet/meta/fbnic/fbnic_ethtool.c | 11 +++
.../net/ethernet/meta/fbnic/fbnic_netdev.c | 35 +++++++
.../net/ethernet/meta/fbnic/fbnic_netdev.h | 5 +
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 95 +++++++++++++++++--
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
5 files changed, 140 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
index 84a0db9f1be0..d7b9eb267ead 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
@@ -329,6 +329,17 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
return -EINVAL;
}
+ /* If an XDP program is attached, we should check for potential frame
+ * splitting. If the new HDS threshold can cause splitting, we should
+ * only allow if the attached XDP program can handle frags.
+ */
+ if (fbnic_check_split_frames(fbn->xdp_prog, netdev->mtu,
+ kernel_ring->hds_thresh)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Use higher HDS threshold or multi-buf capable program");
+ return -EINVAL;
+ }
+
if (!netif_running(netdev)) {
fbnic_set_rings(fbn, ring, kernel_ring);
return 0;
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
index d039e1c7a0d5..0621b89cbf3d 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
@@ -504,6 +504,40 @@ static void fbnic_get_stats64(struct net_device *dev,
}
}
+bool fbnic_check_split_frames(struct bpf_prog *prog, unsigned int mtu,
+ u32 hds_thresh)
+{
+ if (!prog)
+ return false;
+
+ if (prog->aux->xdp_has_frags)
+ return false;
+
+ return mtu + ETH_HLEN > hds_thresh;
+}
+
+static int fbnic_bpf(struct net_device *netdev, struct netdev_bpf *bpf)
+{
+ struct bpf_prog *prog = bpf->prog, *prev_prog;
+ struct fbnic_net *fbn = netdev_priv(netdev);
+
+ if (bpf->command != XDP_SETUP_PROG)
+ return -EINVAL;
+
+ if (fbnic_check_split_frames(prog, netdev->mtu,
+ fbn->hds_thresh)) {
+ NL_SET_ERR_MSG_MOD(bpf->extack,
+ "MTU too high, or HDS threshold is too low for single buffer XDP");
+ return -EOPNOTSUPP;
+ }
+
+ prev_prog = xchg(&fbn->xdp_prog, prog);
+ if (prev_prog)
+ bpf_prog_put(prev_prog);
+
+ return 0;
+}
+
static const struct net_device_ops fbnic_netdev_ops = {
.ndo_open = fbnic_open,
.ndo_stop = fbnic_stop,
@@ -513,6 +547,7 @@ static const struct net_device_ops fbnic_netdev_ops = {
.ndo_set_mac_address = fbnic_set_mac,
.ndo_set_rx_mode = fbnic_set_rx_mode,
.ndo_get_stats64 = fbnic_get_stats64,
+ .ndo_bpf = fbnic_bpf,
.ndo_hwtstamp_get = fbnic_hwtstamp_get,
.ndo_hwtstamp_set = fbnic_hwtstamp_set,
};
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
index 04c5c7ed6c3a..bfa79ea910d8 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
@@ -18,6 +18,8 @@
#define FBNIC_TUN_GSO_FEATURES NETIF_F_GSO_IPXIP6
struct fbnic_net {
+ struct bpf_prog *xdp_prog;
+
struct fbnic_ring *tx[FBNIC_MAX_TXQS];
struct fbnic_ring *rx[FBNIC_MAX_RXQS];
@@ -104,4 +106,7 @@ int fbnic_phylink_ethtool_ksettings_get(struct net_device *netdev,
int fbnic_phylink_get_fecparam(struct net_device *netdev,
struct ethtool_fecparam *fecparam);
int fbnic_phylink_init(struct net_device *netdev);
+
+bool fbnic_check_split_frames(struct bpf_prog *prog,
+ unsigned int mtu, u32 hds_threshold);
#endif /* _FBNIC_NETDEV_H_ */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index 71af7b9d5bcd..486c14e83ad5 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -2,17 +2,26 @@
/* Copyright (c) Meta Platforms, Inc. and affiliates. */
#include <linux/bitfield.h>
+#include <linux/bpf.h>
+#include <linux/bpf_trace.h>
#include <linux/iopoll.h>
#include <linux/pci.h>
#include <net/netdev_queues.h>
#include <net/page_pool/helpers.h>
#include <net/tcp.h>
+#include <net/xdp.h>
#include "fbnic.h"
#include "fbnic_csr.h"
#include "fbnic_netdev.h"
#include "fbnic_txrx.h"
+enum {
+ FBNIC_XDP_PASS = 0,
+ FBNIC_XDP_CONSUME,
+ FBNIC_XDP_LEN_ERR,
+};
+
enum {
FBNIC_XMIT_CB_TS = 0x01,
};
@@ -877,7 +886,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
headroom = hdr_pg_off - hdr_pg_start + FBNIC_RX_PAD;
frame_sz = hdr_pg_end - hdr_pg_start;
- xdp_init_buff(&pkt->buff, frame_sz, NULL);
+ xdp_init_buff(&pkt->buff, frame_sz, &qt->xdp_rxq);
hdr_pg_start += (FBNIC_RCD_AL_BUFF_FRAG_MASK & rcd) *
FBNIC_BD_FRAG_SIZE;
@@ -966,6 +975,38 @@ static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
return skb;
}
+static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
+ struct fbnic_pkt_buff *pkt)
+{
+ struct fbnic_net *fbn = netdev_priv(nv->napi.dev);
+ struct bpf_prog *xdp_prog;
+ int act;
+
+ xdp_prog = READ_ONCE(fbn->xdp_prog);
+ if (!xdp_prog)
+ goto xdp_pass;
+
+ if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
+ return ERR_PTR(-FBNIC_XDP_LEN_ERR);
+
+ act = bpf_prog_run_xdp(xdp_prog, &pkt->buff);
+ switch (act) {
+ case XDP_PASS:
+xdp_pass:
+ return fbnic_build_skb(nv, pkt);
+ default:
+ bpf_warn_invalid_xdp_action(nv->napi.dev, xdp_prog, act);
+ fallthrough;
+ case XDP_ABORTED:
+ trace_xdp_exception(nv->napi.dev, xdp_prog, act);
+ fallthrough;
+ case XDP_DROP:
+ break;
+ }
+
+ return ERR_PTR(-FBNIC_XDP_CONSUME);
+}
+
static enum pkt_hash_types fbnic_skb_hash_type(u64 rcd)
{
return (FBNIC_RCD_META_L4_TYPE_MASK & rcd) ? PKT_HASH_TYPE_L4 :
@@ -1064,7 +1105,7 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
if (unlikely(pkt->add_frag_failed))
skb = NULL;
else if (likely(!fbnic_rcd_metadata_err(rcd)))
- skb = fbnic_build_skb(nv, pkt);
+ skb = fbnic_run_xdp(nv, pkt);
/* Populate skb and invalidate XDP */
if (!IS_ERR_OR_NULL(skb)) {
@@ -1250,6 +1291,7 @@ static void fbnic_free_napi_vector(struct fbnic_net *fbn,
}
for (j = 0; j < nv->rxt_count; j++, i++) {
+ xdp_rxq_info_unreg(&nv->qt[i].xdp_rxq);
fbnic_remove_rx_ring(fbn, &nv->qt[i].sub0);
fbnic_remove_rx_ring(fbn, &nv->qt[i].sub1);
fbnic_remove_rx_ring(fbn, &nv->qt[i].cmpl);
@@ -1422,6 +1464,11 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
fbnic_ring_init(&qt->cmpl, db, rxq_idx, FBNIC_RING_F_STATS);
fbn->rx[rxq_idx] = &qt->cmpl;
+ err = xdp_rxq_info_reg(&qt->xdp_rxq, fbn->netdev, rxq_idx,
+ nv->napi.napi_id);
+ if (err)
+ goto free_ring_cur_qt;
+
/* Update Rx queue index */
rxt_count--;
rxq_idx += v_count;
@@ -1432,6 +1479,25 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
return 0;
+ while (rxt_count < nv->rxt_count) {
+ qt--;
+
+ xdp_rxq_info_unreg(&qt->xdp_rxq);
+free_ring_cur_qt:
+ fbnic_remove_rx_ring(fbn, &qt->sub0);
+ fbnic_remove_rx_ring(fbn, &qt->sub1);
+ fbnic_remove_rx_ring(fbn, &qt->cmpl);
+ rxt_count++;
+ }
+ while (txt_count < nv->txt_count) {
+ qt--;
+
+ fbnic_remove_tx_ring(fbn, &qt->sub0);
+ fbnic_remove_tx_ring(fbn, &qt->cmpl);
+
+ txt_count++;
+ }
+ fbnic_napi_free_irq(fbd, nv);
pp_destroy:
page_pool_destroy(nv->page_pool);
napi_del:
@@ -1708,8 +1774,10 @@ static void fbnic_free_nv_resources(struct fbnic_net *fbn,
for (i = 0; i < nv->txt_count; i++)
fbnic_free_qt_resources(fbn, &nv->qt[i]);
- for (j = 0; j < nv->rxt_count; j++, i++)
+ for (j = 0; j < nv->rxt_count; j++, i++) {
fbnic_free_qt_resources(fbn, &nv->qt[i]);
+ xdp_rxq_info_unreg_mem_model(&nv->qt[i].xdp_rxq);
+ }
}
static int fbnic_alloc_nv_resources(struct fbnic_net *fbn,
@@ -1721,19 +1789,32 @@ static int fbnic_alloc_nv_resources(struct fbnic_net *fbn,
for (i = 0; i < nv->txt_count; i++) {
err = fbnic_alloc_tx_qt_resources(fbn, &nv->qt[i]);
if (err)
- goto free_resources;
+ goto free_qt_resources;
}
/* Allocate Rx Resources */
for (j = 0; j < nv->rxt_count; j++, i++) {
+ /* Register XDP memory model for completion queue */
+ err = xdp_reg_mem_model(&nv->qt[i].xdp_rxq.mem,
+ MEM_TYPE_PAGE_POOL,
+ nv->page_pool);
+ if (err)
+ goto xdp_unreg_mem_model;
+
err = fbnic_alloc_rx_qt_resources(fbn, &nv->qt[i]);
if (err)
- goto free_resources;
+ goto xdp_unreg_cur_model;
}
return 0;
-free_resources:
+xdp_unreg_mem_model:
+ while (j-- && i--) {
+ fbnic_free_qt_resources(fbn, &nv->qt[i]);
+xdp_unreg_cur_model:
+ xdp_rxq_info_unreg_mem_model(&nv->qt[i].xdp_rxq);
+ }
+free_qt_resources:
while (i--)
fbnic_free_qt_resources(fbn, &nv->qt[i]);
return err;
@@ -2025,7 +2106,7 @@ void fbnic_flush(struct fbnic_net *fbn)
memset(qt->cmpl.desc, 0, qt->cmpl.size);
fbnic_put_pkt_buff(nv, qt->cmpl.pkt, 0);
- qt->cmpl.pkt->buff.data_hard_start = NULL;
+ memset(qt->cmpl.pkt, 0, sizeof(struct fbnic_pkt_buff));
}
}
}
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index be34962c465e..0fefd1f00196 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -129,6 +129,7 @@ struct fbnic_ring {
struct fbnic_q_triad {
struct fbnic_ring sub0, sub1, cmpl;
+ struct xdp_rxq_info xdp_rxq;
};
struct fbnic_napi_vector {
--
2.47.1
* [PATCH net-next 6/9] eth: fbnic: Add support for XDP queues
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
` (4 preceding siblings ...)
2025-07-23 14:59 ` [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 23:54 ` Jakub Kicinski
2025-07-23 14:59 ` [PATCH net-next 7/9] eth: fbnic: Add support for XDP_TX action Mohsin Bashir
` (2 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Add support for allocating XDP_TX queues and configuring the rings that
back them.
FBNIC has been designed with XDP support in mind. Each Tx queue has 2
submission queues and one completion queue, with the expectation that
one of the submission queues will be used by the stack, and the other
by XDP. XDP queues are populated by XDP_TX and start from index 128
in the TX queue array.
The support for XDP_TX is added in the next patch.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
.../net/ethernet/meta/fbnic/fbnic_netdev.h | 2 +-
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 144 +++++++++++++++++-
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 7 +
3 files changed, 147 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
index bfa79ea910d8..0a6347f28210 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
@@ -20,7 +20,7 @@
struct fbnic_net {
struct bpf_prog *xdp_prog;
- struct fbnic_ring *tx[FBNIC_MAX_TXQS];
+ struct fbnic_ring *tx[FBNIC_MAX_TXQS + FBNIC_MAX_XDPQS];
struct fbnic_ring *rx[FBNIC_MAX_RXQS];
struct fbnic_napi_vector *napi[FBNIC_MAX_NAPI_VECTORS];
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index 486c14e83ad5..993c0da42f2f 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -615,6 +615,42 @@ static void fbnic_clean_twq0(struct fbnic_napi_vector *nv, int napi_budget,
}
}
+static void fbnic_clean_twq1(struct fbnic_napi_vector *nv, bool pp_allow_direct,
+ struct fbnic_ring *ring, bool discard,
+ unsigned int hw_head)
+{
+ u64 total_bytes = 0, total_packets = 0;
+ unsigned int head = ring->head;
+
+ while (hw_head != head) {
+ struct page *page;
+ u64 twd;
+
+ if (unlikely(!(ring->desc[head] & FBNIC_TWD_TYPE(AL))))
+ goto next_desc;
+
+ twd = le64_to_cpu(ring->desc[head]);
+ page = ring->tx_buf[head];
+
+ /* TYPE_AL is 2, TYPE_LAST_AL is 3. So this trick gives
+ * us one increment per packet, with no branches.
+ */
+ total_packets += FIELD_GET(FBNIC_TWD_TYPE_MASK, twd) -
+ FBNIC_TWD_TYPE_AL;
+ total_bytes += FIELD_GET(FBNIC_TWD_LEN_MASK, twd);
+
+ page_pool_put_page(nv->page_pool, page, -1, pp_allow_direct);
+next_desc:
+ head++;
+ head &= ring->size_mask;
+ }
+
+ if (!total_bytes)
+ return;
+
+ ring->head = head;
+}
+
static void fbnic_clean_tsq(struct fbnic_napi_vector *nv,
struct fbnic_ring *ring,
u64 tcd, int *ts_head, int *head0)
@@ -698,12 +734,21 @@ static void fbnic_page_pool_drain(struct fbnic_ring *ring, unsigned int idx,
}
static void fbnic_clean_twq(struct fbnic_napi_vector *nv, int napi_budget,
- struct fbnic_q_triad *qt, s32 ts_head, s32 head0)
+ struct fbnic_q_triad *qt, s32 ts_head, s32 head0,
+ s32 head1)
{
if (head0 >= 0)
fbnic_clean_twq0(nv, napi_budget, &qt->sub0, false, head0);
else if (ts_head >= 0)
fbnic_clean_twq0(nv, napi_budget, &qt->sub0, false, ts_head);
+
+ if (head1 >= 0) {
+ qt->cmpl.deferred_head = -1;
+ if (napi_budget)
+ fbnic_clean_twq1(nv, true, &qt->sub1, false, head1);
+ else
+ qt->cmpl.deferred_head = head1;
+ }
}
static void
@@ -711,6 +756,7 @@ fbnic_clean_tcq(struct fbnic_napi_vector *nv, struct fbnic_q_triad *qt,
int napi_budget)
{
struct fbnic_ring *cmpl = &qt->cmpl;
+ s32 head1 = cmpl->deferred_head;
s32 head0 = -1, ts_head = -1;
__le64 *raw_tcd, done;
u32 head = cmpl->head;
@@ -728,7 +774,10 @@ fbnic_clean_tcq(struct fbnic_napi_vector *nv, struct fbnic_q_triad *qt,
switch (FIELD_GET(FBNIC_TCD_TYPE_MASK, tcd)) {
case FBNIC_TCD_TYPE_0:
- if (!(tcd & FBNIC_TCD_TWQ1))
+ if (tcd & FBNIC_TCD_TWQ1)
+ head1 = FIELD_GET(FBNIC_TCD_TYPE0_HEAD1_MASK,
+ tcd);
+ else
head0 = FIELD_GET(FBNIC_TCD_TYPE0_HEAD0_MASK,
tcd);
/* Currently all err status bits are related to
@@ -761,7 +810,7 @@ fbnic_clean_tcq(struct fbnic_napi_vector *nv, struct fbnic_q_triad *qt,
}
/* Unmap and free processed buffers */
- fbnic_clean_twq(nv, napi_budget, qt, ts_head, head0);
+ fbnic_clean_twq(nv, napi_budget, qt, ts_head, head0, head1);
}
static void fbnic_clean_bdq(struct fbnic_napi_vector *nv, int napi_budget,
@@ -1266,6 +1315,17 @@ static void fbnic_remove_tx_ring(struct fbnic_net *fbn,
fbn->tx[txr->q_idx] = NULL;
}
+static void fbnic_remove_xdp_ring(struct fbnic_net *fbn,
+ struct fbnic_ring *xdpr)
+{
+ if (!(xdpr->flags & FBNIC_RING_F_STATS))
+ return;
+
+ /* Remove pointer to the Tx ring */
+ WARN_ON(fbn->tx[xdpr->q_idx] && fbn->tx[xdpr->q_idx] != xdpr);
+ fbn->tx[xdpr->q_idx] = NULL;
+}
+
static void fbnic_remove_rx_ring(struct fbnic_net *fbn,
struct fbnic_ring *rxr)
{
@@ -1287,6 +1347,7 @@ static void fbnic_free_napi_vector(struct fbnic_net *fbn,
for (i = 0; i < nv->txt_count; i++) {
fbnic_remove_tx_ring(fbn, &nv->qt[i].sub0);
+ fbnic_remove_xdp_ring(fbn, &nv->qt[i].sub1);
fbnic_remove_tx_ring(fbn, &nv->qt[i].cmpl);
}
@@ -1361,6 +1422,7 @@ static void fbnic_ring_init(struct fbnic_ring *ring, u32 __iomem *doorbell,
ring->doorbell = doorbell;
ring->q_idx = q_idx;
ring->flags = flags;
+ ring->deferred_head = -1;
}
static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
@@ -1370,11 +1432,18 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
{
int txt_count = txq_count, rxt_count = rxq_count;
u32 __iomem *uc_addr = fbd->uc_addr0;
+ int xdp_count = 0, qt_count, err;
struct fbnic_napi_vector *nv;
struct fbnic_q_triad *qt;
- int qt_count, err;
u32 __iomem *db;
+ /* We need to reserve at least one Tx Queue Triad for an XDP ring */
+ if (rxq_count) {
+ xdp_count = 1;
+ if (!txt_count)
+ txt_count = 1;
+ }
+
qt_count = txt_count + rxq_count;
if (!qt_count)
return -EINVAL;
@@ -1423,12 +1492,13 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
qt = nv->qt;
while (txt_count) {
+ u8 flags = FBNIC_RING_F_CTX | FBNIC_RING_F_STATS;
+
/* Configure Tx queue */
db = &uc_addr[FBNIC_QUEUE(txq_idx) + FBNIC_QUEUE_TWQ0_TAIL];
/* Assign Tx queue to netdev if applicable */
if (txq_count > 0) {
- u8 flags = FBNIC_RING_F_CTX | FBNIC_RING_F_STATS;
fbnic_ring_init(&qt->sub0, db, txq_idx, flags);
fbn->tx[txq_idx] = &qt->sub0;
@@ -1438,6 +1508,28 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
FBNIC_RING_F_DISABLED);
}
+ /* Configure XDP queue */
+ db = &uc_addr[FBNIC_QUEUE(txq_idx) + FBNIC_QUEUE_TWQ1_TAIL];
+
+ /* Assign XDP queue to netdev if applicable
+ *
+ * The setup for this is in itself a bit different.
+ * 1. We only need one XDP Tx queue per NAPI vector.
+ * 2. We associate it to the first Rx queue index.
+ * 3. The hardware side is associated based on the Tx Queue.
+ * 4. The netdev queue is offset by FBNIC_MAX_TXQS.
+ */
+ if (xdp_count > 0) {
+ unsigned int xdp_idx = FBNIC_MAX_TXQS + rxq_idx;
+
+ fbnic_ring_init(&qt->sub1, db, xdp_idx, flags);
+ fbn->tx[xdp_idx] = &qt->sub1;
+ xdp_count--;
+ } else {
+ fbnic_ring_init(&qt->sub1, db, 0,
+ FBNIC_RING_F_DISABLED);
+ }
+
/* Configure Tx completion queue */
db = &uc_addr[FBNIC_QUEUE(txq_idx) + FBNIC_QUEUE_TCQ_HEAD];
fbnic_ring_init(&qt->cmpl, db, 0, 0);
@@ -1493,6 +1585,7 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
qt--;
fbnic_remove_tx_ring(fbn, &qt->sub0);
+ fbnic_remove_xdp_ring(fbn, &qt->sub1);
fbnic_remove_tx_ring(fbn, &qt->cmpl);
txt_count++;
@@ -1727,6 +1820,10 @@ static int fbnic_alloc_tx_qt_resources(struct fbnic_net *fbn,
if (err)
return err;
+ err = fbnic_alloc_tx_ring_resources(fbn, &qt->sub1);
+ if (err)
+ goto free_sub0;
+
err = fbnic_alloc_tx_ring_resources(fbn, &qt->cmpl);
if (err)
goto free_sub1;
@@ -1734,6 +1831,8 @@ static int fbnic_alloc_tx_qt_resources(struct fbnic_net *fbn,
return 0;
free_sub1:
+ fbnic_free_ring_resources(dev, &qt->sub1);
+free_sub0:
fbnic_free_ring_resources(dev, &qt->sub0);
return err;
}
@@ -1921,6 +2020,15 @@ static void fbnic_disable_twq0(struct fbnic_ring *txr)
fbnic_ring_wr32(txr, FBNIC_QUEUE_TWQ0_CTL, twq_ctl);
}
+static void fbnic_disable_twq1(struct fbnic_ring *txr)
+{
+ u32 twq_ctl = fbnic_ring_rd32(txr, FBNIC_QUEUE_TWQ1_CTL);
+
+ twq_ctl &= ~FBNIC_QUEUE_TWQ_CTL_ENABLE;
+
+ fbnic_ring_wr32(txr, FBNIC_QUEUE_TWQ1_CTL, twq_ctl);
+}
+
static void fbnic_disable_tcq(struct fbnic_ring *txr)
{
fbnic_ring_wr32(txr, FBNIC_QUEUE_TCQ_CTL, 0);
@@ -1966,6 +2074,7 @@ void fbnic_disable(struct fbnic_net *fbn)
struct fbnic_q_triad *qt = &nv->qt[t];
fbnic_disable_twq0(&qt->sub0);
+ fbnic_disable_twq1(&qt->sub1);
fbnic_disable_tcq(&qt->cmpl);
}
@@ -2080,6 +2189,8 @@ void fbnic_flush(struct fbnic_net *fbn)
/* Clean the work queues of unprocessed work */
fbnic_clean_twq0(nv, 0, &qt->sub0, true, qt->sub0.tail);
+ fbnic_clean_twq1(nv, false, &qt->sub1, true,
+ qt->sub1.tail);
/* Reset completion queue descriptor ring */
memset(qt->cmpl.desc, 0, qt->cmpl.size);
@@ -2154,6 +2265,28 @@ static void fbnic_enable_twq0(struct fbnic_ring *twq)
fbnic_ring_wr32(twq, FBNIC_QUEUE_TWQ0_CTL, FBNIC_QUEUE_TWQ_CTL_ENABLE);
}
+static void fbnic_enable_twq1(struct fbnic_ring *twq)
+{
+ u32 log_size = fls(twq->size_mask);
+
+ if (!twq->size_mask)
+ return;
+
+ /* Reset head/tail */
+ fbnic_ring_wr32(twq, FBNIC_QUEUE_TWQ1_CTL, FBNIC_QUEUE_TWQ_CTL_RESET);
+ twq->tail = 0;
+ twq->head = 0;
+
+ /* Store descriptor ring address and size */
+ fbnic_ring_wr32(twq, FBNIC_QUEUE_TWQ1_BAL, lower_32_bits(twq->dma));
+ fbnic_ring_wr32(twq, FBNIC_QUEUE_TWQ1_BAH, upper_32_bits(twq->dma));
+
+ /* Write lower 4 bits of log size as 64K ring size is 0 */
+ fbnic_ring_wr32(twq, FBNIC_QUEUE_TWQ1_SIZE, log_size & 0xf);
+
+ fbnic_ring_wr32(twq, FBNIC_QUEUE_TWQ1_CTL, FBNIC_QUEUE_TWQ_CTL_ENABLE);
+}
+
static void fbnic_enable_tcq(struct fbnic_napi_vector *nv,
struct fbnic_ring *tcq)
{
@@ -2330,6 +2463,7 @@ void fbnic_enable(struct fbnic_net *fbn)
struct fbnic_q_triad *qt = &nv->qt[t];
fbnic_enable_twq0(&qt->sub0);
+ fbnic_enable_twq1(&qt->sub1);
fbnic_enable_tcq(nv, &qt->cmpl);
}
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index 0fefd1f00196..b31b450c10fd 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -35,6 +35,7 @@ struct fbnic_net;
#define FBNIC_MAX_TXQS 128u
#define FBNIC_MAX_RXQS 128u
+#define FBNIC_MAX_XDPQS 128u
/* These apply to TWQs, TCQ, RCQ */
#define FBNIC_QUEUE_SIZE_MIN 16u
@@ -120,6 +121,12 @@ struct fbnic_ring {
u32 head, tail; /* Head/Tail of ring */
+ /* Deferred_head is used to cache the head for TWQ1 if an attempt
+ * is made to clean TWQ1 with zero napi_budget. We do not use it for
+ * any other ring.
+ */
+ s32 deferred_head;
+
struct fbnic_queue_stats stats;
/* Slow path fields follow */
--
2.47.1
* [PATCH net-next 7/9] eth: fbnic: Add support for XDP_TX action
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
` (5 preceding siblings ...)
2025-07-23 14:59 ` [PATCH net-next 6/9] eth: fbnic: Add support for XDP queues Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 9/9] eth: fbnic: Report XDP stats via ethtool Mohsin Bashir
8 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Add support for the XDP_TX action and for cleaning the associated Tx work.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 85 +++++++++++++++++++-
1 file changed, 84 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index 993c0da42f2f..a1656c66a512 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -19,6 +19,7 @@
enum {
FBNIC_XDP_PASS = 0,
FBNIC_XDP_CONSUME,
+ FBNIC_XDP_TX,
FBNIC_XDP_LEN_ERR,
};
@@ -1024,6 +1025,80 @@ static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
return skb;
}
+static long fbnic_pkt_tx(struct fbnic_napi_vector *nv,
+ struct fbnic_pkt_buff *pkt)
+{
+ struct fbnic_ring *ring = &nv->qt[0].sub1;
+ int size, offset, nsegs = 1, data_len = 0;
+ unsigned int tail = ring->tail;
+ struct skb_shared_info *shinfo;
+ skb_frag_t *frag = NULL;
+ struct page *page;
+ dma_addr_t dma;
+ __le64 *twd;
+
+ if (unlikely(xdp_buff_has_frags(&pkt->buff))) {
+ shinfo = xdp_get_shared_info_from_buff(&pkt->buff);
+ nsegs += shinfo->nr_frags;
+ data_len = shinfo->xdp_frags_size;
+ frag = &shinfo->frags[0];
+ }
+
+ if (fbnic_desc_unused(ring) < nsegs)
+ return -FBNIC_XDP_CONSUME;
+
+ page = virt_to_page(pkt->buff.data_hard_start);
+ offset = offset_in_page(pkt->buff.data);
+ dma = page_pool_get_dma_addr(page);
+
+ size = pkt->buff.data_end - pkt->buff.data;
+
+ while (nsegs--) {
+ dma_sync_single_range_for_device(nv->dev, dma, offset, size,
+ DMA_BIDIRECTIONAL);
+ dma += offset;
+
+ ring->tx_buf[tail] = page;
+
+ twd = &ring->desc[tail];
+ *twd = cpu_to_le64(FIELD_PREP(FBNIC_TWD_ADDR_MASK, dma) |
+ FIELD_PREP(FBNIC_TWD_LEN_MASK, size) |
+ FIELD_PREP(FBNIC_TWD_TYPE_MASK,
+ FBNIC_TWD_TYPE_AL));
+
+ tail++;
+ tail &= ring->size_mask;
+
+ if (!data_len)
+ break;
+
+ offset = skb_frag_off(frag);
+ page = skb_frag_page(frag);
+ dma = page_pool_get_dma_addr(page);
+
+ size = skb_frag_size(frag);
+ data_len -= size;
+ frag++;
+ }
+
+ *twd |= FBNIC_TWD_TYPE(LAST_AL);
+
+ ring->tail = tail;
+
+ return -FBNIC_XDP_TX;
+}
+
+static void fbnic_pkt_commit_tail(struct fbnic_napi_vector *nv,
+ unsigned int pkt_tail)
+{
+ struct fbnic_ring *ring = &nv->qt[0].sub1;
+
+ /* Force DMA writes to flush before writing to tail */
+ dma_wmb();
+
+ writel(pkt_tail, ring->doorbell);
+}
+
static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
struct fbnic_pkt_buff *pkt)
{
@@ -1043,6 +1118,8 @@ static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
case XDP_PASS:
xdp_pass:
return fbnic_build_skb(nv, pkt);
+ case XDP_TX:
+ return ERR_PTR(fbnic_pkt_tx(nv, pkt));
default:
bpf_warn_invalid_xdp_action(nv->napi.dev, xdp_prog, act);
fallthrough;
@@ -1107,10 +1184,10 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
struct fbnic_q_triad *qt, int budget)
{
unsigned int packets = 0, bytes = 0, dropped = 0, alloc_failed = 0;
+ s32 head0 = -1, head1 = -1, pkt_tail = -1;
u64 csum_complete = 0, csum_none = 0;
struct fbnic_ring *rcq = &qt->cmpl;
struct fbnic_pkt_buff *pkt;
- s32 head0 = -1, head1 = -1;
__le64 *raw_rcd, done;
u32 head = rcq->head;
@@ -1166,6 +1243,9 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
bytes += skb->len;
napi_gro_receive(&nv->napi, skb);
+ } else if (PTR_ERR(skb) == -FBNIC_XDP_TX) {
+ pkt_tail = nv->qt[0].sub1.tail;
+ bytes += xdp_get_buff_len(&pkt->buff);
} else {
if (!skb) {
alloc_failed++;
@@ -1201,6 +1281,9 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
rcq->stats.rx.csum_none += csum_none;
u64_stats_update_end(&rcq->stats.syncp);
+ if (pkt_tail >= 0)
+ fbnic_pkt_commit_tail(nv, pkt_tail);
+
/* Unmap and free processed buffers */
if (head0 >= 0)
fbnic_clean_bdq(nv, budget, &qt->sub0, head0);
--
2.47.1
* [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
` (6 preceding siblings ...)
2025-07-23 14:59 ` [PATCH net-next 7/9] eth: fbnic: Add support for XDP_TX action Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
2025-07-24 10:18 ` Simon Horman
2025-07-23 14:59 ` [PATCH net-next 9/9] eth: fbnic: Report XDP stats via ethtool Mohsin Bashir
8 siblings, 1 reply; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Add support for XDP statistics collection and reporting via the rtnl_link
and netdev_queue APIs.
For XDP programs without frags support, fbnic requires the MTU to be less
than the HDS threshold. If an oversized frame is received, it is dropped
and counted as rx_length_errors, reported via ip stats, to highlight that
this is an error.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
.../device_drivers/ethernet/meta/fbnic.rst | 10 +++++
.../net/ethernet/meta/fbnic/fbnic_netdev.c | 30 +++++++++++++
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 44 +++++++++++++++++--
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
4 files changed, 82 insertions(+), 3 deletions(-)
diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
index afb8353daefd..ad5e2cba7afc 100644
--- a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
+++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
@@ -160,3 +160,13 @@ behavior and potential performance bottlenecks.
credit exhaustion
- ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
credit exhaustion
+
+XDP Length Error:
+~~~~~~~~~~~~~~~~~
+
+For XDP programs without frags support, fbnic tries to make sure that MTU fits
+into a single buffer. If an oversized frame is received and gets fragmented,
it is dropped and the following netlink counters are updated:
+ - ``rx-length``: number of frames dropped due to lack of fragmentation
+ support in the attached XDP program
+ - ``rx-errors``: total number of packets with errors received on the interface
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
index 0621b89cbf3d..4991f9214c0d 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
@@ -485,6 +485,7 @@ static void fbnic_get_stats64(struct net_device *dev,
stats64->rx_missed_errors = rx_missed;
for (i = 0; i < fbn->num_rx_queues; i++) {
+ struct fbnic_ring *xdpr = fbn->tx[FBNIC_MAX_TXQS + i];
struct fbnic_ring *rxr = fbn->rx[i];
if (!rxr)
@@ -501,6 +502,21 @@ static void fbnic_get_stats64(struct net_device *dev,
stats64->rx_bytes += rx_bytes;
stats64->rx_packets += rx_packets;
stats64->rx_dropped += rx_dropped;
+
+ if (!xdpr)
+ continue;
+
+ stats = &xdpr->stats;
+ do {
+ start = u64_stats_fetch_begin(&stats->syncp);
+ tx_bytes = stats->bytes;
+ tx_packets = stats->packets;
+ tx_dropped = stats->dropped;
+ } while (u64_stats_fetch_retry(&stats->syncp, start));
+
+ stats64->tx_bytes += tx_bytes;
+ stats64->tx_packets += tx_packets;
+ stats64->tx_dropped += tx_dropped;
}
}
@@ -599,6 +615,7 @@ static void fbnic_get_queue_stats_tx(struct net_device *dev, int idx,
struct fbnic_ring *txr = fbn->tx[idx];
struct fbnic_queue_stats *stats;
u64 stop, wake, csum, lso;
+ struct fbnic_ring *xdpr;
unsigned int start;
u64 bytes, packets;
@@ -622,6 +639,19 @@ static void fbnic_get_queue_stats_tx(struct net_device *dev, int idx,
tx->hw_gso_wire_packets = lso;
tx->stop = stop;
tx->wake = wake;
+
+ xdpr = fbn->tx[FBNIC_MAX_TXQS + idx];
+ if (xdpr) {
+ stats = &xdpr->stats;
+ do {
+ start = u64_stats_fetch_begin(&stats->syncp);
+ bytes = stats->bytes;
+ packets = stats->packets;
+ } while (u64_stats_fetch_retry(&stats->syncp, start));
+
+ tx->bytes += bytes;
+ tx->packets += packets;
+ }
}
static void fbnic_get_base_stats(struct net_device *dev,
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index a1656c66a512..eb5d071b727a 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -650,6 +650,18 @@ static void fbnic_clean_twq1(struct fbnic_napi_vector *nv, bool pp_allow_direct,
return;
ring->head = head;
+
+ if (discard) {
+ u64_stats_update_begin(&ring->stats.syncp);
+ ring->stats.dropped += total_packets;
+ u64_stats_update_end(&ring->stats.syncp);
+ return;
+ }
+
+ u64_stats_update_begin(&ring->stats.syncp);
+ ring->stats.bytes += total_bytes;
+ ring->stats.packets += total_packets;
+ u64_stats_update_end(&ring->stats.syncp);
}
static void fbnic_clean_tsq(struct fbnic_napi_vector *nv,
@@ -1044,8 +1056,12 @@ static long fbnic_pkt_tx(struct fbnic_napi_vector *nv,
frag = &shinfo->frags[0];
}
- if (fbnic_desc_unused(ring) < nsegs)
+ if (fbnic_desc_unused(ring) < nsegs) {
+ u64_stats_update_begin(&ring->stats.syncp);
+ ring->stats.dropped++;
+ u64_stats_update_end(&ring->stats.syncp);
return -FBNIC_XDP_CONSUME;
+ }
page = virt_to_page(pkt->buff.data_hard_start);
offset = offset_in_page(pkt->buff.data);
@@ -1184,8 +1200,8 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
struct fbnic_q_triad *qt, int budget)
{
unsigned int packets = 0, bytes = 0, dropped = 0, alloc_failed = 0;
+ u64 csum_complete = 0, csum_none = 0, length_errors = 0;
s32 head0 = -1, head1 = -1, pkt_tail = -1;
- u64 csum_complete = 0, csum_none = 0;
struct fbnic_ring *rcq = &qt->cmpl;
struct fbnic_pkt_buff *pkt;
__le64 *raw_rcd, done;
@@ -1250,6 +1266,8 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
if (!skb) {
alloc_failed++;
dropped++;
+ } else if (PTR_ERR(skb) == -FBNIC_XDP_LEN_ERR) {
+ length_errors++;
} else {
dropped++;
}
@@ -1279,6 +1297,7 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
rcq->stats.rx.alloc_failed += alloc_failed;
rcq->stats.rx.csum_complete += csum_complete;
rcq->stats.rx.csum_none += csum_none;
+ rcq->stats.rx.length_errors += length_errors;
u64_stats_update_end(&rcq->stats.syncp);
if (pkt_tail >= 0)
@@ -1362,8 +1381,9 @@ void fbnic_aggregate_ring_rx_counters(struct fbnic_net *fbn,
fbn->rx_stats.rx.alloc_failed += stats->rx.alloc_failed;
fbn->rx_stats.rx.csum_complete += stats->rx.csum_complete;
fbn->rx_stats.rx.csum_none += stats->rx.csum_none;
+ fbn->rx_stats.rx.length_errors += stats->rx.length_errors;
/* Remember to add new stats here */
- BUILD_BUG_ON(sizeof(fbn->rx_stats.rx) / 8 != 3);
+ BUILD_BUG_ON(sizeof(fbn->rx_stats.rx) / 8 != 4);
}
void fbnic_aggregate_ring_tx_counters(struct fbnic_net *fbn,
@@ -1385,6 +1405,22 @@ void fbnic_aggregate_ring_tx_counters(struct fbnic_net *fbn,
BUILD_BUG_ON(sizeof(fbn->tx_stats.twq) / 8 != 6);
}
+static void fbnic_aggregate_ring_xdp_counters(struct fbnic_net *fbn,
+ struct fbnic_ring *xdpr)
+{
+ struct fbnic_queue_stats *stats = &xdpr->stats;
+
+ if (!(xdpr->flags & FBNIC_RING_F_STATS))
+ return;
+
+	/* Capture stats from queues before disassociating them */
+ fbn->rx_stats.bytes += stats->bytes;
+ fbn->rx_stats.packets += stats->packets;
+ fbn->rx_stats.dropped += stats->dropped;
+ fbn->tx_stats.bytes += stats->bytes;
+ fbn->tx_stats.packets += stats->packets;
+}
+
static void fbnic_remove_tx_ring(struct fbnic_net *fbn,
struct fbnic_ring *txr)
{
@@ -1404,6 +1440,8 @@ static void fbnic_remove_xdp_ring(struct fbnic_net *fbn,
if (!(xdpr->flags & FBNIC_RING_F_STATS))
return;
+ fbnic_aggregate_ring_xdp_counters(fbn, xdpr);
+
/* Remove pointer to the Tx ring */
WARN_ON(fbn->tx[xdpr->q_idx] && fbn->tx[xdpr->q_idx] != xdpr);
fbn->tx[xdpr->q_idx] = NULL;
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
index b31b450c10fd..c927a4a5f1ca 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
@@ -89,6 +89,7 @@ struct fbnic_queue_stats {
u64 alloc_failed;
u64 csum_complete;
u64 csum_none;
+ u64 length_errors;
} rx;
};
u64 dropped;
--
2.47.1
* [PATCH net-next 9/9] eth: fbnic: Report XDP stats via ethtool
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
` (7 preceding siblings ...)
2025-07-23 14:59 ` [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP Mohsin Bashir
@ 2025-07-23 14:59 ` Mohsin Bashir
8 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-23 14:59 UTC (permalink / raw)
To: netdev
Cc: kuba, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
mohsin.bashr, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
Add support for collecting XDP stats via the ethtool API. We record
packets and bytes sent, as well as packets dropped, on the XDP_TX path.
ethtool -S eth0 | grep xdp | grep -v "0"
xdp_tx_queue_13_packets: 2
xdp_tx_queue_13_bytes: 16126
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
---
.../net/ethernet/meta/fbnic/fbnic_ethtool.c | 50 ++++++++++++++++++-
1 file changed, 49 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
index d7b9eb267ead..3744fa173f8b 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
@@ -112,6 +112,20 @@ static const struct fbnic_stat fbnic_gstrings_hw_q_stats[] = {
FBNIC_HW_RXB_DEQUEUE_STATS_LEN * FBNIC_RXB_DEQUEUE_INDICES + \
FBNIC_HW_Q_STATS_LEN * FBNIC_MAX_QUEUES)
+#define FBNIC_QUEUE_STAT(name, stat) \
+ FBNIC_STAT_FIELDS(fbnic_ring, name, stat)
+
+static const struct fbnic_stat fbnic_gstrings_xdp_stats[] = {
+ FBNIC_QUEUE_STAT("xdp_tx_queue_%u_packets", stats.packets),
+ FBNIC_QUEUE_STAT("xdp_tx_queue_%u_bytes", stats.bytes),
+ FBNIC_QUEUE_STAT("xdp_tx_queue_%u_dropped", stats.dropped),
+};
+
+#define FBNIC_XDP_STATS_LEN ARRAY_SIZE(fbnic_gstrings_xdp_stats)
+
+#define FBNIC_STATS_LEN \
+ (FBNIC_HW_STATS_LEN + FBNIC_XDP_STATS_LEN * FBNIC_MAX_XDPQS)
+
static void
fbnic_get_drvinfo(struct net_device *netdev, struct ethtool_drvinfo *drvinfo)
{
@@ -422,6 +436,16 @@ static void fbnic_get_rxb_dequeue_strings(u8 **data, unsigned int idx)
ethtool_sprintf(data, stat->string, idx);
}
+static void fbnic_get_xdp_queue_strings(u8 **data, unsigned int idx)
+{
+ const struct fbnic_stat *stat;
+ int i;
+
+ stat = fbnic_gstrings_xdp_stats;
+ for (i = 0; i < FBNIC_XDP_STATS_LEN; i++, stat++)
+ ethtool_sprintf(data, stat->string, idx);
+}
+
static void fbnic_get_strings(struct net_device *dev, u32 sset, u8 *data)
{
const struct fbnic_stat *stat;
@@ -447,6 +471,9 @@ static void fbnic_get_strings(struct net_device *dev, u32 sset, u8 *data)
for (i = 0; i < FBNIC_HW_Q_STATS_LEN; i++, stat++)
ethtool_sprintf(&data, stat->string, idx);
}
+
+ for (i = 0; i < FBNIC_MAX_XDPQS; i++)
+ fbnic_get_xdp_queue_strings(&data, i);
break;
}
}
@@ -464,6 +491,24 @@ static void fbnic_report_hw_stats(const struct fbnic_stat *stat,
}
}
+static void fbnic_get_xdp_queue_stats(struct fbnic_ring *ring, u64 **data)
+{
+ const struct fbnic_stat *stat;
+ int i;
+
+ if (!ring) {
+ *data += FBNIC_XDP_STATS_LEN;
+ return;
+ }
+
+ stat = fbnic_gstrings_xdp_stats;
+ for (i = 0; i < FBNIC_XDP_STATS_LEN; i++, stat++, (*data)++) {
+ u8 *p = (u8 *)ring + stat->offset;
+
+ **data = *(u64 *)p;
+ }
+}
+
static void fbnic_get_ethtool_stats(struct net_device *dev,
struct ethtool_stats *stats, u64 *data)
{
@@ -511,13 +556,16 @@ static void fbnic_get_ethtool_stats(struct net_device *dev,
FBNIC_HW_Q_STATS_LEN, &data);
}
spin_unlock(&fbd->hw_stats_lock);
+
+ for (i = 0; i < FBNIC_MAX_XDPQS; i++)
+ fbnic_get_xdp_queue_stats(fbn->tx[i + FBNIC_MAX_TXQS], &data);
}
static int fbnic_get_sset_count(struct net_device *dev, int sset)
{
switch (sset) {
case ETH_SS_STATS:
- return FBNIC_HW_STATS_LEN;
+ return FBNIC_STATS_LEN;
default:
return -EOPNOTSUPP;
}
--
2.47.1
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-23 14:59 ` [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support Mohsin Bashir
@ 2025-07-23 17:35 ` Maciej Fijalkowski
2025-07-24 15:47 ` Mohsin Bashir
2025-07-24 21:14 ` Alexander H Duyck
0 siblings, 2 replies; 20+ messages in thread
From: Maciej Fijalkowski @ 2025-07-23 17:35 UTC (permalink / raw)
To: Mohsin Bashir
Cc: netdev, kuba, alexanderduyck, andrew+netdev, davem, edumazet,
pabeni, horms, vadim.fedorenko, jdamato, sdf, aleksander.lobakin,
ast, daniel, hawk, john.fastabend
On Wed, Jul 23, 2025 at 07:59:22AM -0700, Mohsin Bashir wrote:
> Add basic support for attaching an XDP program to the device and support
> for PASS/DROP/ABORT actions.
> In fbnic, buffers are always mapped as DMA_BIDIRECTIONAL.
>
> Testing:
>
> Hook a simple XDP program that passes all the packets destined for a
> specific port
>
> iperf3 -c 192.168.1.10 -P 5 -p 12345
> Connecting to host 192.168.1.10, port 12345
> [ 5] local 192.168.1.9 port 46702 connected to 192.168.1.10 port 12345
> [ ID] Interval Transfer Bitrate Retr Cwnd
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [SUM] 1.00-2.00 sec 3.86 GBytes 33.2 Gbits/sec 0
>
> XDP_DROP:
> Hook an XDP program that drops packets destined for a specific port
>
> iperf3 -c 192.168.1.10 -P 5 -p 12345
> ^C- - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec 0 sender
> [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec receiver
> iperf3: interrupt - the client has terminated
>
> XDP with HDS:
>
> - Validate XDP attachment failure when HDS is low
> ~] ethtool -G eth0 hds-thresh 512
> ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> ~] Error: fbnic: MTU too high, or HDS threshold is too low for single
> buffer XDP.
>
> - Validate successful XDP attachment when HDS threshold is appropriate
> ~] ethtool -G eth0 hds-thresh 1536
> ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
>
> - Validate when the XDP program is attached, changing HDS thresh to a
> lower value fails
> ~] ethtool -G eth0 hds-thresh 512
> ~] netlink error: fbnic: Use higher HDS threshold or multi-buf capable
> program
>
> - Validate HDS thresh does not matter when xdp frags support is
> available
> ~] ethtool -G eth0 hds-thresh 512
> ~] sudo ip link set eth0 xdpdrv obj xdp_pass_mb_12345.o sec xdp.frags
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
> ---
> .../net/ethernet/meta/fbnic/fbnic_ethtool.c | 11 +++
> .../net/ethernet/meta/fbnic/fbnic_netdev.c | 35 +++++++
> .../net/ethernet/meta/fbnic/fbnic_netdev.h | 5 +
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 95 +++++++++++++++++--
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
> 5 files changed, 140 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> index 84a0db9f1be0..d7b9eb267ead 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> @@ -329,6 +329,17 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
> return -EINVAL;
> }
>
> + /* If an XDP program is attached, we should check for potential frame
> + * splitting. If the new HDS threshold can cause splitting, we should
> + * only allow if the attached XDP program can handle frags.
> + */
> + if (fbnic_check_split_frames(fbn->xdp_prog, netdev->mtu,
> + kernel_ring->hds_thresh)) {
> + NL_SET_ERR_MSG_MOD(extack,
> + "Use higher HDS threshold or multi-buf capable program");
> + return -EINVAL;
> + }
> +
> if (!netif_running(netdev)) {
> fbnic_set_rings(fbn, ring, kernel_ring);
> return 0;
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> index d039e1c7a0d5..0621b89cbf3d 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> @@ -504,6 +504,40 @@ static void fbnic_get_stats64(struct net_device *dev,
> }
> }
>
> +bool fbnic_check_split_frames(struct bpf_prog *prog, unsigned int mtu,
> + u32 hds_thresh)
> +{
> + if (!prog)
> + return false;
> +
> + if (prog->aux->xdp_has_frags)
> + return false;
> +
> + return mtu + ETH_HLEN > hds_thresh;
> +}
> +
> +static int fbnic_bpf(struct net_device *netdev, struct netdev_bpf *bpf)
> +{
> + struct bpf_prog *prog = bpf->prog, *prev_prog;
> + struct fbnic_net *fbn = netdev_priv(netdev);
> +
> + if (bpf->command != XDP_SETUP_PROG)
> + return -EINVAL;
> +
> + if (fbnic_check_split_frames(prog, netdev->mtu,
> + fbn->hds_thresh)) {
> + NL_SET_ERR_MSG_MOD(bpf->extack,
> + "MTU too high, or HDS threshold is too low for single buffer XDP");
> + return -EOPNOTSUPP;
> + }
> +
> + prev_prog = xchg(&fbn->xdp_prog, prog);
> + if (prev_prog)
> + bpf_prog_put(prev_prog);
> +
> + return 0;
> +}
> +
> static const struct net_device_ops fbnic_netdev_ops = {
> .ndo_open = fbnic_open,
> .ndo_stop = fbnic_stop,
> @@ -513,6 +547,7 @@ static const struct net_device_ops fbnic_netdev_ops = {
> .ndo_set_mac_address = fbnic_set_mac,
> .ndo_set_rx_mode = fbnic_set_rx_mode,
> .ndo_get_stats64 = fbnic_get_stats64,
> + .ndo_bpf = fbnic_bpf,
> .ndo_hwtstamp_get = fbnic_hwtstamp_get,
> .ndo_hwtstamp_set = fbnic_hwtstamp_set,
> };
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> index 04c5c7ed6c3a..bfa79ea910d8 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> @@ -18,6 +18,8 @@
> #define FBNIC_TUN_GSO_FEATURES NETIF_F_GSO_IPXIP6
>
> struct fbnic_net {
> + struct bpf_prog *xdp_prog;
> +
> struct fbnic_ring *tx[FBNIC_MAX_TXQS];
> struct fbnic_ring *rx[FBNIC_MAX_RXQS];
>
> @@ -104,4 +106,7 @@ int fbnic_phylink_ethtool_ksettings_get(struct net_device *netdev,
> int fbnic_phylink_get_fecparam(struct net_device *netdev,
> struct ethtool_fecparam *fecparam);
> int fbnic_phylink_init(struct net_device *netdev);
> +
> +bool fbnic_check_split_frames(struct bpf_prog *prog,
> + unsigned int mtu, u32 hds_threshold);
> #endif /* _FBNIC_NETDEV_H_ */
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> index 71af7b9d5bcd..486c14e83ad5 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> @@ -2,17 +2,26 @@
> /* Copyright (c) Meta Platforms, Inc. and affiliates. */
>
> #include <linux/bitfield.h>
> +#include <linux/bpf.h>
> +#include <linux/bpf_trace.h>
> #include <linux/iopoll.h>
> #include <linux/pci.h>
> #include <net/netdev_queues.h>
> #include <net/page_pool/helpers.h>
> #include <net/tcp.h>
> +#include <net/xdp.h>
>
> #include "fbnic.h"
> #include "fbnic_csr.h"
> #include "fbnic_netdev.h"
> #include "fbnic_txrx.h"
>
> +enum {
> + FBNIC_XDP_PASS = 0,
> + FBNIC_XDP_CONSUME,
> + FBNIC_XDP_LEN_ERR,
> +};
> +
> enum {
> FBNIC_XMIT_CB_TS = 0x01,
> };
> @@ -877,7 +886,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
>
> headroom = hdr_pg_off - hdr_pg_start + FBNIC_RX_PAD;
> frame_sz = hdr_pg_end - hdr_pg_start;
> - xdp_init_buff(&pkt->buff, frame_sz, NULL);
> + xdp_init_buff(&pkt->buff, frame_sz, &qt->xdp_rxq);
> hdr_pg_start += (FBNIC_RCD_AL_BUFF_FRAG_MASK & rcd) *
> FBNIC_BD_FRAG_SIZE;
>
> @@ -966,6 +975,38 @@ static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
> return skb;
> }
>
> +static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
> + struct fbnic_pkt_buff *pkt)
> +{
> + struct fbnic_net *fbn = netdev_priv(nv->napi.dev);
> + struct bpf_prog *xdp_prog;
> + int act;
> +
> + xdp_prog = READ_ONCE(fbn->xdp_prog);
> + if (!xdp_prog)
> + goto xdp_pass;
Hi Mohsin,
I thought we were past the times when we read the prog pointer for each
processed packet and agreed on reading the pointer once per napi loop?
> +
> + if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
> + return ERR_PTR(-FBNIC_XDP_LEN_ERR);
when can this happen and couldn't you catch this within ndo_bpf? i suppose
it's related to hds setup.
> +
> + act = bpf_prog_run_xdp(xdp_prog, &pkt->buff);
> + switch (act) {
> + case XDP_PASS:
> +xdp_pass:
> + return fbnic_build_skb(nv, pkt);
> + default:
> + bpf_warn_invalid_xdp_action(nv->napi.dev, xdp_prog, act);
> + fallthrough;
> + case XDP_ABORTED:
> + trace_xdp_exception(nv->napi.dev, xdp_prog, act);
> + fallthrough;
> + case XDP_DROP:
> + break;
> + }
> +
> + return ERR_PTR(-FBNIC_XDP_CONSUME);
> +}
> +
> static enum pkt_hash_types fbnic_skb_hash_type(u64 rcd)
> {
> return (FBNIC_RCD_META_L4_TYPE_MASK & rcd) ? PKT_HASH_TYPE_L4 :
> @@ -1064,7 +1105,7 @@ static int fbnic_clean_rcq(struct fbnic_napi_vector *nv,
> if (unlikely(pkt->add_frag_failed))
> skb = NULL;
> else if (likely(!fbnic_rcd_metadata_err(rcd)))
> - skb = fbnic_build_skb(nv, pkt);
> + skb = fbnic_run_xdp(nv, pkt);
>
> /* Populate skb and invalidate XDP */
> if (!IS_ERR_OR_NULL(skb)) {
> @@ -1250,6 +1291,7 @@ static void fbnic_free_napi_vector(struct fbnic_net *fbn,
> }
>
> for (j = 0; j < nv->rxt_count; j++, i++) {
> + xdp_rxq_info_unreg(&nv->qt[i].xdp_rxq);
> fbnic_remove_rx_ring(fbn, &nv->qt[i].sub0);
> fbnic_remove_rx_ring(fbn, &nv->qt[i].sub1);
> fbnic_remove_rx_ring(fbn, &nv->qt[i].cmpl);
> @@ -1422,6 +1464,11 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
> fbnic_ring_init(&qt->cmpl, db, rxq_idx, FBNIC_RING_F_STATS);
> fbn->rx[rxq_idx] = &qt->cmpl;
>
> + err = xdp_rxq_info_reg(&qt->xdp_rxq, fbn->netdev, rxq_idx,
> + nv->napi.napi_id);
> + if (err)
> + goto free_ring_cur_qt;
> +
> /* Update Rx queue index */
> rxt_count--;
> rxq_idx += v_count;
> @@ -1432,6 +1479,25 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
>
> return 0;
>
> + while (rxt_count < nv->rxt_count) {
> + qt--;
> +
> + xdp_rxq_info_unreg(&qt->xdp_rxq);
> +free_ring_cur_qt:
> + fbnic_remove_rx_ring(fbn, &qt->sub0);
> + fbnic_remove_rx_ring(fbn, &qt->sub1);
> + fbnic_remove_rx_ring(fbn, &qt->cmpl);
> + rxt_count++;
> + }
> + while (txt_count < nv->txt_count) {
> + qt--;
> +
> + fbnic_remove_tx_ring(fbn, &qt->sub0);
> + fbnic_remove_tx_ring(fbn, &qt->cmpl);
> +
> + txt_count++;
> + }
> + fbnic_napi_free_irq(fbd, nv);
> pp_destroy:
> page_pool_destroy(nv->page_pool);
> napi_del:
> @@ -1708,8 +1774,10 @@ static void fbnic_free_nv_resources(struct fbnic_net *fbn,
> for (i = 0; i < nv->txt_count; i++)
> fbnic_free_qt_resources(fbn, &nv->qt[i]);
>
> - for (j = 0; j < nv->rxt_count; j++, i++)
> + for (j = 0; j < nv->rxt_count; j++, i++) {
> fbnic_free_qt_resources(fbn, &nv->qt[i]);
> + xdp_rxq_info_unreg_mem_model(&nv->qt[i].xdp_rxq);
> + }
> }
>
> static int fbnic_alloc_nv_resources(struct fbnic_net *fbn,
> @@ -1721,19 +1789,32 @@ static int fbnic_alloc_nv_resources(struct fbnic_net *fbn,
> for (i = 0; i < nv->txt_count; i++) {
> err = fbnic_alloc_tx_qt_resources(fbn, &nv->qt[i]);
> if (err)
> - goto free_resources;
> + goto free_qt_resources;
> }
>
> /* Allocate Rx Resources */
> for (j = 0; j < nv->rxt_count; j++, i++) {
> + /* Register XDP memory model for completion queue */
> + err = xdp_reg_mem_model(&nv->qt[i].xdp_rxq.mem,
> + MEM_TYPE_PAGE_POOL,
> + nv->page_pool);
> + if (err)
> + goto xdp_unreg_mem_model;
> +
> err = fbnic_alloc_rx_qt_resources(fbn, &nv->qt[i]);
> if (err)
> - goto free_resources;
> + goto xdp_unreg_cur_model;
> }
>
> return 0;
>
> -free_resources:
> +xdp_unreg_mem_model:
> + while (j-- && i--) {
> + fbnic_free_qt_resources(fbn, &nv->qt[i]);
> +xdp_unreg_cur_model:
> + xdp_rxq_info_unreg_mem_model(&nv->qt[i].xdp_rxq);
> + }
> +free_qt_resources:
> while (i--)
> fbnic_free_qt_resources(fbn, &nv->qt[i]);
> return err;
> @@ -2025,7 +2106,7 @@ void fbnic_flush(struct fbnic_net *fbn)
> memset(qt->cmpl.desc, 0, qt->cmpl.size);
>
> fbnic_put_pkt_buff(nv, qt->cmpl.pkt, 0);
> - qt->cmpl.pkt->buff.data_hard_start = NULL;
> + memset(qt->cmpl.pkt, 0, sizeof(struct fbnic_pkt_buff));
> }
> }
> }
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
> index be34962c465e..0fefd1f00196 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
> @@ -129,6 +129,7 @@ struct fbnic_ring {
>
> struct fbnic_q_triad {
> struct fbnic_ring sub0, sub1, cmpl;
> + struct xdp_rxq_info xdp_rxq;
> };
>
> struct fbnic_napi_vector {
> --
> 2.47.1
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 6/9] eth: fbnic: Add support for XDP queues
2025-07-23 14:59 ` [PATCH net-next 6/9] eth: fbnic: Add support for XDP queues Mohsin Bashir
@ 2025-07-23 23:54 ` Jakub Kicinski
0 siblings, 0 replies; 20+ messages in thread
From: Jakub Kicinski @ 2025-07-23 23:54 UTC (permalink / raw)
To: Mohsin Bashir
Cc: netdev, alexanderduyck, andrew+netdev, davem, edumazet, pabeni,
horms, vadim.fedorenko, jdamato, sdf, aleksander.lobakin, ast,
daniel, hawk, john.fastabend
On Wed, 23 Jul 2025 07:59:23 -0700 Mohsin Bashir wrote:
> Add support for allocating XDP_TX queues and configuring ring support.
> FBNIC has been designed with XDP support in mind. Each Tx queue has 2
> submission queues and one completion queue, with the expectation that
> one of the submission queues will be used by the stack, and the other
> by XDP. XDP queues are populated by XDP_TX and start from index 128
> in the TX queue array.
> The support for XDP_TX is added in the next patch.
transient warning, needs to be fixed to avoid breaking bisection:
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:622:23: warning: variable 'total_packets' set but not used [-Wunused-but-set-variable]
622 | u64 total_bytes = 0, total_packets = 0;
| ^
--
pw-bot: cr
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP
2025-07-23 14:59 ` [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP Mohsin Bashir
@ 2025-07-24 10:18 ` Simon Horman
2025-07-24 15:48 ` Mohsin Bashir
0 siblings, 1 reply; 20+ messages in thread
From: Simon Horman @ 2025-07-24 10:18 UTC (permalink / raw)
To: Mohsin Bashir
Cc: netdev, kuba, alexanderduyck, andrew+netdev, davem, edumazet,
pabeni, vadim.fedorenko, jdamato, sdf, aleksander.lobakin, ast,
daniel, hawk, john.fastabend
On Wed, Jul 23, 2025 at 07:59:25AM -0700, Mohsin Bashir wrote:
> Add support for XDP statistics collection and reporting via rtnl_link
> and netdev_queue API.
>
> For XDP programs without frags support, fbnic requires MTU to be less
> than the HDS threshold. If an over-sized frame is received, the frame
> is dropped and recorded as rx_length_errors reported via ip stats to
> highlight that this is an error.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
> ---
> .../device_drivers/ethernet/meta/fbnic.rst | 10 +++++
> .../net/ethernet/meta/fbnic/fbnic_netdev.c | 30 +++++++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 44 +++++++++++++++++--
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
> 4 files changed, 82 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
> index afb8353daefd..ad5e2cba7afc 100644
> --- a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
> +++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
> @@ -160,3 +160,13 @@ behavior and potential performance bottlenecks.
> credit exhaustion
> - ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
> credit exhaustion
> +
> +XDP Length Error:
> +~~~~~~~~~~~~~~~~~
> +
> +For XDP programs without frags support, fbnic tries to make sure that MTU fits
> +into a single buffer. If an oversized frame is received and gets fragmented,
> +it is dropped and the following netlink counters are updated
> + - ``rx-length``: number of frames dropped due to lack of fragmentation
> + support in the attached XDP program
> + - ``rx-errors``: total number of packets with errors received on the interface
Hi Mohsin,
make htmldocs complains a bit about this:
.../fbnic.rst:170: ERROR: Unexpected indentation. [docutils]
.../fbnic.rst:171: WARNING: Bullet list ends without a blank line; unexpected unindent. [docutils]
Empirically, and I admit there was much trial and error involved,
I was able to address this by:
* adding a blank before the start of the list
* updating the indentation of the follow-on line of the first entry of the list
Your mileage may vary.
diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
index ad5e2cba7afc..fb6559fa4be4 100644
--- a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
+++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst
@@ -167,6 +167,7 @@ XDP Length Error:
For XDP programs without frags support, fbnic tries to make sure that MTU fits
into a single buffer. If an oversized frame is received and gets fragmented,
it is dropped and the following netlink counters are updated
+
- ``rx-length``: number of frames dropped due to lack of fragmentation
- support in the attached XDP program
+ support in the attached XDP program
- ``rx-errors``: total number of packets with errors received on the interface
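With those two tweaks applied, the section should render as a proper bullet list, roughly as follows (indentation normalized here; the exact leading whitespace in the file may differ):

```rst
For XDP programs without frags support, fbnic tries to make sure that MTU fits
into a single buffer. If an oversized frame is received and gets fragmented,
it is dropped and the following netlink counters are updated

- ``rx-length``: number of frames dropped due to lack of fragmentation
  support in the attached XDP program
- ``rx-errors``: total number of packets with errors received on the interface
```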
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-23 17:35 ` Maciej Fijalkowski
@ 2025-07-24 15:47 ` Mohsin Bashir
2025-07-24 16:51 ` Jakub Kicinski
2025-07-24 21:14 ` Alexander H Duyck
1 sibling, 1 reply; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-24 15:47 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: netdev, kuba, alexanderduyck, andrew+netdev, davem, edumazet,
pabeni, horms, vadim.fedorenko, jdamato, sdf, aleksander.lobakin,
ast, daniel, hawk, john.fastabend
goto xdp_pass;
>
> Hi Mohsin,
>
> I thought we were past the times when we read the prog pointer for each
> processed packet and agreed on reading the pointer once per napi loop?
>
>> +
>> + if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
>> + return ERR_PTR(-FBNIC_XDP_LEN_ERR);
>
> when can this happen and couldn't you catch this within ndo_bpf? i suppose
> it's related to hds setup.
>
Hi Maciej,
It is important to avoid passing a packet with frags to a single-buffer
XDP program, since such a program would fail to access the packet
linearly. For example, we can send a jumbo UDP packet to the SUT with a
single-buffer XDP program attached and, in the XDP program, attempt to
access the payload linearly.
I believe handling this case within ndo_bpf may not be possible.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP
2025-07-24 10:18 ` Simon Horman
@ 2025-07-24 15:48 ` Mohsin Bashir
0 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-07-24 15:48 UTC (permalink / raw)
To: Simon Horman
Cc: netdev, kuba, alexanderduyck, andrew+netdev, davem, edumazet,
pabeni, vadim.fedorenko, jdamato, sdf, aleksander.lobakin, ast,
daniel, hawk, john.fastabend
> Hi Mohsin,
>
> make htmldocs complains a bit about this:
>
> .../fbnic.rst:170: ERROR: Unexpected indentation. [docutils]
> .../fbnic.rst:171: WARNING: Bullet list ends without a blank line; unexpected unindent. [docutils]
>
> Empirically, and I admit there was much trial and error involved,
> I was able to address this by:
> * adding a blank before the start of the list
> * updating the indentation of the follow-on line of the first entry of the list
>
> Your mileage may vary.
>
Hi Simon,
Thanks for the pointers. Let me handle this.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-24 15:47 ` Mohsin Bashir
@ 2025-07-24 16:51 ` Jakub Kicinski
0 siblings, 0 replies; 20+ messages in thread
From: Jakub Kicinski @ 2025-07-24 16:51 UTC (permalink / raw)
To: Mohsin Bashir
Cc: Maciej Fijalkowski, netdev, alexanderduyck, andrew+netdev, davem,
edumazet, pabeni, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
On Thu, 24 Jul 2025 08:47:20 -0700 Mohsin Bashir wrote:
> >> + if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
> >> + return ERR_PTR(-FBNIC_XDP_LEN_ERR);
> >
> > when can this happen and couldn't you catch this within ndo_bpf? i suppose
> > it's related to hds setup.
>
> It is important to avoid passing a packet with frags to a single-buff
> XDP program. The implication being that a single-buff XDP program would
> fail to access packet linearly. For example, we can send a jumbo UDP
> packet to the SUT with a single-buffer XDP program attached and in the
> XDP program, attempt to access payload linearly.
>
> I believe handling this case within ndo_bpf may not be possible.
Herm, we are handling it in ndo_bpf..
This check is just a safety in case somehow we get a packet larger
than MTU, and therefore crossing the "safe" HDS threshold.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-23 17:35 ` Maciej Fijalkowski
2025-07-24 15:47 ` Mohsin Bashir
@ 2025-07-24 21:14 ` Alexander H Duyck
2025-07-25 9:56 ` Maciej Fijalkowski
1 sibling, 1 reply; 20+ messages in thread
From: Alexander H Duyck @ 2025-07-24 21:14 UTC (permalink / raw)
To: Maciej Fijalkowski, Mohsin Bashir
Cc: netdev, kuba, alexanderduyck, andrew+netdev, davem, edumazet,
pabeni, horms, vadim.fedorenko, jdamato, sdf, aleksander.lobakin,
ast, daniel, hawk, john.fastabend
On Wed, 2025-07-23 at 19:35 +0200, Maciej Fijalkowski wrote:
> On Wed, Jul 23, 2025 at 07:59:22AM -0700, Mohsin Bashir wrote:
> > Add basic support for attaching an XDP program to the device and support
> > for PASS/DROP/ABORT actions.
> > In fbnic, buffers are always mapped as DMA_BIDIRECTIONAL.
> >
> > Testing:
> >
> > Hook a simple XDP program that passes all the packets destined for a
> > specific port
> >
> > iperf3 -c 192.168.1.10 -P 5 -p 12345
> > Connecting to host 192.168.1.10, port 12345
> > [ 5] local 192.168.1.9 port 46702 connected to 192.168.1.10 port 12345
> > [ ID] Interval Transfer Bitrate Retr Cwnd
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [SUM] 1.00-2.00 sec 3.86 GBytes 33.2 Gbits/sec 0
> >
> > XDP_DROP:
> > Hook an XDP program that drops packets destined for a specific port
> >
> > iperf3 -c 192.168.1.10 -P 5 -p 12345
> > ^C- - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval Transfer Bitrate Retr
> > [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec 0 sender
> > [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec receiver
> > iperf3: interrupt - the client has terminated
> >
> > XDP with HDS:
> >
> > - Validate XDP attachment failure when HDS is low
> > ~] ethtool -G eth0 hds-thresh 512
> > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> > ~] Error: fbnic: MTU too high, or HDS threshold is too low for single
> > buffer XDP.
> >
> > - Validate successful XDP attachment when HDS threshold is appropriate
> > ~] ethtool -G eth0 hds-thresh 1536
> > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> >
> > - Validate when the XDP program is attached, changing HDS thresh to a
> > lower value fails
> > ~] ethtool -G eth0 hds-thresh 512
> > ~] netlink error: fbnic: Use higher HDS threshold or multi-buf capable
> > program
> >
> > - Validate HDS thresh does not matter when xdp frags support is
> > available
> > ~] ethtool -G eth0 hds-thresh 512
> > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_mb_12345.o sec xdp.frags
> >
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
> > ---
> > .../net/ethernet/meta/fbnic/fbnic_ethtool.c | 11 +++
> > .../net/ethernet/meta/fbnic/fbnic_netdev.c | 35 +++++++
> > .../net/ethernet/meta/fbnic/fbnic_netdev.h | 5 +
> > drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 95 +++++++++++++++++--
> > drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
> > 5 files changed, 140 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > index 84a0db9f1be0..d7b9eb267ead 100644
> > --- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > @@ -329,6 +329,17 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
> > return -EINVAL;
> > }
> >
> > + /* If an XDP program is attached, we should check for potential frame
> > + * splitting. If the new HDS threshold can cause splitting, we should
> > + * only allow if the attached XDP program can handle frags.
> > + */
> > + if (fbnic_check_split_frames(fbn->xdp_prog, netdev->mtu,
> > + kernel_ring->hds_thresh)) {
> > + NL_SET_ERR_MSG_MOD(extack,
> > + "Use higher HDS threshold or multi-buf capable program");
> > + return -EINVAL;
> > + }
> > +
> > if (!netif_running(netdev)) {
> > fbnic_set_rings(fbn, ring, kernel_ring);
> > return 0;
> > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > index d039e1c7a0d5..0621b89cbf3d 100644
> > --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > @@ -504,6 +504,40 @@ static void fbnic_get_stats64(struct net_device *dev,
> > }
> > }
> >
> > +bool fbnic_check_split_frames(struct bpf_prog *prog, unsigned int mtu,
> > + u32 hds_thresh)
> > +{
> > + if (!prog)
> > + return false;
> > +
> > + if (prog->aux->xdp_has_frags)
> > + return false;
> > +
> > + return mtu + ETH_HLEN > hds_thresh;
> > +}
> > +
> > +static int fbnic_bpf(struct net_device *netdev, struct netdev_bpf *bpf)
> > +{
> > + struct bpf_prog *prog = bpf->prog, *prev_prog;
> > + struct fbnic_net *fbn = netdev_priv(netdev);
> > +
> > + if (bpf->command != XDP_SETUP_PROG)
> > + return -EINVAL;
> > +
> > + if (fbnic_check_split_frames(prog, netdev->mtu,
> > + fbn->hds_thresh)) {
> > + NL_SET_ERR_MSG_MOD(bpf->extack,
> > + "MTU too high, or HDS threshold is too low for single buffer XDP");
> > + return -EOPNOTSUPP;
> > + }
> > +
> > + prev_prog = xchg(&fbn->xdp_prog, prog);
> > + if (prev_prog)
> > + bpf_prog_put(prev_prog);
> > +
> > + return 0;
> > +}
> > +
> > static const struct net_device_ops fbnic_netdev_ops = {
> > .ndo_open = fbnic_open,
> > .ndo_stop = fbnic_stop,
> > @@ -513,6 +547,7 @@ static const struct net_device_ops fbnic_netdev_ops = {
> > .ndo_set_mac_address = fbnic_set_mac,
> > .ndo_set_rx_mode = fbnic_set_rx_mode,
> > .ndo_get_stats64 = fbnic_get_stats64,
> > + .ndo_bpf = fbnic_bpf,
> > .ndo_hwtstamp_get = fbnic_hwtstamp_get,
> > .ndo_hwtstamp_set = fbnic_hwtstamp_set,
> > };
> > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > index 04c5c7ed6c3a..bfa79ea910d8 100644
> > --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > @@ -18,6 +18,8 @@
> > #define FBNIC_TUN_GSO_FEATURES NETIF_F_GSO_IPXIP6
> >
> > struct fbnic_net {
> > + struct bpf_prog *xdp_prog;
> > +
> > struct fbnic_ring *tx[FBNIC_MAX_TXQS];
> > struct fbnic_ring *rx[FBNIC_MAX_RXQS];
> >
> > @@ -104,4 +106,7 @@ int fbnic_phylink_ethtool_ksettings_get(struct net_device *netdev,
> > int fbnic_phylink_get_fecparam(struct net_device *netdev,
> > struct ethtool_fecparam *fecparam);
> > int fbnic_phylink_init(struct net_device *netdev);
> > +
> > +bool fbnic_check_split_frames(struct bpf_prog *prog,
> > + unsigned int mtu, u32 hds_threshold);
> > #endif /* _FBNIC_NETDEV_H_ */
> > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > index 71af7b9d5bcd..486c14e83ad5 100644
> > --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > @@ -2,17 +2,26 @@
> > /* Copyright (c) Meta Platforms, Inc. and affiliates. */
> >
> > #include <linux/bitfield.h>
> > +#include <linux/bpf.h>
> > +#include <linux/bpf_trace.h>
> > #include <linux/iopoll.h>
> > #include <linux/pci.h>
> > #include <net/netdev_queues.h>
> > #include <net/page_pool/helpers.h>
> > #include <net/tcp.h>
> > +#include <net/xdp.h>
> >
> > #include "fbnic.h"
> > #include "fbnic_csr.h"
> > #include "fbnic_netdev.h"
> > #include "fbnic_txrx.h"
> >
> > +enum {
> > + FBNIC_XDP_PASS = 0,
> > + FBNIC_XDP_CONSUME,
> > + FBNIC_XDP_LEN_ERR,
> > +};
> > +
> > enum {
> > FBNIC_XMIT_CB_TS = 0x01,
> > };
> > @@ -877,7 +886,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
> >
> > headroom = hdr_pg_off - hdr_pg_start + FBNIC_RX_PAD;
> > frame_sz = hdr_pg_end - hdr_pg_start;
> > - xdp_init_buff(&pkt->buff, frame_sz, NULL);
> > + xdp_init_buff(&pkt->buff, frame_sz, &qt->xdp_rxq);
> > hdr_pg_start += (FBNIC_RCD_AL_BUFF_FRAG_MASK & rcd) *
> > FBNIC_BD_FRAG_SIZE;
> >
> > @@ -966,6 +975,38 @@ static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
> > return skb;
> > }
> >
> > +static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
> > + struct fbnic_pkt_buff *pkt)
> > +{
> > + struct fbnic_net *fbn = netdev_priv(nv->napi.dev);
> > + struct bpf_prog *xdp_prog;
> > + int act;
> > +
> > + xdp_prog = READ_ONCE(fbn->xdp_prog);
> > + if (!xdp_prog)
> > + goto xdp_pass;
>
> Hi Mohsin,
>
> I thought we were past the times when we read the prog pointer for each
> processed packet and agreed on reading the pointer once per napi loop?
This is reading the cached pointer from the netdev. Are you saying you
would rather have this as a stack pointer instead? I don't really see
the advantage of reading it once per NAPI poll session versus reading
it once per packet.
>
> > +
> > + if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
> > + return ERR_PTR(-FBNIC_XDP_LEN_ERR);
>
> when can this happen and couldn't you catch this within ndo_bpf? i suppose
> it's related to hds setup.
I was looking over the code and really the MTU is just a suggestion for
what size packets we can expect to receive. The MTU doesn't guarantee
the receive size, it is just the maximum transmission unit and
represents the minimum frame size we should support.
Much like what I did on the Intel NICs back in the day we can receive
up to the maximum frame size in almost all cases regardless of MTU
setting. Otherwise we would have to shut down the NIC and change the
buffer allocations much like we used to do on the old drivers every
time you changed the MTU.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-24 21:14 ` Alexander H Duyck
@ 2025-07-25 9:56 ` Maciej Fijalkowski
2025-07-25 15:10 ` Alexander Duyck
0 siblings, 1 reply; 20+ messages in thread
From: Maciej Fijalkowski @ 2025-07-25 9:56 UTC (permalink / raw)
To: Alexander H Duyck
Cc: Mohsin Bashir, netdev, kuba, alexanderduyck, andrew+netdev, davem,
edumazet, pabeni, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
On Thu, Jul 24, 2025 at 02:14:11PM -0700, Alexander H Duyck wrote:
> On Wed, 2025-07-23 at 19:35 +0200, Maciej Fijalkowski wrote:
> > On Wed, Jul 23, 2025 at 07:59:22AM -0700, Mohsin Bashir wrote:
> > > Add basic support for attaching an XDP program to the device and support
> > > for PASS/DROP/ABORT actions.
> > > In fbnic, buffers are always mapped as DMA_BIDIRECTIONAL.
> > >
> > > Testing:
> > >
> > > Hook a simple XDP program that passes all the packets destined for a
> > > specific port
> > >
> > > iperf3 -c 192.168.1.10 -P 5 -p 12345
> > > Connecting to host 192.168.1.10, port 12345
> > > [ 5] local 192.168.1.9 port 46702 connected to 192.168.1.10 port 12345
> > > [ ID] Interval Transfer Bitrate Retr Cwnd
> > > - - - - - - - - - - - - - - - - - - - - - - - - -
> > > [SUM] 1.00-2.00 sec 3.86 GBytes 33.2 Gbits/sec 0
> > >
> > > XDP_DROP:
> > > Hook an XDP program that drops packets destined for a specific port
> > >
> > > iperf3 -c 192.168.1.10 -P 5 -p 12345
> > > ^C- - - - - - - - - - - - - - - - - - - - - - - - -
> > > [ ID] Interval Transfer Bitrate Retr
> > > [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec 0 sender
> > > [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec receiver
> > > iperf3: interrupt - the client has terminated
> > >
> > > XDP with HDS:
> > >
> > > - Validate XDP attachment failure when HDS is low
> > > ~] ethtool -G eth0 hds-thresh 512
> > > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> > > ~] Error: fbnic: MTU too high, or HDS threshold is too low for single
> > > buffer XDP.
> > >
> > > - Validate successful XDP attachment when HDS threshold is appropriate
> > > ~] ethtool -G eth0 hds-thresh 1536
> > > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> > >
> > > - Validate when the XDP program is attached, changing HDS thresh to a
> > > lower value fails
> > > ~] ethtool -G eth0 hds-thresh 512
> > > ~] netlink error: fbnic: Use higher HDS threshold or multi-buf capable
> > > program
> > >
> > > - Validate HDS thresh does not matter when xdp frags support is
> > > available
> > > ~] ethtool -G eth0 hds-thresh 512
> > > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_mb_12345.o sec xdp.frags
> > >
> > > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > > Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
> > > ---
> > > .../net/ethernet/meta/fbnic/fbnic_ethtool.c | 11 +++
> > > .../net/ethernet/meta/fbnic/fbnic_netdev.c | 35 +++++++
> > > .../net/ethernet/meta/fbnic/fbnic_netdev.h | 5 +
> > > drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 95 +++++++++++++++++--
> > > drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
> > > 5 files changed, 140 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > > index 84a0db9f1be0..d7b9eb267ead 100644
> > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > > @@ -329,6 +329,17 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
> > > return -EINVAL;
> > > }
> > >
> > > + /* If an XDP program is attached, we should check for potential frame
> > > + * splitting. If the new HDS threshold can cause splitting, we should
> > > + * only allow if the attached XDP program can handle frags.
> > > + */
> > > + if (fbnic_check_split_frames(fbn->xdp_prog, netdev->mtu,
> > > + kernel_ring->hds_thresh)) {
> > > + NL_SET_ERR_MSG_MOD(extack,
> > > + "Use higher HDS threshold or multi-buf capable program");
> > > + return -EINVAL;
> > > + }
> > > +
> > > if (!netif_running(netdev)) {
> > > fbnic_set_rings(fbn, ring, kernel_ring);
> > > return 0;
> > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > > index d039e1c7a0d5..0621b89cbf3d 100644
> > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > > @@ -504,6 +504,40 @@ static void fbnic_get_stats64(struct net_device *dev,
> > > }
> > > }
> > >
> > > +bool fbnic_check_split_frames(struct bpf_prog *prog, unsigned int mtu,
> > > + u32 hds_thresh)
> > > +{
> > > + if (!prog)
> > > + return false;
> > > +
> > > + if (prog->aux->xdp_has_frags)
> > > + return false;
> > > +
> > > + return mtu + ETH_HLEN > hds_thresh;
> > > +}
> > > +
> > > +static int fbnic_bpf(struct net_device *netdev, struct netdev_bpf *bpf)
> > > +{
> > > + struct bpf_prog *prog = bpf->prog, *prev_prog;
> > > + struct fbnic_net *fbn = netdev_priv(netdev);
> > > +
> > > + if (bpf->command != XDP_SETUP_PROG)
> > > + return -EINVAL;
> > > +
> > > + if (fbnic_check_split_frames(prog, netdev->mtu,
> > > + fbn->hds_thresh)) {
> > > + NL_SET_ERR_MSG_MOD(bpf->extack,
> > > + "MTU too high, or HDS threshold is too low for single buffer XDP");
> > > + return -EOPNOTSUPP;
> > > + }
> > > +
> > > + prev_prog = xchg(&fbn->xdp_prog, prog);
> > > + if (prev_prog)
> > > + bpf_prog_put(prev_prog);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static const struct net_device_ops fbnic_netdev_ops = {
> > > .ndo_open = fbnic_open,
> > > .ndo_stop = fbnic_stop,
> > > @@ -513,6 +547,7 @@ static const struct net_device_ops fbnic_netdev_ops = {
> > > .ndo_set_mac_address = fbnic_set_mac,
> > > .ndo_set_rx_mode = fbnic_set_rx_mode,
> > > .ndo_get_stats64 = fbnic_get_stats64,
> > > + .ndo_bpf = fbnic_bpf,
> > > .ndo_hwtstamp_get = fbnic_hwtstamp_get,
> > > .ndo_hwtstamp_set = fbnic_hwtstamp_set,
> > > };
> > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > > index 04c5c7ed6c3a..bfa79ea910d8 100644
> > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > > @@ -18,6 +18,8 @@
> > > #define FBNIC_TUN_GSO_FEATURES NETIF_F_GSO_IPXIP6
> > >
> > > struct fbnic_net {
> > > + struct bpf_prog *xdp_prog;
> > > +
> > > struct fbnic_ring *tx[FBNIC_MAX_TXQS];
> > > struct fbnic_ring *rx[FBNIC_MAX_RXQS];
> > >
> > > @@ -104,4 +106,7 @@ int fbnic_phylink_ethtool_ksettings_get(struct net_device *netdev,
> > > int fbnic_phylink_get_fecparam(struct net_device *netdev,
> > > struct ethtool_fecparam *fecparam);
> > > int fbnic_phylink_init(struct net_device *netdev);
> > > +
> > > +bool fbnic_check_split_frames(struct bpf_prog *prog,
> > > + unsigned int mtu, u32 hds_threshold);
> > > #endif /* _FBNIC_NETDEV_H_ */
> > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > > index 71af7b9d5bcd..486c14e83ad5 100644
> > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > > @@ -2,17 +2,26 @@
> > > /* Copyright (c) Meta Platforms, Inc. and affiliates. */
> > >
> > > #include <linux/bitfield.h>
> > > +#include <linux/bpf.h>
> > > +#include <linux/bpf_trace.h>
> > > #include <linux/iopoll.h>
> > > #include <linux/pci.h>
> > > #include <net/netdev_queues.h>
> > > #include <net/page_pool/helpers.h>
> > > #include <net/tcp.h>
> > > +#include <net/xdp.h>
> > >
> > > #include "fbnic.h"
> > > #include "fbnic_csr.h"
> > > #include "fbnic_netdev.h"
> > > #include "fbnic_txrx.h"
> > >
> > > +enum {
> > > + FBNIC_XDP_PASS = 0,
> > > + FBNIC_XDP_CONSUME,
> > > + FBNIC_XDP_LEN_ERR,
> > > +};
> > > +
> > > enum {
> > > FBNIC_XMIT_CB_TS = 0x01,
> > > };
> > > @@ -877,7 +886,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
> > >
> > > headroom = hdr_pg_off - hdr_pg_start + FBNIC_RX_PAD;
> > > frame_sz = hdr_pg_end - hdr_pg_start;
> > > - xdp_init_buff(&pkt->buff, frame_sz, NULL);
> > > + xdp_init_buff(&pkt->buff, frame_sz, &qt->xdp_rxq);
> > > hdr_pg_start += (FBNIC_RCD_AL_BUFF_FRAG_MASK & rcd) *
> > > FBNIC_BD_FRAG_SIZE;
> > >
> > > @@ -966,6 +975,38 @@ static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
> > > return skb;
> > > }
> > >
> > > +static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
> > > + struct fbnic_pkt_buff *pkt)
> > > +{
> > > + struct fbnic_net *fbn = netdev_priv(nv->napi.dev);
> > > + struct bpf_prog *xdp_prog;
> > > + int act;
> > > +
> > > + xdp_prog = READ_ONCE(fbn->xdp_prog);
> > > + if (!xdp_prog)
> > > + goto xdp_pass;
> >
> > Hi Mohsin,
> >
> > I thought we were past the times when we read the prog pointer for each
> > processed packet, and agreed on reading the pointer once per NAPI loop?
>
> This is reading the cached pointer from the netdev. Are you saying you
> would rather have this as a stack pointer instead? I don't really see
> the advantage to making this a once per napi poll session versus just
> reading it once per packet.
Hi Alex,
This is the only reason (at least currently in this patch) to load the
cacheline from the netdev struct, whereas I was just suggesting to piggyback
on the fact that the BPF prog pointer will not change within a single NAPI
loop. It's up to you of course, and it should be considered a
micro-optimization.
>
> >
> > > +
> > > + if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
> > > + return ERR_PTR(-FBNIC_XDP_LEN_ERR);
> >
> > when can this happen and couldn't you catch this within ndo_bpf? i suppose
> > it's related to hds setup.
>
> I was looking over the code and really the MTU is just a suggestion for
> what size packets we can expect to receive. The MTU doesn't guarantee
> > the receive size; it is just the maximum transmission unit and
> represents the minimum frame size we should support.
>
> Much like what I did on the Intel NICs back in the day we can receive
> up to the maximum frame size in almost all cases regardless of MTU
The MTU is usually an indicator of the actual max frame size you are
configuring on the Rx side, AFAICT. I asked about xdp_has_frags being looked
up in the hot path as it's not what I usually see.
> setting. Otherwise we would have to shut down the NIC and change the
> buffer allocations much like we used to do on the old drivers every
> time you changed the MTU.
>
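Maciej's suggestion above — reading the prog pointer once per NAPI poll
rather than once per packet — can be sketched in plain C11 as a user-space
analog. All names here (the fake ring, `xdp_prog`, the poll functions) are
illustrative stand-ins for the driver structures, not the actual fbnic code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct pkt { int len; };

/* Stand-in for fbn->xdp_prog: may be swapped concurrently elsewhere. */
static _Atomic(void *) xdp_prog;

/* Per-packet read: one load of the prog pointer for every packet. */
static int poll_per_packet(struct pkt *pkts, int budget)
{
	int done = 0;

	(void)pkts;
	for (int i = 0; i < budget; i++) {
		void *prog = atomic_load_explicit(&xdp_prog,
						  memory_order_relaxed);
		if (prog)	/* would run the XDP program here */
			done++;
	}
	return done;
}

/* Suggested variant: hoist the load, one read per NAPI poll. The
 * pointer is stable for the duration of the batch either way, so
 * both variants process the batch against a consistent program.
 */
static int poll_cached(struct pkt *pkts, int budget)
{
	void *prog = atomic_load_explicit(&xdp_prog, memory_order_relaxed);
	int done = 0;

	(void)pkts;
	for (int i = 0; i < budget; i++)
		if (prog)
			done++;
	return done;
}
```

Both variants agree on the outcome for a given batch; the difference, as
noted in the thread, is purely a micro-optimization on the hot path.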
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-25 9:56 ` Maciej Fijalkowski
@ 2025-07-25 15:10 ` Alexander Duyck
2025-08-07 21:24 ` Mohsin Bashir
0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2025-07-25 15:10 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: Mohsin Bashir, netdev, kuba, alexanderduyck, andrew+netdev, davem,
edumazet, pabeni, horms, vadim.fedorenko, jdamato, sdf,
aleksander.lobakin, ast, daniel, hawk, john.fastabend
On Fri, Jul 25, 2025 at 2:57 AM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Thu, Jul 24, 2025 at 02:14:11PM -0700, Alexander H Duyck wrote:
> > On Wed, 2025-07-23 at 19:35 +0200, Maciej Fijalkowski wrote:
> > > On Wed, Jul 23, 2025 at 07:59:22AM -0700, Mohsin Bashir wrote:
> > > > Add basic support for attaching an XDP program to the device and support
> > > > for PASS/DROP/ABORT actions.
> > > > In fbnic, buffers are always mapped as DMA_BIDIRECTIONAL.
> > > >
> > > > Testing:
> > > >
> > > > Hook a simple XDP program that passes all the packets destined for a
> > > > specific port
> > > >
> > > > iperf3 -c 192.168.1.10 -P 5 -p 12345
> > > > Connecting to host 192.168.1.10, port 12345
> > > > [ 5] local 192.168.1.9 port 46702 connected to 192.168.1.10 port 12345
> > > > [ ID] Interval Transfer Bitrate Retr Cwnd
> > > > - - - - - - - - - - - - - - - - - - - - - - - - -
> > > > [SUM] 1.00-2.00 sec 3.86 GBytes 33.2 Gbits/sec 0
> > > >
> > > > XDP_DROP:
> > > > Hook an XDP program that drops packets destined for a specific port
> > > >
> > > > iperf3 -c 192.168.1.10 -P 5 -p 12345
> > > > ^C- - - - - - - - - - - - - - - - - - - - - - - - -
> > > > [ ID] Interval Transfer Bitrate Retr
> > > > [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec 0 sender
> > > > [SUM] 0.00-0.00 sec 0.00 Bytes 0.00 bits/sec receiver
> > > > iperf3: interrupt - the client has terminated
> > > >
> > > > XDP with HDS:
> > > >
> > > > - Validate XDP attachment failure when HDS is low
> > > > ~] ethtool -G eth0 hds-thresh 512
> > > > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> > > > ~] Error: fbnic: MTU too high, or HDS threshold is too low for single
> > > > buffer XDP.
> > > >
> > > > - Validate successful XDP attachment when HDS threshold is appropriate
> > > > ~] ethtool -G eth0 hds-thresh 1536
> > > > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_12345.o sec xdp
> > > >
> > > > - Validate when the XDP program is attached, changing HDS thresh to a
> > > > lower value fails
> > > > ~] ethtool -G eth0 hds-thresh 512
> > > > ~] netlink error: fbnic: Use higher HDS threshold or multi-buf capable
> > > > program
> > > >
> > > > - Validate HDS thresh does not matter when xdp frags support is
> > > > available
> > > > ~] ethtool -G eth0 hds-thresh 512
> > > > ~] sudo ip link set eth0 xdpdrv obj xdp_pass_mb_12345.o sec xdp.frags
> > > >
> > > > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > > > Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
> > > > ---
> > > > .../net/ethernet/meta/fbnic/fbnic_ethtool.c | 11 +++
> > > > .../net/ethernet/meta/fbnic/fbnic_netdev.c | 35 +++++++
> > > > .../net/ethernet/meta/fbnic/fbnic_netdev.h | 5 +
> > > > drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 95 +++++++++++++++++--
> > > > drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 1 +
> > > > 5 files changed, 140 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > > > index 84a0db9f1be0..d7b9eb267ead 100644
> > > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_ethtool.c
> > > > @@ -329,6 +329,17 @@ fbnic_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
> > > > return -EINVAL;
> > > > }
> > > >
> > > > + /* If an XDP program is attached, we should check for potential frame
> > > > + * splitting. If the new HDS threshold can cause splitting, we should
> > > > + * only allow if the attached XDP program can handle frags.
> > > > + */
> > > > + if (fbnic_check_split_frames(fbn->xdp_prog, netdev->mtu,
> > > > + kernel_ring->hds_thresh)) {
> > > > + NL_SET_ERR_MSG_MOD(extack,
> > > > + "Use higher HDS threshold or multi-buf capable program");
> > > > + return -EINVAL;
> > > > + }
> > > > +
> > > > if (!netif_running(netdev)) {
> > > > fbnic_set_rings(fbn, ring, kernel_ring);
> > > > return 0;
> > > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > > > index d039e1c7a0d5..0621b89cbf3d 100644
> > > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> > > > @@ -504,6 +504,40 @@ static void fbnic_get_stats64(struct net_device *dev,
> > > > }
> > > > }
> > > >
> > > > +bool fbnic_check_split_frames(struct bpf_prog *prog, unsigned int mtu,
> > > > + u32 hds_thresh)
> > > > +{
> > > > + if (!prog)
> > > > + return false;
> > > > +
> > > > + if (prog->aux->xdp_has_frags)
> > > > + return false;
> > > > +
> > > > + return mtu + ETH_HLEN > hds_thresh;
> > > > +}
> > > > +
> > > > +static int fbnic_bpf(struct net_device *netdev, struct netdev_bpf *bpf)
> > > > +{
> > > > + struct bpf_prog *prog = bpf->prog, *prev_prog;
> > > > + struct fbnic_net *fbn = netdev_priv(netdev);
> > > > +
> > > > + if (bpf->command != XDP_SETUP_PROG)
> > > > + return -EINVAL;
> > > > +
> > > > + if (fbnic_check_split_frames(prog, netdev->mtu,
> > > > + fbn->hds_thresh)) {
> > > > + NL_SET_ERR_MSG_MOD(bpf->extack,
> > > > + "MTU too high, or HDS threshold is too low for single buffer XDP");
> > > > + return -EOPNOTSUPP;
> > > > + }
> > > > +
> > > > + prev_prog = xchg(&fbn->xdp_prog, prog);
> > > > + if (prev_prog)
> > > > + bpf_prog_put(prev_prog);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > static const struct net_device_ops fbnic_netdev_ops = {
> > > > .ndo_open = fbnic_open,
> > > > .ndo_stop = fbnic_stop,
> > > > @@ -513,6 +547,7 @@ static const struct net_device_ops fbnic_netdev_ops = {
> > > > .ndo_set_mac_address = fbnic_set_mac,
> > > > .ndo_set_rx_mode = fbnic_set_rx_mode,
> > > > .ndo_get_stats64 = fbnic_get_stats64,
> > > > + .ndo_bpf = fbnic_bpf,
> > > > .ndo_hwtstamp_get = fbnic_hwtstamp_get,
> > > > .ndo_hwtstamp_set = fbnic_hwtstamp_set,
> > > > };
> > > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > > > index 04c5c7ed6c3a..bfa79ea910d8 100644
> > > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> > > > @@ -18,6 +18,8 @@
> > > > #define FBNIC_TUN_GSO_FEATURES NETIF_F_GSO_IPXIP6
> > > >
> > > > struct fbnic_net {
> > > > + struct bpf_prog *xdp_prog;
> > > > +
> > > > struct fbnic_ring *tx[FBNIC_MAX_TXQS];
> > > > struct fbnic_ring *rx[FBNIC_MAX_RXQS];
> > > >
> > > > @@ -104,4 +106,7 @@ int fbnic_phylink_ethtool_ksettings_get(struct net_device *netdev,
> > > > int fbnic_phylink_get_fecparam(struct net_device *netdev,
> > > > struct ethtool_fecparam *fecparam);
> > > > int fbnic_phylink_init(struct net_device *netdev);
> > > > +
> > > > +bool fbnic_check_split_frames(struct bpf_prog *prog,
> > > > + unsigned int mtu, u32 hds_threshold);
> > > > #endif /* _FBNIC_NETDEV_H_ */
> > > > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > > > index 71af7b9d5bcd..486c14e83ad5 100644
> > > > --- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > > > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> > > > @@ -2,17 +2,26 @@
> > > > /* Copyright (c) Meta Platforms, Inc. and affiliates. */
> > > >
> > > > #include <linux/bitfield.h>
> > > > +#include <linux/bpf.h>
> > > > +#include <linux/bpf_trace.h>
> > > > #include <linux/iopoll.h>
> > > > #include <linux/pci.h>
> > > > #include <net/netdev_queues.h>
> > > > #include <net/page_pool/helpers.h>
> > > > #include <net/tcp.h>
> > > > +#include <net/xdp.h>
> > > >
> > > > #include "fbnic.h"
> > > > #include "fbnic_csr.h"
> > > > #include "fbnic_netdev.h"
> > > > #include "fbnic_txrx.h"
> > > >
> > > > +enum {
> > > > + FBNIC_XDP_PASS = 0,
> > > > + FBNIC_XDP_CONSUME,
> > > > + FBNIC_XDP_LEN_ERR,
> > > > +};
> > > > +
> > > > enum {
> > > > FBNIC_XMIT_CB_TS = 0x01,
> > > > };
> > > > @@ -877,7 +886,7 @@ static void fbnic_pkt_prepare(struct fbnic_napi_vector *nv, u64 rcd,
> > > >
> > > > headroom = hdr_pg_off - hdr_pg_start + FBNIC_RX_PAD;
> > > > frame_sz = hdr_pg_end - hdr_pg_start;
> > > > - xdp_init_buff(&pkt->buff, frame_sz, NULL);
> > > > + xdp_init_buff(&pkt->buff, frame_sz, &qt->xdp_rxq);
> > > > hdr_pg_start += (FBNIC_RCD_AL_BUFF_FRAG_MASK & rcd) *
> > > > FBNIC_BD_FRAG_SIZE;
> > > >
> > > > @@ -966,6 +975,38 @@ static struct sk_buff *fbnic_build_skb(struct fbnic_napi_vector *nv,
> > > > return skb;
> > > > }
> > > >
> > > > +static struct sk_buff *fbnic_run_xdp(struct fbnic_napi_vector *nv,
> > > > + struct fbnic_pkt_buff *pkt)
> > > > +{
> > > > + struct fbnic_net *fbn = netdev_priv(nv->napi.dev);
> > > > + struct bpf_prog *xdp_prog;
> > > > + int act;
> > > > +
> > > > + xdp_prog = READ_ONCE(fbn->xdp_prog);
> > > > + if (!xdp_prog)
> > > > + goto xdp_pass;
> > >
> > > Hi Mohsin,
> > >
> > > I thought we were past the times when we read the prog pointer for each
> > > processed packet, and agreed on reading the pointer once per NAPI loop?
> >
> > This is reading the cached pointer from the netdev. Are you saying you
> > would rather have this as a stack pointer instead? I don't really see
> > the advantage to making this a once per napi poll session versus just
> > reading it once per packet.
>
> Hi Alex,
>
> This is the only reason (at least currently in this patch) to load the
> cacheline from the netdev struct, whereas I was just suggesting to
> piggyback on the fact that the BPF prog pointer will not change within a
> single NAPI loop.
>
> It's up to you of course, and it should be considered a micro-optimization.
The cost of the "extra cacheline" should be nil since, from what I can
tell, xdp_prog shares the cacheline with gro_max_size and _rx, so that
cacheline is eventually going to be pulled in anyway regardless of which
path we go with.
> >
> > >
> > > > +
> > > > + if (xdp_buff_has_frags(&pkt->buff) && !xdp_prog->aux->xdp_has_frags)
> > > > + return ERR_PTR(-FBNIC_XDP_LEN_ERR);
> > >
> > > when can this happen and couldn't you catch this within ndo_bpf? i suppose
> > > it's related to hds setup.
> >
> > I was looking over the code and really the MTU is just a suggestion for
> > what size packets we can expect to receive. The MTU doesn't guarantee
> > > the receive size; it is just the maximum transmission unit and
> > represents the minimum frame size we should support.
> >
> > Much like what I did on the Intel NICs back in the day we can receive
> > up to the maximum frame size in almost all cases regardless of MTU
>
> The MTU is usually an indicator of the actual max frame size you are
> configuring on the Rx side, AFAICT. I asked about xdp_has_frags being looked
> up in the hot path as it's not what I usually see.
The problem is that if you are going to actually modify the Rx side,
there are side effects. Usually there are settings in the hardware that
have to be changed, and as a result the link has to bounce: the data path
has to be cleaned up and reset after reallocating buffers or changing
receive modes.
I had designed the fm10k driver to avoid exactly that. As such, it does
not even have a change_mtu function anymore; that got removed in 2016
when the min/max MTU checking was added. That is also one reason why the
fbnic driver doesn't need that checking. However, I suppose it does
introduce an interesting problem: we will have to go back and add
change_mtu functions for the drivers that support BPF and scatter-gather
receive for large frames, since we should probably be checking whether
the MTU is large enough to cause us to use frags. Either that, or we just
make them mandatory for devices that support it.
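Since the discussion above centers on when the split-frames check should
fire, its arithmetic is easy to pin down in isolation. Below is a
user-space replica of the quoted fbnic_check_split_frames() logic; the
`fake_prog` struct is a hypothetical stand-in for the one field of
`struct bpf_prog` the check consults:

```c
#include <assert.h>
#include <stdbool.h>

#define ETH_HLEN 14	/* Ethernet header length */

/* Minimal stand-in for prog->aux->xdp_has_frags. */
struct fake_prog {
	bool xdp_has_frags;
};

/* Mirrors the driver logic: splitting is only a problem when a
 * single-buffer XDP program is attached and a max-size frame would
 * not fit under the HDS threshold.
 */
static bool check_split_frames(const struct fake_prog *prog,
			       unsigned int mtu, unsigned int hds_thresh)
{
	if (!prog)
		return false;	/* no XDP program attached */

	if (prog->xdp_has_frags)
		return false;	/* multi-buf capable, frags are fine */

	return mtu + ETH_HLEN > hds_thresh;
}
```

This matches the testing section in the patch: with a 1500-byte MTU,
hds-thresh 512 rejects a single-buffer program (1514 > 512) while
hds-thresh 1536 allows it (1514 <= 1536), and a frags-capable program is
accepted regardless of the threshold.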
* Re: [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support
2025-07-25 15:10 ` Alexander Duyck
@ 2025-08-07 21:24 ` Mohsin Bashir
0 siblings, 0 replies; 20+ messages in thread
From: Mohsin Bashir @ 2025-08-07 21:24 UTC (permalink / raw)
To: Alexander Duyck, Maciej Fijalkowski
Cc: netdev, kuba, alexanderduyck, andrew+netdev, davem, edumazet,
pabeni, horms, vadim.fedorenko, jdamato, sdf, aleksander.lobakin,
ast, daniel, hawk, john.fastabend
>>>> Hi Mohsin,
>>>>
>>>> I thought we were past the times when we read the prog pointer for each
>>>> processed packet, and agreed on reading the pointer once per NAPI loop?
>>>
>>> This is reading the cached pointer from the netdev. Are you saying you
>>> would rather have this as a stack pointer instead? I don't really see
>>> the advantage to making this a once per napi poll session versus just
>>> reading it once per packet.
>>
>> Hi Alex,
>>
>> this is your only reason (at least currently in this patch) to load the
>> cacheline from netdev struct whereas i was just suggesting to piggyback on
>> the fact that bpf prog pointer will not change within single napi loop.
>>
>> it's up to you of course and should be considered as micro-optimization.
>
> The cost of the "extra cacheline" should be nil since, from what I can
> tell, xdp_prog shares the cacheline with gro_max_size and _rx, so that
> cacheline is eventually going to be pulled in anyway regardless of which
> path we go with.
>
Hi Maciej,
I appreciate your suggestion regarding the micro-optimization. However,
we are not planning to adopt this change at this time. I am open to any
further thoughts or concerns you may have about it.
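As a side note on the install path quoted earlier in the thread:
fbnic_bpf() swaps the program with xchg() and then drops the single
reference the device held on the previous program. That lifetime rule can
be sketched in user space with C11 atomics. Names are illustrative and
refcounting is reduced to a plain counter; this is an analog of the
pattern, not the kernel implementation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct prog {
	atomic_int refcnt;
};

/* Stand-in for fbn->xdp_prog. */
static _Atomic(struct prog *) installed;

/* Analog of bpf_prog_put(): drop one reference. */
static void prog_put(struct prog *p)
{
	atomic_fetch_sub(&p->refcnt, 1);
}

/* Swap in the new program and release the reference held on the old
 * one. Because the exchange is atomic, exactly one caller observes
 * each previous pointer, so each old program is put exactly once no
 * matter how installs race.
 */
static void install_prog(struct prog *next)
{
	struct prog *prev = atomic_exchange(&installed, next);

	if (prev)
		prog_put(prev);
}
```

Passing NULL uninstalls the program, releasing the last installed one,
which mirrors detaching an XDP program via XDP_SETUP_PROG with a NULL
prog.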
End of thread (newest: 2025-08-07 21:24 UTC)
Thread overview: 20+ messages
2025-07-23 14:59 [PATCH net-next 0/9] eth: fbnic Add XDP support for fbnic Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 1/9] eth: fbnic: Add support for HDS configuration Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 2/9] eth: fbnic: Update Headroom Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 3/9] eth: fbnic: Use shinfo to track frags state on Rx Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 4/9] eth: fbnic: Prefetch packet headers " Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 5/9] eth: fbnic: Add XDP pass, drop, abort support Mohsin Bashir
2025-07-23 17:35 ` Maciej Fijalkowski
2025-07-24 15:47 ` Mohsin Bashir
2025-07-24 16:51 ` Jakub Kicinski
2025-07-24 21:14 ` Alexander H Duyck
2025-07-25 9:56 ` Maciej Fijalkowski
2025-07-25 15:10 ` Alexander Duyck
2025-08-07 21:24 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 6/9] eth: fbnic: Add support for XDP queues Mohsin Bashir
2025-07-23 23:54 ` Jakub Kicinski
2025-07-23 14:59 ` [PATCH net-next 7/9] eth: fbnic: Add support for XDP_TX action Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 8/9] eth: fbnic: Collect packet statistics for XDP Mohsin Bashir
2025-07-24 10:18 ` Simon Horman
2025-07-24 15:48 ` Mohsin Bashir
2025-07-23 14:59 ` [PATCH net-next 9/9] eth: fbnic: Report XDP stats via ethtool Mohsin Bashir