* [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-14 22:54 ` Andy Gospodarek
2024-11-15 0:22 ` Michael Chan
2024-11-13 17:32 ` [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam Taehee Yoo
` (7 subsequent siblings)
8 siblings, 2 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
The bnxt_en driver supports rx-copybreak, but it couldn't be set by
userspace. Only the default value(256) has worked.
This patch makes the bnxt_en driver support following command.
`ethtool --set-tunable <devname> rx-copybreak <value> ` and
`ethtool --get-tunable <devname> rx-copybreak`.
By this patch, hds_threshol is set to the rx-copybreak value.
But it will be set by `ethtool -G eth0 header-data-split-thresh N`
in the next patch.
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- Do not set HDS if XDP is attached.
- rx_size and pkt_size are always bigger than 256.
v4:
- Remove min rx-copybreak value.
- Add Review tag from Brett.
- Add Test tag from Stanislav.
v3:
- Update copybreak value after closing nic and before opening nic when
the device is running.
v2:
- Define max/vim rx_copybreak value.
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 28 ++++++-----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 5 +-
.../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 49 ++++++++++++++++++-
3 files changed, 68 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4c1302a8f72d..d521b8918c02 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -81,7 +81,6 @@ MODULE_DESCRIPTION("Broadcom NetXtreme network driver");
#define BNXT_RX_OFFSET (NET_SKB_PAD + NET_IP_ALIGN)
#define BNXT_RX_DMA_OFFSET NET_SKB_PAD
-#define BNXT_RX_COPY_THRESH 256
#define BNXT_TX_PUSH_THRESH 164
@@ -1328,13 +1327,13 @@ static struct sk_buff *bnxt_copy_data(struct bnxt_napi *bnapi, u8 *data,
if (!skb)
return NULL;
- dma_sync_single_for_cpu(&pdev->dev, mapping, bp->rx_copy_thresh,
+ dma_sync_single_for_cpu(&pdev->dev, mapping, bp->rx_copybreak,
bp->rx_dir);
memcpy(skb->data - NET_IP_ALIGN, data - NET_IP_ALIGN,
len + NET_IP_ALIGN);
- dma_sync_single_for_device(&pdev->dev, mapping, bp->rx_copy_thresh,
+ dma_sync_single_for_device(&pdev->dev, mapping, bp->rx_copybreak,
bp->rx_dir);
skb_put(skb, len);
@@ -1827,7 +1826,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
return NULL;
}
- if (len <= bp->rx_copy_thresh) {
+ if (len <= bp->rx_copybreak) {
skb = bnxt_copy_skb(bnapi, data_ptr, len, mapping);
if (!skb) {
bnxt_abort_tpa(cpr, idx, agg_bufs);
@@ -2161,7 +2160,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
}
}
- if (len <= bp->rx_copy_thresh) {
+ if (len <= bp->rx_copybreak) {
if (!xdp_active)
skb = bnxt_copy_skb(bnapi, data_ptr, len, dma_addr);
else
@@ -4452,6 +4451,11 @@ void bnxt_set_tpa_flags(struct bnxt *bp)
bp->flags |= BNXT_FLAG_GRO;
}
+static void bnxt_init_ring_params(struct bnxt *bp)
+{
+ bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK;
+}
+
/* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must
* be set on entry.
*/
@@ -4466,7 +4470,6 @@ void bnxt_set_ring_params(struct bnxt *bp)
rx_space = rx_size + ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) +
SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
- bp->rx_copy_thresh = BNXT_RX_COPY_THRESH;
ring_size = bp->rx_ring_size;
bp->rx_agg_ring_size = 0;
bp->rx_agg_nr_pages = 0;
@@ -4511,7 +4514,9 @@ void bnxt_set_ring_params(struct bnxt *bp)
ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) -
SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
} else {
- rx_size = SKB_DATA_ALIGN(BNXT_RX_COPY_THRESH + NET_IP_ALIGN);
+ rx_size = SKB_DATA_ALIGN(max(BNXT_DEFAULT_RX_COPYBREAK,
+ bp->rx_copybreak) +
+ NET_IP_ALIGN);
rx_space = rx_size + NET_SKB_PAD +
SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
}
@@ -6417,16 +6422,14 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic)
req->flags = cpu_to_le32(VNIC_PLCMODES_CFG_REQ_FLAGS_JUMBO_PLACEMENT);
req->enables = cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_JUMBO_THRESH_VALID);
+ req->jumbo_thresh = cpu_to_le16(bp->rx_buf_use_size);
- if (BNXT_RX_PAGE_MODE(bp)) {
- req->jumbo_thresh = cpu_to_le16(bp->rx_buf_use_size);
- } else {
+ if (!BNXT_RX_PAGE_MODE(bp) && (bp->flags & BNXT_FLAG_AGG_RINGS)) {
req->flags |= cpu_to_le32(VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV4 |
VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6);
req->enables |=
cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID);
- req->jumbo_thresh = cpu_to_le16(bp->rx_copy_thresh);
- req->hds_threshold = cpu_to_le16(bp->rx_copy_thresh);
+ req->hds_threshold = cpu_to_le16(bp->rx_copybreak);
}
req->vnic_id = cpu_to_le32(vnic->fw_vnic_id);
return hwrm_req_send(bp, req);
@@ -15872,6 +15875,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
bnxt_init_l2_fltr_tbl(bp);
bnxt_set_rx_skb_mode(bp, false);
bnxt_set_tpa_flags(bp);
+ bnxt_init_ring_params(bp);
bnxt_set_ring_params(bp);
bnxt_rdma_aux_device_init(bp);
rc = bnxt_set_dflt_rings(bp, true);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 649955fa3e37..d1eef880eec5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -34,6 +34,9 @@
#include <linux/firmware/broadcom/tee_bnxt_fw.h>
#endif
+#define BNXT_DEFAULT_RX_COPYBREAK 256
+#define BNXT_MAX_RX_COPYBREAK 1024
+
extern struct list_head bnxt_block_cb_list;
struct page_pool;
@@ -2300,7 +2303,7 @@ struct bnxt {
enum dma_data_direction rx_dir;
u32 rx_ring_size;
u32 rx_agg_ring_size;
- u32 rx_copy_thresh;
+ u32 rx_copybreak;
u32 rx_ring_mask;
u32 rx_agg_ring_mask;
int rx_nr_pages;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index cfd2c65b1c90..adf30d1f738f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -4318,6 +4318,50 @@ static int bnxt_get_eee(struct net_device *dev, struct ethtool_keee *edata)
return 0;
}
+static int bnxt_set_tunable(struct net_device *dev,
+ const struct ethtool_tunable *tuna,
+ const void *data)
+{
+ struct bnxt *bp = netdev_priv(dev);
+ u32 rx_copybreak;
+
+ switch (tuna->id) {
+ case ETHTOOL_RX_COPYBREAK:
+ rx_copybreak = *(u32 *)data;
+ if (rx_copybreak > BNXT_MAX_RX_COPYBREAK)
+ return -ERANGE;
+ if (rx_copybreak != bp->rx_copybreak) {
+ if (netif_running(dev)) {
+ bnxt_close_nic(bp, false, false);
+ bp->rx_copybreak = rx_copybreak;
+ bnxt_set_ring_params(bp);
+ bnxt_open_nic(bp, false, false);
+ } else {
+ bp->rx_copybreak = rx_copybreak;
+ }
+ }
+ return 0;
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int bnxt_get_tunable(struct net_device *dev,
+ const struct ethtool_tunable *tuna, void *data)
+{
+ struct bnxt *bp = netdev_priv(dev);
+
+ switch (tuna->id) {
+ case ETHTOOL_RX_COPYBREAK:
+ *(u32 *)data = bp->rx_copybreak;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
static int bnxt_read_sfp_module_eeprom_info(struct bnxt *bp, u16 i2c_addr,
u16 page_number, u8 bank,
u16 start_addr, u16 data_length,
@@ -4768,7 +4812,8 @@ static int bnxt_run_loopback(struct bnxt *bp)
cpr = &rxr->bnapi->cp_ring;
if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS)
cpr = rxr->rx_cpr;
- pkt_size = min(bp->dev->mtu + ETH_HLEN, bp->rx_copy_thresh);
+ pkt_size = min(bp->dev->mtu + ETH_HLEN, max(BNXT_DEFAULT_RX_COPYBREAK,
+ bp->rx_copybreak));
skb = netdev_alloc_skb(bp->dev, pkt_size);
if (!skb)
return -ENOMEM;
@@ -5341,6 +5386,8 @@ const struct ethtool_ops bnxt_ethtool_ops = {
.get_link_ext_stats = bnxt_get_link_ext_stats,
.get_eee = bnxt_get_eee,
.set_eee = bnxt_set_eee,
+ .get_tunable = bnxt_get_tunable,
+ .set_tunable = bnxt_set_tunable,
.get_module_info = bnxt_get_module_info,
.get_module_eeprom = bnxt_get_module_eeprom,
.get_module_eeprom_by_page = bnxt_get_module_eeprom_by_page,
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command
2024-11-13 17:32 ` [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command Taehee Yoo
@ 2024-11-14 22:54 ` Andy Gospodarek
2024-11-15 0:22 ` Michael Chan
1 sibling, 0 replies; 28+ messages in thread
From: Andy Gospodarek @ 2024-11-14 22:54 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, Nov 13, 2024 at 05:32:15PM +0000, Taehee Yoo wrote:
> The bnxt_en driver supports rx-copybreak, but it couldn't be set by
> userspace. Only the default value(256) has worked.
> This patch makes the bnxt_en driver support following command.
> `ethtool --set-tunable <devname> rx-copybreak <value> ` and
> `ethtool --get-tunable <devname> rx-copybreak`.
>
> By this patch, hds_threshol is set to the rx-copybreak value.
> But it will be set by `ethtool -G eth0 header-data-split-thresh N`
> in the next patch.
>
> Reviewed-by: Brett Creeley <brett.creeley@amd.com>
> Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Tested-by: Andy Gospodarek <gospo@broadcom.com>
> ---
>
> v5:
> - Do not set HDS if XDP is attached.
> - rx_size and pkt_size are always bigger than 256.
>
> v4:
> - Remove min rx-copybreak value.
> - Add Review tag from Brett.
> - Add Test tag from Stanislav.
>
> v3:
> - Update copybreak value after closing nic and before opening nic when
> the device is running.
>
> v2:
> - Define max/vim rx_copybreak value.
>
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 28 ++++++-----
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 5 +-
> .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 49 ++++++++++++++++++-
> 3 files changed, 68 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 4c1302a8f72d..d521b8918c02 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -81,7 +81,6 @@ MODULE_DESCRIPTION("Broadcom NetXtreme network driver");
>
> #define BNXT_RX_OFFSET (NET_SKB_PAD + NET_IP_ALIGN)
> #define BNXT_RX_DMA_OFFSET NET_SKB_PAD
> -#define BNXT_RX_COPY_THRESH 256
>
> #define BNXT_TX_PUSH_THRESH 164
>
> @@ -1328,13 +1327,13 @@ static struct sk_buff *bnxt_copy_data(struct bnxt_napi *bnapi, u8 *data,
> if (!skb)
> return NULL;
>
> - dma_sync_single_for_cpu(&pdev->dev, mapping, bp->rx_copy_thresh,
> + dma_sync_single_for_cpu(&pdev->dev, mapping, bp->rx_copybreak,
> bp->rx_dir);
>
> memcpy(skb->data - NET_IP_ALIGN, data - NET_IP_ALIGN,
> len + NET_IP_ALIGN);
>
> - dma_sync_single_for_device(&pdev->dev, mapping, bp->rx_copy_thresh,
> + dma_sync_single_for_device(&pdev->dev, mapping, bp->rx_copybreak,
> bp->rx_dir);
>
> skb_put(skb, len);
> @@ -1827,7 +1826,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
> return NULL;
> }
>
> - if (len <= bp->rx_copy_thresh) {
> + if (len <= bp->rx_copybreak) {
> skb = bnxt_copy_skb(bnapi, data_ptr, len, mapping);
> if (!skb) {
> bnxt_abort_tpa(cpr, idx, agg_bufs);
> @@ -2161,7 +2160,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
> }
> }
>
> - if (len <= bp->rx_copy_thresh) {
> + if (len <= bp->rx_copybreak) {
> if (!xdp_active)
> skb = bnxt_copy_skb(bnapi, data_ptr, len, dma_addr);
> else
> @@ -4452,6 +4451,11 @@ void bnxt_set_tpa_flags(struct bnxt *bp)
> bp->flags |= BNXT_FLAG_GRO;
> }
>
> +static void bnxt_init_ring_params(struct bnxt *bp)
> +{
> + bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK;
> +}
> +
> /* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must
> * be set on entry.
> */
> @@ -4466,7 +4470,6 @@ void bnxt_set_ring_params(struct bnxt *bp)
> rx_space = rx_size + ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) +
> SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>
> - bp->rx_copy_thresh = BNXT_RX_COPY_THRESH;
> ring_size = bp->rx_ring_size;
> bp->rx_agg_ring_size = 0;
> bp->rx_agg_nr_pages = 0;
> @@ -4511,7 +4514,9 @@ void bnxt_set_ring_params(struct bnxt *bp)
> ALIGN(max(NET_SKB_PAD, XDP_PACKET_HEADROOM), 8) -
> SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> } else {
> - rx_size = SKB_DATA_ALIGN(BNXT_RX_COPY_THRESH + NET_IP_ALIGN);
> + rx_size = SKB_DATA_ALIGN(max(BNXT_DEFAULT_RX_COPYBREAK,
> + bp->rx_copybreak) +
> + NET_IP_ALIGN);
> rx_space = rx_size + NET_SKB_PAD +
> SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> }
> @@ -6417,16 +6422,14 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic)
>
> req->flags = cpu_to_le32(VNIC_PLCMODES_CFG_REQ_FLAGS_JUMBO_PLACEMENT);
> req->enables = cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_JUMBO_THRESH_VALID);
> + req->jumbo_thresh = cpu_to_le16(bp->rx_buf_use_size);
>
> - if (BNXT_RX_PAGE_MODE(bp)) {
> - req->jumbo_thresh = cpu_to_le16(bp->rx_buf_use_size);
> - } else {
> + if (!BNXT_RX_PAGE_MODE(bp) && (bp->flags & BNXT_FLAG_AGG_RINGS)) {
> req->flags |= cpu_to_le32(VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV4 |
> VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6);
> req->enables |=
> cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID);
> - req->jumbo_thresh = cpu_to_le16(bp->rx_copy_thresh);
> - req->hds_threshold = cpu_to_le16(bp->rx_copy_thresh);
> + req->hds_threshold = cpu_to_le16(bp->rx_copybreak);
> }
> req->vnic_id = cpu_to_le32(vnic->fw_vnic_id);
> return hwrm_req_send(bp, req);
> @@ -15872,6 +15875,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> bnxt_init_l2_fltr_tbl(bp);
> bnxt_set_rx_skb_mode(bp, false);
> bnxt_set_tpa_flags(bp);
> + bnxt_init_ring_params(bp);
> bnxt_set_ring_params(bp);
> bnxt_rdma_aux_device_init(bp);
> rc = bnxt_set_dflt_rings(bp, true);
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index 649955fa3e37..d1eef880eec5 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -34,6 +34,9 @@
> #include <linux/firmware/broadcom/tee_bnxt_fw.h>
> #endif
>
> +#define BNXT_DEFAULT_RX_COPYBREAK 256
> +#define BNXT_MAX_RX_COPYBREAK 1024
> +
> extern struct list_head bnxt_block_cb_list;
>
> struct page_pool;
> @@ -2300,7 +2303,7 @@ struct bnxt {
> enum dma_data_direction rx_dir;
> u32 rx_ring_size;
> u32 rx_agg_ring_size;
> - u32 rx_copy_thresh;
> + u32 rx_copybreak;
> u32 rx_ring_mask;
> u32 rx_agg_ring_mask;
> int rx_nr_pages;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> index cfd2c65b1c90..adf30d1f738f 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> @@ -4318,6 +4318,50 @@ static int bnxt_get_eee(struct net_device *dev, struct ethtool_keee *edata)
> return 0;
> }
>
> +static int bnxt_set_tunable(struct net_device *dev,
> + const struct ethtool_tunable *tuna,
> + const void *data)
> +{
> + struct bnxt *bp = netdev_priv(dev);
> + u32 rx_copybreak;
> +
> + switch (tuna->id) {
> + case ETHTOOL_RX_COPYBREAK:
> + rx_copybreak = *(u32 *)data;
> + if (rx_copybreak > BNXT_MAX_RX_COPYBREAK)
> + return -ERANGE;
> + if (rx_copybreak != bp->rx_copybreak) {
> + if (netif_running(dev)) {
> + bnxt_close_nic(bp, false, false);
> + bp->rx_copybreak = rx_copybreak;
> + bnxt_set_ring_params(bp);
> + bnxt_open_nic(bp, false, false);
> + } else {
> + bp->rx_copybreak = rx_copybreak;
> + }
> + }
> + return 0;
> + default:
> + return -EOPNOTSUPP;
> + }
> +}
> +
> +static int bnxt_get_tunable(struct net_device *dev,
> + const struct ethtool_tunable *tuna, void *data)
> +{
> + struct bnxt *bp = netdev_priv(dev);
> +
> + switch (tuna->id) {
> + case ETHTOOL_RX_COPYBREAK:
> + *(u32 *)data = bp->rx_copybreak;
> + break;
> + default:
> + return -EOPNOTSUPP;
> + }
> +
> + return 0;
> +}
> +
> static int bnxt_read_sfp_module_eeprom_info(struct bnxt *bp, u16 i2c_addr,
> u16 page_number, u8 bank,
> u16 start_addr, u16 data_length,
> @@ -4768,7 +4812,8 @@ static int bnxt_run_loopback(struct bnxt *bp)
> cpr = &rxr->bnapi->cp_ring;
> if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS)
> cpr = rxr->rx_cpr;
> - pkt_size = min(bp->dev->mtu + ETH_HLEN, bp->rx_copy_thresh);
> + pkt_size = min(bp->dev->mtu + ETH_HLEN, max(BNXT_DEFAULT_RX_COPYBREAK,
> + bp->rx_copybreak));
> skb = netdev_alloc_skb(bp->dev, pkt_size);
> if (!skb)
> return -ENOMEM;
> @@ -5341,6 +5386,8 @@ const struct ethtool_ops bnxt_ethtool_ops = {
> .get_link_ext_stats = bnxt_get_link_ext_stats,
> .get_eee = bnxt_get_eee,
> .set_eee = bnxt_set_eee,
> + .get_tunable = bnxt_get_tunable,
> + .set_tunable = bnxt_set_tunable,
> .get_module_info = bnxt_get_module_info,
> .get_module_eeprom = bnxt_get_module_eeprom,
> .get_module_eeprom_by_page = bnxt_get_module_eeprom_by_page,
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command
2024-11-13 17:32 ` [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command Taehee Yoo
2024-11-14 22:54 ` Andy Gospodarek
@ 2024-11-15 0:22 ` Michael Chan
1 sibling, 0 replies; 28+ messages in thread
From: Michael Chan @ 2024-11-15 0:22 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
[-- Attachment #1: Type: text/plain, Size: 2022 bytes --]
On Wed, Nov 13, 2024 at 9:32 AM Taehee Yoo <ap420073@gmail.com> wrote:
>
> The bnxt_en driver supports rx-copybreak, but it couldn't be set by
> userspace. Only the default value(256) has worked.
> This patch makes the bnxt_en driver support following command.
> `ethtool --set-tunable <devname> rx-copybreak <value> ` and
> `ethtool --get-tunable <devname> rx-copybreak`.
>
> By this patch, hds_threshol is set to the rx-copybreak value.
> But it will be set by `ethtool -G eth0 header-data-split-thresh N`
> in the next patch.
>
> Reviewed-by: Brett Creeley <brett.creeley@amd.com>
> Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> @@ -6417,16 +6422,14 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic)
>
> req->flags = cpu_to_le32(VNIC_PLCMODES_CFG_REQ_FLAGS_JUMBO_PLACEMENT);
> req->enables = cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_JUMBO_THRESH_VALID);
> + req->jumbo_thresh = cpu_to_le16(bp->rx_buf_use_size);
>
> - if (BNXT_RX_PAGE_MODE(bp)) {
> - req->jumbo_thresh = cpu_to_le16(bp->rx_buf_use_size);
> - } else {
> + if (!BNXT_RX_PAGE_MODE(bp) && (bp->flags & BNXT_FLAG_AGG_RINGS)) {
> req->flags |= cpu_to_le32(VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV4 |
> VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6);
> req->enables |=
> cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID);
> - req->jumbo_thresh = cpu_to_le16(bp->rx_copy_thresh);
> - req->hds_threshold = cpu_to_le16(bp->rx_copy_thresh);
> + req->hds_threshold = cpu_to_le16(bp->rx_copybreak);
I double checked our hardware spec and the HDS threshold is 10 bits,
so the maximum value is 1023. When we get to patch #5, the HDS
threshold is separated from RX copybreak and the HDS maximum becomes
256. So it is within the hardware limit.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
2024-11-13 17:32 ` [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-15 4:22 ` Jakub Kicinski
2024-11-13 17:32 ` [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command Taehee Yoo
` (6 subsequent siblings)
8 siblings, 1 reply; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
When tcp-data-split is UNKNOWN mode, drivers arbitrarily handle it.
For example, bnxt_en driver automatically enables if at least one of
LRO/GRO/JUMBO is enabled.
If tcp-data-split is UNKNOWN and LRO is enabled, a driver returns
ENABLES of tcp-data-split, not UNKNOWN.
So, `ethtool -g eth0` shows tcp-data-split is enabled.
The problem is in the setting situation.
In the ethnl_set_rings(), it first calls get_ringparam() to get the
current driver's config.
At that moment, if driver's tcp-data-split config is UNKNOWN, it returns
ENABLE if LRO/GRO/JUMBO is enabled.
Then, it sets values from the user and driver's current config to
kernel_ethtool_ringparam.
Last it calls .set_ringparam().
The driver, especially bnxt_en driver receives
ETHTOOL_TCP_DATA_SPLIT_ENABLED.
But it can't distinguish whether it is set by the user or just the
current config.
The new tcp_data_split_mod member indicates the tcp-data-split value is
explicitly set by the user.
So the driver can handle ETHTOOL_TCP_DATA_SPLIT_ENABLED properly.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- Patch added.
include/linux/ethtool.h | 2 ++
net/ethtool/rings.c | 3 +++
2 files changed, 5 insertions(+)
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 1199e308c8dd..ecd52b99a63a 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -73,6 +73,7 @@ enum {
* struct kernel_ethtool_ringparam - RX/TX ring configuration
* @rx_buf_len: Current length of buffers on the rx ring.
* @tcp_data_split: Scatter packet headers and data to separate buffers
+ * @tcp_data_split_mod: Updated tcp-data-split from user
* @tx_push: The flag of tx push mode
* @rx_push: The flag of rx push mode
* @cqe_size: Size of TX/RX completion queue event
@@ -82,6 +83,7 @@ enum {
struct kernel_ethtool_ringparam {
u32 rx_buf_len;
u8 tcp_data_split;
+ bool tcp_data_split_mod;
u8 tx_push;
u8 rx_push;
u32 cqe_size;
diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
index b7865a14fdf8..c12ebb61394d 100644
--- a/net/ethtool/rings.c
+++ b/net/ethtool/rings.c
@@ -250,6 +250,9 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
return -EINVAL;
}
+ if (tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT])
+ kernel_ringparam.tcp_data_split_mod = true;
+
ret = dev->ethtool_ops->set_ringparam(dev, &ringparam,
&kernel_ringparam, info->extack);
return ret < 0 ? ret : 1;
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam
2024-11-13 17:32 ` [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam Taehee Yoo
@ 2024-11-15 4:22 ` Jakub Kicinski
2024-11-15 17:17 ` Taehee Yoo
0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2024-11-15 4:22 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, 13 Nov 2024 17:32:16 +0000 Taehee Yoo wrote:
> When tcp-data-split is UNKNOWN mode, drivers arbitrarily handle it.
> For example, bnxt_en driver automatically enables if at least one of
> LRO/GRO/JUMBO is enabled.
> If tcp-data-split is UNKNOWN and LRO is enabled, a driver returns
> ENABLES of tcp-data-split, not UNKNOWN.
> So, `ethtool -g eth0` shows tcp-data-split is enabled.
>
> The problem is in the setting situation.
> In the ethnl_set_rings(), it first calls get_ringparam() to get the
> current driver's config.
> At that moment, if driver's tcp-data-split config is UNKNOWN, it returns
> ENABLE if LRO/GRO/JUMBO is enabled.
> Then, it sets values from the user and driver's current config to
> kernel_ethtool_ringparam.
> Last it calls .set_ringparam().
> The driver, especially bnxt_en driver receives
> ETHTOOL_TCP_DATA_SPLIT_ENABLED.
> But it can't distinguish whether it is set by the user or just the
> current config.
>
> The new tcp_data_split_mod member indicates the tcp-data-split value is
> explicitly set by the user.
> So the driver can handle ETHTOOL_TCP_DATA_SPLIT_ENABLED properly.
I think this can work, but it isn't exactly what I had in mind.
I was thinking we'd simply add u8 hds_config to
struct ethtool_netdev_state (which is stored inside netdev).
And update it there if user request via ethnl_set_rings() succeeds.
That gives the driver and the core quick and easy access to checking if
the user forced the setting to ENABLED or DISABLED, or didn't (UNKNOWN).
As far as the parameter passed to ->set_ringparam() goes we could do
(assuming the new fields in ethtool_netdev state is called hds):
kernel_ringparam.tcp_data_split =
nla_get_u32_default(tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT],
dev->ethtool->hds);
If the driver see UNKNOWN it means user doesn't care.
If the driver sees ENABLED/DISABLE it must comply, doesn't matter if
the user requested it in current netlink call, or previous and hasn't
reset it, yet.
Hope this makes sense...
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam
2024-11-15 4:22 ` Jakub Kicinski
@ 2024-11-15 17:17 ` Taehee Yoo
2024-11-15 20:07 ` Saeed Mahameed
0 siblings, 1 reply; 28+ messages in thread
From: Taehee Yoo @ 2024-11-15 17:17 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Fri, Nov 15, 2024 at 1:22 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 13 Nov 2024 17:32:16 +0000 Taehee Yoo wrote:
> > When tcp-data-split is UNKNOWN mode, drivers arbitrarily handle it.
> > For example, bnxt_en driver automatically enables if at least one of
> > LRO/GRO/JUMBO is enabled.
> > If tcp-data-split is UNKNOWN and LRO is enabled, a driver returns
> > ENABLES of tcp-data-split, not UNKNOWN.
> > So, `ethtool -g eth0` shows tcp-data-split is enabled.
> >
> > The problem is in the setting situation.
> > In the ethnl_set_rings(), it first calls get_ringparam() to get the
> > current driver's config.
> > At that moment, if driver's tcp-data-split config is UNKNOWN, it returns
> > ENABLE if LRO/GRO/JUMBO is enabled.
> > Then, it sets values from the user and driver's current config to
> > kernel_ethtool_ringparam.
> > Last it calls .set_ringparam().
> > The driver, especially bnxt_en driver receives
> > ETHTOOL_TCP_DATA_SPLIT_ENABLED.
> > But it can't distinguish whether it is set by the user or just the
> > current config.
> >
> > The new tcp_data_split_mod member indicates the tcp-data-split value is
> > explicitly set by the user.
> > So the driver can handle ETHTOOL_TCP_DATA_SPLIT_ENABLED properly.
>
> I think this can work, but it isn't exactly what I had in mind.
>
> I was thinking we'd simply add u8 hds_config to
> struct ethtool_netdev_state (which is stored inside netdev).
> And update it there if user request via ethnl_set_rings() succeeds.
>
> That gives the driver and the core quick and easy access to checking if
> the user forced the setting to ENABLED or DISABLED, or didn't (UNKNOWN).
>
> As far as the parameter passed to ->set_ringparam() goes we could do
> (assuming the new fields in ethtool_netdev state is called hds):
>
> kernel_ringparam.tcp_data_split =
> nla_get_u32_default(tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT],
> dev->ethtool->hds);
>
> If the driver see UNKNOWN it means user doesn't care.
> If the driver sees ENABLED/DISABLE it must comply, doesn't matter if
> the user requested it in current netlink call, or previous and hasn't
> reset it, yet.
>
> Hope this makes sense...
Thank you so much for the details!
I will try to use ethtool_netdev_state instead of this approach.
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam
2024-11-15 17:17 ` Taehee Yoo
@ 2024-11-15 20:07 ` Saeed Mahameed
0 siblings, 0 replies; 28+ messages in thread
From: Saeed Mahameed @ 2024-11-15 20:07 UTC (permalink / raw)
To: Taehee Yoo
Cc: Jakub Kicinski, davem, pabeni, edumazet, almasrymina,
donald.hunter, corbet, michael.chan, andrew+netdev, hawk,
ilias.apalodimas, ast, daniel, john.fastabend, dw, sdf,
asml.silence, brett.creeley, linux-doc, netdev, kory.maincent,
maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka
On 16 Nov 02:17, Taehee Yoo wrote:
>On Fri, Nov 15, 2024 at 1:22 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Wed, 13 Nov 2024 17:32:16 +0000 Taehee Yoo wrote:
>> > When tcp-data-split is UNKNOWN mode, drivers arbitrarily handle it.
>> > For example, bnxt_en driver automatically enables if at least one of
>> > LRO/GRO/JUMBO is enabled.
>> > If tcp-data-split is UNKNOWN and LRO is enabled, a driver returns
>> > ENABLES of tcp-data-split, not UNKNOWN.
>> > So, `ethtool -g eth0` shows tcp-data-split is enabled.
>> >
>> > The problem is in the setting situation.
>> > In the ethnl_set_rings(), it first calls get_ringparam() to get the
>> > current driver's config.
>> > At that moment, if driver's tcp-data-split config is UNKNOWN, it returns
>> > ENABLE if LRO/GRO/JUMBO is enabled.
>> > Then, it sets values from the user and driver's current config to
>> > kernel_ethtool_ringparam.
>> > Last it calls .set_ringparam().
>> > The driver, especially bnxt_en driver receives
>> > ETHTOOL_TCP_DATA_SPLIT_ENABLED.
>> > But it can't distinguish whether it is set by the user or just the
>> > current config.
>> >
>> > The new tcp_data_split_mod member indicates the tcp-data-split value is
>> > explicitly set by the user.
>> > So the driver can handle ETHTOOL_TCP_DATA_SPLIT_ENABLED properly.
>>
>> I think this can work, but it isn't exactly what I had in mind.
>>
>> I was thinking we'd simply add u8 hds_config to
>> struct ethtool_netdev_state (which is stored inside netdev).
>> And update it there if user request via ethnl_set_rings() succeeds.
>>
>> That gives the driver and the core quick and easy access to checking if
>> the user forced the setting to ENABLED or DISABLED, or didn't (UNKNOWN).
>>
>> As far as the parameter passed to ->set_ringparam() goes we could do
>> (assuming the new fields in ethtool_netdev state is called hds):
>>
>> kernel_ringparam.tcp_data_split =
>> nla_get_u32_default(tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT],
>> dev->ethtool->hds);
>>
>> If the driver see UNKNOWN it means user doesn't care.
>> If the driver sees ENABLED/DISABLE it must comply, doesn't matter if
>> the user requested it in current netlink call, or previous and hasn't
>> reset it, yet.
>>
This complicates things, drivers shouldn't store previous uncommitted "wanted" values.
We have wanted_features for that, and I don't think it's smart to have yet another
wanted_features mechanism, let's keep it simple, any explicit config by
ethtool should either be immediately committed or returned as error to
user and driver will only reflect the old/current value in future get requests.
HDS can conflict with many other features e.g XDP/LRO/rx_copy_break/MTU
limitations etc ...
>> Hope this makes sense...
>
>Thank you so much for the details!
>I will try to use ethtool_netdev_state instead of this approach.
>
>Thanks a lot!
>Taehee Yoo
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
2024-11-13 17:32 ` [PATCH net-next v5 1/7] bnxt_en: add support for rx-copybreak ethtool command Taehee Yoo
2024-11-13 17:32 ` [PATCH net-next v5 2/7] net: ethtool: add tcp_data_split_mod member in kernel_ethtool_ringparam Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-14 22:54 ` Andy Gospodarek
2024-11-15 4:15 ` Jakub Kicinski
2024-11-13 17:32 ` [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh Taehee Yoo
` (5 subsequent siblings)
8 siblings, 2 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
NICs that uses bnxt_en driver supports tcp-data-split feature by the
name of HDS(header-data-split).
But there is no implementation for the HDS to enable by ethtool.
Only getting the current HDS status is implemented and The HDS is just
automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled.
The hds_threshold follows rx-copybreak value. and it was unchangeable.
This implements `ethtool -G <interface name> tcp-data-split <value>`
command option.
The value can be <on> and <auto>.
The value is <auto> and one of LRO/GRO/JUMBO is enabled, HDS is
automatically enabled and all LRO/GRO/JUMBO are disabled, HDS is
automatically disabled.
HDS feature relies on the aggregation ring.
So, if HDS is enabled, the bnxt_en driver initializes the aggregation ring.
This is the reason why BNXT_FLAG_AGG_RINGS contains HDS condition.
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- Do not set HDS if XDP is attached.
- Enable tcp-data-split only when tcp_data_split_mod is true.
v4:
- Do not support disable tcp-data-split.
- Add Test tag from Stanislav.
v3:
- No changes.
v2:
- Do not set hds_threshold to 0.
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 5 +++--
.../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 19 +++++++++++++++++++
3 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d521b8918c02..608bcca71676 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4474,7 +4474,7 @@ void bnxt_set_ring_params(struct bnxt *bp)
bp->rx_agg_ring_size = 0;
bp->rx_agg_nr_pages = 0;
- if (bp->flags & BNXT_FLAG_TPA)
+ if (bp->flags & BNXT_FLAG_TPA || bp->flags & BNXT_FLAG_HDS)
agg_factor = min_t(u32, 4, 65536 / BNXT_RX_PAGE_SIZE);
bp->flags &= ~BNXT_FLAG_JUMBO;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index d1eef880eec5..3a7d2f3ebb2a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2202,8 +2202,6 @@ struct bnxt {
#define BNXT_FLAG_TPA (BNXT_FLAG_LRO | BNXT_FLAG_GRO)
#define BNXT_FLAG_JUMBO 0x10
#define BNXT_FLAG_STRIP_VLAN 0x20
- #define BNXT_FLAG_AGG_RINGS (BNXT_FLAG_JUMBO | BNXT_FLAG_GRO | \
- BNXT_FLAG_LRO)
#define BNXT_FLAG_RFS 0x100
#define BNXT_FLAG_SHARED_RINGS 0x200
#define BNXT_FLAG_PORT_STATS 0x400
@@ -2224,6 +2222,9 @@ struct bnxt {
#define BNXT_FLAG_ROCE_MIRROR_CAP 0x4000000
#define BNXT_FLAG_TX_COAL_CMPL 0x8000000
#define BNXT_FLAG_PORT_STATS_EXT 0x10000000
+ #define BNXT_FLAG_HDS 0x20000000
+ #define BNXT_FLAG_AGG_RINGS (BNXT_FLAG_JUMBO | BNXT_FLAG_GRO | \
+ BNXT_FLAG_LRO | BNXT_FLAG_HDS)
#define BNXT_FLAG_ALL_CONFIG_FEATS (BNXT_FLAG_TPA | \
BNXT_FLAG_RFS | \
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index adf30d1f738f..b0054eef389b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -840,6 +840,8 @@ static int bnxt_set_ringparam(struct net_device *dev,
struct kernel_ethtool_ringparam *kernel_ering,
struct netlink_ext_ack *extack)
{
+ u8 tcp_data_split_mod = kernel_ering->tcp_data_split_mod;
+ u8 tcp_data_split = kernel_ering->tcp_data_split;
struct bnxt *bp = netdev_priv(dev);
if ((ering->rx_pending > BNXT_MAX_RX_DESC_CNT) ||
@@ -847,9 +849,25 @@ static int bnxt_set_ringparam(struct net_device *dev,
(ering->tx_pending < BNXT_MIN_TX_DESC_CNT))
return -EINVAL;
+ if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_DISABLED &&
+ tcp_data_split_mod)
+ return -EOPNOTSUPP;
+
+ if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
+ tcp_data_split_mod && BNXT_RX_PAGE_MODE(bp)) {
+ NL_SET_ERR_MSG_MOD(extack, "tcp-data-split is disallowed when XDP is attached");
+ return -EOPNOTSUPP;
+ }
+
if (netif_running(dev))
bnxt_close_nic(bp, false, false);
+ if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
+ tcp_data_split_mod)
+ bp->flags |= BNXT_FLAG_HDS;
+ else if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_UNKNOWN)
+ bp->flags &= ~BNXT_FLAG_HDS;
+
bp->rx_ring_size = ering->rx_pending;
bp->tx_ring_size = ering->tx_pending;
bnxt_set_ring_params(bp);
@@ -5345,6 +5363,7 @@ const struct ethtool_ops bnxt_ethtool_ops = {
ETHTOOL_COALESCE_STATS_BLOCK_USECS |
ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
ETHTOOL_COALESCE_USE_CQE,
+ .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT,
.get_link_ksettings = bnxt_get_link_ksettings,
.set_link_ksettings = bnxt_set_link_ksettings,
.get_fec_stats = bnxt_get_fec_stats,
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command
2024-11-13 17:32 ` [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command Taehee Yoo
@ 2024-11-14 22:54 ` Andy Gospodarek
2024-11-15 4:15 ` Jakub Kicinski
1 sibling, 0 replies; 28+ messages in thread
From: Andy Gospodarek @ 2024-11-14 22:54 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, Nov 13, 2024 at 05:32:17PM +0000, Taehee Yoo wrote:
> NICs that uses bnxt_en driver supports tcp-data-split feature by the
> name of HDS(header-data-split).
> But there is no implementation for the HDS to enable by ethtool.
> Only getting the current HDS status is implemented and The HDS is just
> automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled.
> The hds_threshold follows rx-copybreak value. and it was unchangeable.
>
> This implements `ethtool -G <interface name> tcp-data-split <value>`
> command option.
> The value can be <on> and <auto>.
> The value is <auto> and one of LRO/GRO/JUMBO is enabled, HDS is
> automatically enabled and all LRO/GRO/JUMBO are disabled, HDS is
> automatically disabled.
>
> HDS feature relies on the aggregation ring.
> So, if HDS is enabled, the bnxt_en driver initializes the aggregation ring.
> This is the reason why BNXT_FLAG_AGG_RINGS contains HDS condition.
>
> Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Tested-by: Andy Gospodarek <gospo@broadcom.com>
> ---
>
> v5:
> - Do not set HDS if XDP is attached.
> - Enable tcp-data-split only when tcp_data_split_mod is true.
>
> v4:
> - Do not support disable tcp-data-split.
> - Add Test tag from Stanislav.
>
> v3:
> - No changes.
>
> v2:
> - Do not set hds_threshold to 0.
>
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 5 +++--
> .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 19 +++++++++++++++++++
> 3 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index d521b8918c02..608bcca71676 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -4474,7 +4474,7 @@ void bnxt_set_ring_params(struct bnxt *bp)
> bp->rx_agg_ring_size = 0;
> bp->rx_agg_nr_pages = 0;
>
> - if (bp->flags & BNXT_FLAG_TPA)
> + if (bp->flags & BNXT_FLAG_TPA || bp->flags & BNXT_FLAG_HDS)
> agg_factor = min_t(u32, 4, 65536 / BNXT_RX_PAGE_SIZE);
>
> bp->flags &= ~BNXT_FLAG_JUMBO;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index d1eef880eec5..3a7d2f3ebb2a 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -2202,8 +2202,6 @@ struct bnxt {
> #define BNXT_FLAG_TPA (BNXT_FLAG_LRO | BNXT_FLAG_GRO)
> #define BNXT_FLAG_JUMBO 0x10
> #define BNXT_FLAG_STRIP_VLAN 0x20
> - #define BNXT_FLAG_AGG_RINGS (BNXT_FLAG_JUMBO | BNXT_FLAG_GRO | \
> - BNXT_FLAG_LRO)
> #define BNXT_FLAG_RFS 0x100
> #define BNXT_FLAG_SHARED_RINGS 0x200
> #define BNXT_FLAG_PORT_STATS 0x400
> @@ -2224,6 +2222,9 @@ struct bnxt {
> #define BNXT_FLAG_ROCE_MIRROR_CAP 0x4000000
> #define BNXT_FLAG_TX_COAL_CMPL 0x8000000
> #define BNXT_FLAG_PORT_STATS_EXT 0x10000000
> + #define BNXT_FLAG_HDS 0x20000000
> + #define BNXT_FLAG_AGG_RINGS (BNXT_FLAG_JUMBO | BNXT_FLAG_GRO | \
> + BNXT_FLAG_LRO | BNXT_FLAG_HDS)
>
> #define BNXT_FLAG_ALL_CONFIG_FEATS (BNXT_FLAG_TPA | \
> BNXT_FLAG_RFS | \
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> index adf30d1f738f..b0054eef389b 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> @@ -840,6 +840,8 @@ static int bnxt_set_ringparam(struct net_device *dev,
> struct kernel_ethtool_ringparam *kernel_ering,
> struct netlink_ext_ack *extack)
> {
> + u8 tcp_data_split_mod = kernel_ering->tcp_data_split_mod;
> + u8 tcp_data_split = kernel_ering->tcp_data_split;
> struct bnxt *bp = netdev_priv(dev);
>
> if ((ering->rx_pending > BNXT_MAX_RX_DESC_CNT) ||
> @@ -847,9 +849,25 @@ static int bnxt_set_ringparam(struct net_device *dev,
> (ering->tx_pending < BNXT_MIN_TX_DESC_CNT))
> return -EINVAL;
>
> + if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_DISABLED &&
> + tcp_data_split_mod)
> + return -EOPNOTSUPP;
> +
> + if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
> + tcp_data_split_mod && BNXT_RX_PAGE_MODE(bp)) {
> + NL_SET_ERR_MSG_MOD(extack, "tcp-data-split is disallowed when XDP is attached");
> + return -EOPNOTSUPP;
> + }
> +
> if (netif_running(dev))
> bnxt_close_nic(bp, false, false);
>
> + if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
> + tcp_data_split_mod)
> + bp->flags |= BNXT_FLAG_HDS;
> + else if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_UNKNOWN)
> + bp->flags &= ~BNXT_FLAG_HDS;
> +
> bp->rx_ring_size = ering->rx_pending;
> bp->tx_ring_size = ering->tx_pending;
> bnxt_set_ring_params(bp);
> @@ -5345,6 +5363,7 @@ const struct ethtool_ops bnxt_ethtool_ops = {
> ETHTOOL_COALESCE_STATS_BLOCK_USECS |
> ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
> ETHTOOL_COALESCE_USE_CQE,
> + .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT,
> .get_link_ksettings = bnxt_get_link_ksettings,
> .set_link_ksettings = bnxt_set_link_ksettings,
> .get_fec_stats = bnxt_get_fec_stats,
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command
2024-11-13 17:32 ` [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command Taehee Yoo
2024-11-14 22:54 ` Andy Gospodarek
@ 2024-11-15 4:15 ` Jakub Kicinski
2024-11-15 17:12 ` Taehee Yoo
1 sibling, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2024-11-15 4:15 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, 13 Nov 2024 17:32:17 +0000 Taehee Yoo wrote:
> NICs that uses bnxt_en driver supports tcp-data-split feature by the
> name of HDS(header-data-split).
> But there is no implementation for the HDS to enable by ethtool.
> Only getting the current HDS status is implemented and The HDS is just
> automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled.
> The hds_threshold follows rx-copybreak value. and it was unchangeable.
>
> This implements `ethtool -G <interface name> tcp-data-split <value>`
> command option.
> The value can be <on> and <auto>.
> The value is <auto> and one of LRO/GRO/JUMBO is enabled, HDS is
> automatically enabled and all LRO/GRO/JUMBO are disabled, HDS is
> automatically disabled.
>
> HDS feature relies on the aggregation ring.
> So, if HDS is enabled, the bnxt_en driver initializes the aggregation ring.
> This is the reason why BNXT_FLAG_AGG_RINGS contains HDS condition.
I may be missing some existing check but doesn't enabling XDP force
page_mode which in turn clears BNXT_FLAG_AGG_RINGS, including HDS ?
If user specifically requested HDS we should refuse to install XDP
in non-multibuf mode.
TBH a selftest under tools/testing/drivers/net would go a long way
to make it clear we caught all cases. You can add a dummy netdevsim
implementation for testing without bnxt present (some of the existing
python tests can work with real drivers and netdevsim):
https://github.com/linux-netdev/nipa/wiki/Running-driver-tests
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command
2024-11-15 4:15 ` Jakub Kicinski
@ 2024-11-15 17:12 ` Taehee Yoo
0 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-15 17:12 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Fri, Nov 15, 2024 at 1:15 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
Hi Jakub,
Thank you so much for the review!
> On Wed, 13 Nov 2024 17:32:17 +0000 Taehee Yoo wrote:
> > NICs that uses bnxt_en driver supports tcp-data-split feature by the
> > name of HDS(header-data-split).
> > But there is no implementation for the HDS to enable by ethtool.
> > Only getting the current HDS status is implemented and The HDS is just
> > automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled.
> > The hds_threshold follows rx-copybreak value. and it was unchangeable.
> >
> > This implements `ethtool -G <interface name> tcp-data-split <value>`
> > command option.
> > The value can be <on> and <auto>.
> > The value is <auto> and one of LRO/GRO/JUMBO is enabled, HDS is
> > automatically enabled and all LRO/GRO/JUMBO are disabled, HDS is
> > automatically disabled.
> >
> > HDS feature relies on the aggregation ring.
> > So, if HDS is enabled, the bnxt_en driver initializes the aggregation ring.
> > This is the reason why BNXT_FLAG_AGG_RINGS contains HDS condition.
>
> I may be missing some existing check but doesn't enabling XDP force
> page_mode which in turn clears BNXT_FLAG_AGG_RINGS, including HDS ?
> If user specifically requested HDS we should refuse to install XDP
> in non-multibuf mode.
Sorry, I missed adding this check.
I added a check to reject setting HDS when XDP is attached.
But, I didn't add a check to reject attaching XDP when HDS is enabled.
bnxt driver doesn't allow setting HDS if XDP is attached even if it's
multibuffer XDP. So, I will reject installing singlebuffer and
multibuffer XDP if HDS is enabled.
>
> TBH a selftest under tools/testing/drivers/net would go a long way
> to make it clear we caught all cases. You can add a dummy netdevsim
> implementation for testing without bnxt present (some of the existing
> python tests can work with real drivers and netdevsim):
> https://github.com/linux-netdev/nipa/wiki/Running-driver-tests
Thanks for that, I will try to use this selftest.
I will add a dummy HDS feature for testing on the netdevsim.
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
` (2 preceding siblings ...)
2024-11-13 17:32 ` [PATCH net-next v5 3/7] bnxt_en: add support for tcp-data-split ethtool command Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-15 4:24 ` Jakub Kicinski
2024-11-15 20:27 ` Saeed Mahameed
2024-11-13 17:32 ` [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command Taehee Yoo
` (4 subsequent siblings)
8 siblings, 2 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
The header-data-split-thresh option configures the threshold value of
the header-data-split.
If a received packet size is larger than this threshold value, a packet
will be split into header and payload.
The header indicates TCP and UDP header, but it depends on driver spec.
The bnxt_en driver supports HDS(Header-Data-Split) configuration at
FW level, affecting TCP and UDP too.
So, If header-data-split-thresh is set, it affects UDP and TCP packets.
Example:
# ethtool -G <interface name> header-data-split-thresh <value>
# ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
# ethtool -g enp14s0f0np0
Ring parameters for enp14s0f0np0:
Pre-set maximums:
...
Header data split thresh: 256
Current hardware settings:
...
TCP data split: on
Header data split thresh: 256
The default/min/max values are not defined in the ethtool so the drivers
should define themself.
The 0 value means that all TCP/UDP packets' header and payload
will be split.
In general cases, HDS can increase the overhead of host memory and PCIe
bus because it copies data twice.
So users should consider the overhead of HDS.
If the HDS threshold is 0 and then the copybreak is 256 and the packet's
payload is 8 bytes.
So, two pages are used, one for headers and one for payloads.
By the copybreak, only the headers page is copied and then it can be
reused immediately and then a payloads page is still used.
If the HDS threshold is larger than 8, both headers and payloads are
copied and then a page can be recycled immediately.
So, too low HDS threshold has larger disadvantages than advantages
aspect of performance in general cases.
Users should consider the overhead of this feature.
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- No changes.
v4:
- Fix 80 charactor wrap.
- Rename from tcp-data-split-thresh to header-data-split-thresh
- Add description about overhead of HDS.
- Add ETHTOOL_RING_USE_HDS_THRS flag.
- Add dev_xdp_sb_prog_count() helper.
- Add Test tag from Stanislav.
v3:
- Fix documentation and ynl
- Update error messages
- Validate configuration of tcp-data-split and tcp-data-split-thresh
v2:
- Patch added.
Documentation/netlink/specs/ethtool.yaml | 8 ++
Documentation/networking/ethtool-netlink.rst | 79 ++++++++++++--------
include/linux/ethtool.h | 6 ++
include/linux/netdevice.h | 1 +
include/uapi/linux/ethtool_netlink.h | 2 +
net/core/dev.c | 13 ++++
net/ethtool/netlink.h | 2 +-
net/ethtool/rings.c | 37 ++++++++-
8 files changed, 115 insertions(+), 33 deletions(-)
diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
index 93369f0eb816..edc07cc290da 100644
--- a/Documentation/netlink/specs/ethtool.yaml
+++ b/Documentation/netlink/specs/ethtool.yaml
@@ -220,6 +220,12 @@ attribute-sets:
-
name: tx-push-buf-len-max
type: u32
+ -
+ name: header-data-split-thresh
+ type: u32
+ -
+ name: header-data-split-thresh-max
+ type: u32
-
name: mm-stat
@@ -1398,6 +1404,8 @@ operations:
- rx-push
- tx-push-buf-len
- tx-push-buf-len-max
+ - header-data-split-thresh
+ - header-data-split-thresh-max
dump: *ring-get-op
-
name: rings-set
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index b25926071ece..1fdfeca6f38e 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -878,24 +878,35 @@ Request contents:
Kernel response contents:
- ======================================= ====== ===========================
- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
- ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
- ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring
- ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring
- ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
- ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer
- ======================================= ====== ===========================
+ ================================================ ====== ====================
+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
+ ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
+ ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini
+ ring
+ ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo
+ ring
+ ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo
+ ring
+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on
+ the ring
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data
+ split
+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push
+ buffer
+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push
+ buffer
+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
+ header / data split
+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX`` u32 max threshold of
+ header / data split
+ ================================================ ====== ====================
``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
@@ -930,18 +941,22 @@ Sets ring sizes like ``ETHTOOL_SRINGPARAM`` ioctl request.
Request contents:
- ==================================== ====== ===========================
- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
- ==================================== ====== ===========================
+ ============================================ ====== =======================
+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the
+ ring
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
+ header / data split
+ ============================================ ====== =======================
Kernel checks that requested ring sizes do not exceed limits reported by
driver. Driver may impose additional constraints and may not support all
@@ -957,6 +972,10 @@ A bigger CQE can have more receive buffer pointers, and in turn the NIC can
transfer a bigger frame from wire. Based on the NIC hardware, the overall
completion queue size can be adjusted in the driver if CQE size is modified.
+``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` specifies the threshold value of
+header / data split feature. If a received packet size is larger than this
+threshold value, header and data will be split.
+
CHANNELS_GET
============
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index ecd52b99a63a..b4b6955d7ab9 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -79,6 +79,8 @@ enum {
* @cqe_size: Size of TX/RX completion queue event
* @tx_push_buf_len: Size of TX push buffer
* @tx_push_buf_max_len: Maximum allowed size of TX push buffer
+ * @hds_thresh: Threshold value of header-data-split-thresh
+ * @hds_thresh_max: Maximum allowed threshold of header-data-split-thresh
*/
struct kernel_ethtool_ringparam {
u32 rx_buf_len;
@@ -89,6 +91,8 @@ struct kernel_ethtool_ringparam {
u32 cqe_size;
u32 tx_push_buf_len;
u32 tx_push_buf_max_len;
+ u32 hds_thresh;
+ u32 hds_thresh_max;
};
/**
@@ -99,6 +103,7 @@ struct kernel_ethtool_ringparam {
* @ETHTOOL_RING_USE_RX_PUSH: capture for setting rx_push
* @ETHTOOL_RING_USE_TX_PUSH_BUF_LEN: capture for setting tx_push_buf_len
* @ETHTOOL_RING_USE_TCP_DATA_SPLIT: capture for setting tcp_data_split
+ * @ETHTOOL_RING_USE_HDS_THRS: capture for setting header-data-split-thresh
*/
enum ethtool_supported_ring_param {
ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
@@ -107,6 +112,7 @@ enum ethtool_supported_ring_param {
ETHTOOL_RING_USE_RX_PUSH = BIT(3),
ETHTOOL_RING_USE_TX_PUSH_BUF_LEN = BIT(4),
ETHTOOL_RING_USE_TCP_DATA_SPLIT = BIT(5),
+ ETHTOOL_RING_USE_HDS_THRS = BIT(6),
};
#define __ETH_RSS_HASH_BIT(bit) ((u32)1 << (bit))
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0aae346d919e..0c29068577c4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4028,6 +4028,7 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
u8 dev_xdp_prog_count(struct net_device *dev);
+u8 dev_xdp_sb_prog_count(struct net_device *dev);
int dev_xdp_propagate(struct net_device *dev, struct netdev_bpf *bpf);
u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode);
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 283305f6b063..7087c5c51017 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -364,6 +364,8 @@ enum {
ETHTOOL_A_RINGS_RX_PUSH, /* u8 */
ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, /* u32 */
ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, /* u32 */
+ ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH, /* u32 */
+ ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX, /* u32 */
/* add new constants above here */
__ETHTOOL_A_RINGS_CNT,
diff --git a/net/core/dev.c b/net/core/dev.c
index 13d00fc10f55..0321d7cbce0f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9474,6 +9474,19 @@ u8 dev_xdp_prog_count(struct net_device *dev)
}
EXPORT_SYMBOL_GPL(dev_xdp_prog_count);
+u8 dev_xdp_sb_prog_count(struct net_device *dev)
+{
+ u8 count = 0;
+ int i;
+
+ for (i = 0; i < __MAX_XDP_MODE; i++)
+ if (dev->xdp_state[i].prog &&
+ !dev->xdp_state[i].prog->aux->xdp_has_frags)
+ count++;
+ return count;
+}
+EXPORT_SYMBOL_GPL(dev_xdp_sb_prog_count);
+
int dev_xdp_propagate(struct net_device *dev, struct netdev_bpf *bpf)
{
if (!dev->netdev_ops->ndo_bpf)
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 203b08eb6c6f..9f51a252ebe0 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -455,7 +455,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT
extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1];
extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1];
extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1];
-extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1];
+extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX + 1];
extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
index c12ebb61394d..ca836aad3fa9 100644
--- a/net/ethtool/rings.c
+++ b/net/ethtool/rings.c
@@ -61,7 +61,11 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
nla_total_size(sizeof(u8)) + /* _RINGS_TX_PUSH */
nla_total_size(sizeof(u8))) + /* _RINGS_RX_PUSH */
nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN */
- nla_total_size(sizeof(u32)); /* _RINGS_TX_PUSH_BUF_LEN_MAX */
+ nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN_MAX */
+ nla_total_size(sizeof(u32)) +
+ /* _RINGS_HEADER_DATA_SPLIT_THRESH */
+ nla_total_size(sizeof(u32));
+ /* _RINGS_HEADER_DATA_SPLIT_THRESH_MAX*/
}
static int rings_fill_reply(struct sk_buff *skb,
@@ -108,7 +112,12 @@ static int rings_fill_reply(struct sk_buff *skb,
(nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,
kr->tx_push_buf_max_len) ||
nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,
- kr->tx_push_buf_len))))
+ kr->tx_push_buf_len))) ||
+ ((supported_ring_params & ETHTOOL_RING_USE_HDS_THRS) &&
+ (nla_put_u32(skb, ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH,
+ kr->hds_thresh) ||
+ nla_put_u32(skb, ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX,
+ kr->hds_thresh_max))))
return -EMSGSIZE;
return 0;
@@ -130,6 +139,7 @@ const struct nla_policy ethnl_rings_set_policy[] = {
[ETHTOOL_A_RINGS_TX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
[ETHTOOL_A_RINGS_RX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] = { .type = NLA_U32 },
+ [ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH] = { .type = NLA_U32 },
};
static int
@@ -155,6 +165,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info,
return -EOPNOTSUPP;
}
+ if (tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH] &&
+ !(ops->supported_ring_params & ETHTOOL_RING_USE_HDS_THRS)) {
+ NL_SET_ERR_MSG_ATTR(info->extack,
+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH],
+ "setting header-data-split-thresh is not supported");
+ return -EOPNOTSUPP;
+ }
+
if (tb[ETHTOOL_A_RINGS_CQE_SIZE] &&
!(ops->supported_ring_params & ETHTOOL_RING_USE_CQE_SIZE)) {
NL_SET_ERR_MSG_ATTR(info->extack,
@@ -222,9 +240,24 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
tb[ETHTOOL_A_RINGS_RX_PUSH], &mod);
ethnl_update_u32(&kernel_ringparam.tx_push_buf_len,
tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod);
+ ethnl_update_u32(&kernel_ringparam.hds_thresh,
+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH], &mod);
if (!mod)
return 0;
+ if (kernel_ringparam.tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
+ dev_xdp_sb_prog_count(dev)) {
+ NL_SET_ERR_MSG(info->extack,
+ "tcp-data-split can not be enabled with single buffer XDP");
+ return -EINVAL;
+ }
+
+ if (kernel_ringparam.hds_thresh > kernel_ringparam.hds_thresh_max) {
+ NL_SET_BAD_ATTR(info->extack,
+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX]);
+ return -ERANGE;
+ }
+
/* ensure new ring parameters are within limits */
if (ringparam.rx_pending > ringparam.rx_max_pending)
err_attr = tb[ETHTOOL_A_RINGS_RX];
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-13 17:32 ` [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh Taehee Yoo
@ 2024-11-15 4:24 ` Jakub Kicinski
2024-11-15 18:05 ` Taehee Yoo
2024-11-15 20:27 ` Saeed Mahameed
1 sibling, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2024-11-15 4:24 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, 13 Nov 2024 17:32:18 +0000 Taehee Yoo wrote:
> + ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
> + header / data split
> + ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX`` u32 max threshold of
nit: s/_HEADER_DATA_SPLIT_/_HDS_/ ;)
We can explain in the text that HDS stands for header data split.
Let's not bloat the code with 40+ character names...
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-15 4:24 ` Jakub Kicinski
@ 2024-11-15 18:05 ` Taehee Yoo
2024-11-15 19:18 ` Jakub Kicinski
0 siblings, 1 reply; 28+ messages in thread
From: Taehee Yoo @ 2024-11-15 18:05 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Fri, Nov 15, 2024 at 1:24 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
Hi Jakub,
Thanks a lot for the review!
> On Wed, 13 Nov 2024 17:32:18 +0000 Taehee Yoo wrote:
> > + ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
> > + header / data split
> > + ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX`` u32 max threshold of
>
> nit: s/_HEADER_DATA_SPLIT_/_HDS_/ ;)
>
> We can explain in the text that HDS stands for header data split.
> Let's not bloat the code with 40+ character names...
I'm not familiar with the ynl, but I think there are some dependencies
between Documentation/netlink/spec/ethtool.yaml.
Because It seems to generate ethtool-user.c code automatically based
on ethtool.yaml spec. I think If we change this name to HDS, it need to
change ethtool spec form "header-data-split-thresh" to "hds-thresh" too.
However, I agree with changing ethtool option name from
"header-data-split-thresh" to "hds-thresh". So, If I understand correctly,
what about changing ethtool option name from "header-data-split-thresh"
to "hds-thresh"?
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-15 18:05 ` Taehee Yoo
@ 2024-11-15 19:18 ` Jakub Kicinski
2024-11-17 12:31 ` Taehee Yoo
0 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2024-11-15 19:18 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Sat, 16 Nov 2024 03:05:33 +0900 Taehee Yoo wrote:
> > nit: s/_HEADER_DATA_SPLIT_/_HDS_/ ;)
> >
> > We can explain in the text that HDS stands for header data split.
> > Let's not bloat the code with 40+ character names...
>
> I'm not familiar with the ynl, but I think there are some dependencies
> between Documentation/netlink/spec/ethtool.yaml.
> Because It seems to generate ethtool-user.c code automatically based
> on ethtool.yaml spec. I think If we change this name to HDS, it need to
> change ethtool spec form "header-data-split-thresh" to "hds-thresh" too.
>
> However, I agree with changing ethtool option name from
> "header-data-split-thresh" to "hds-thresh". So, If I understand correctly,
> what about changing ethtool option name from "header-data-split-thresh"
> to "hds-thresh"?
Correct. And yes, you'll need to change in both places the spec and the
header. FWIW Stanislav is working on auto-generating the ethtool header
from the YAML spec, hopefully that will avoid having to change both
places long term.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-15 19:18 ` Jakub Kicinski
@ 2024-11-17 12:31 ` Taehee Yoo
0 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-17 12:31 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Sat, Nov 16, 2024 at 4:18 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 16 Nov 2024 03:05:33 +0900 Taehee Yoo wrote:
> > > nit: s/_HEADER_DATA_SPLIT_/_HDS_/ ;)
> > >
> > > We can explain in the text that HDS stands for header data split.
> > > Let's not bloat the code with 40+ character names...
> >
> > I'm not familiar with the ynl, but I think there are some dependencies
> > between Documentation/netlink/spec/ethtool.yaml.
> > Because It seems to generate ethtool-user.c code automatically based
> > on ethtool.yaml spec. I think If we change this name to HDS, it need to
> > change ethtool spec form "header-data-split-thresh" to "hds-thresh" too.
> >
> > However, I agree with changing ethtool option name from
> > "header-data-split-thresh" to "hds-thresh". So, If I understand correctly,
> > what about changing ethtool option name from "header-data-split-thresh"
> > to "hds-thresh"?
>
> Correct. And yes, you'll need to change in both places the spec and the
> header. FWIW Stanislav is working on auto-generating the ethtool header
> from the YAML spec, hopefully that will avoid having to change both
> places long term.
Thanks! I will change both things.
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-13 17:32 ` [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh Taehee Yoo
2024-11-15 4:24 ` Jakub Kicinski
@ 2024-11-15 20:27 ` Saeed Mahameed
2024-11-17 14:26 ` Taehee Yoo
1 sibling, 1 reply; 28+ messages in thread
From: Saeed Mahameed @ 2024-11-15 20:27 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On 13 Nov 17:32, Taehee Yoo wrote:
>The header-data-split-thresh option configures the threshold value of
>the header-data-split.
>If a received packet size is larger than this threshold value, a packet
>will be split into header and payload.
>The header indicates TCP and UDP header, but it depends on driver spec.
>The bnxt_en driver supports HDS(Header-Data-Split) configuration at
>FW level, affecting TCP and UDP too.
>So, If header-data-split-thresh is set, it affects UDP and TCP packets.
>
>Example:
> # ethtool -G <interface name> header-data-split-thresh <value>
>
> # ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
> # ethtool -g enp14s0f0np0
> Ring parameters for enp14s0f0np0:
> Pre-set maximums:
> ...
> Header data split thresh: 256
> Current hardware settings:
> ...
> TCP data split: on
> Header data split thresh: 256
>
>The default/min/max values are not defined in the ethtool so the drivers
>should define themself.
>The 0 value means that all TCP/UDP packets' header and payload
>will be split.
>
Users will need default/min/max so they know the capabilities of current
device, otherwise it's a guessing game.. why not add it ? also we need an
indication of when the driver doesn't support changing this config but
still want to report default maybe when (min==max). And what is the default
expected by drivers?
>In general cases, HDS can increase the overhead of host memory and PCIe
>bus because it copies data twice.
what copy twice ? do you mean copy header into skb->data ?
this is driver implementation, other dirvers don't copy at all..
>So users should consider the overhead of HDS.
>If the HDS threshold is 0 and then the copybreak is 256 and the packet's
>payload is 8 bytes.
>So, two pages are used, one for headers and one for payloads.
>By the copybreak, only the headers page is copied and then it can be
>reused immediately and then a payloads page is still used.
>If the HDS threshold is larger than 8, both headers and payloads are
>copied and then a page can be recycled immediately.
>So, too low HDS threshold has larger disadvantages than advantages
>aspect of performance in general cases.
>Users should consider the overhead of this feature.
>
I really don't understand this example, rx-copybreak and hds shouldn't be
mixed up and the performance analysis you are describing above is driver
specific, some drivers build skbs around the whole frame, and in hds around
the header only, meaning for non hds case rx-copybreak doesn't make lots of
sense and has no advantage, and for hds it's only one copy..
Maybe we should define what rx-copybreak should behave like when enabled
with hds.. for many drivers they implement copybreak by copying the whole
fresh into a fresh skb->data wihtout splitting the header,
which would be wrong in case of hds enabled.
>Tested-by: Stanislav Fomichev <sdf@fomichev.me>
>Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>---
>
>v5:
> - No changes.
>
>v4:
> - Fix 80 charactor wrap.
> - Rename from tcp-data-split-thresh to header-data-split-thresh
> - Add description about overhead of HDS.
> - Add ETHTOOL_RING_USE_HDS_THRS flag.
> - Add dev_xdp_sb_prog_count() helper.
> - Add Test tag from Stanislav.
>
>v3:
> - Fix documentation and ynl
> - Update error messages
> - Validate configuration of tcp-data-split and tcp-data-split-thresh
>
>v2:
> - Patch added.
>
> Documentation/netlink/specs/ethtool.yaml | 8 ++
> Documentation/networking/ethtool-netlink.rst | 79 ++++++++++++--------
> include/linux/ethtool.h | 6 ++
> include/linux/netdevice.h | 1 +
> include/uapi/linux/ethtool_netlink.h | 2 +
> net/core/dev.c | 13 ++++
> net/ethtool/netlink.h | 2 +-
> net/ethtool/rings.c | 37 ++++++++-
> 8 files changed, 115 insertions(+), 33 deletions(-)
>
>diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
>index 93369f0eb816..edc07cc290da 100644
>--- a/Documentation/netlink/specs/ethtool.yaml
>+++ b/Documentation/netlink/specs/ethtool.yaml
>@@ -220,6 +220,12 @@ attribute-sets:
> -
> name: tx-push-buf-len-max
> type: u32
>+ -
>+ name: header-data-split-thresh
>+ type: u32
>+ -
>+ name: header-data-split-thresh-max
>+ type: u32
>
> -
> name: mm-stat
>@@ -1398,6 +1404,8 @@ operations:
> - rx-push
> - tx-push-buf-len
> - tx-push-buf-len-max
>+ - header-data-split-thresh
>+ - header-data-split-thresh-max
> dump: *ring-get-op
> -
> name: rings-set
>diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
>index b25926071ece..1fdfeca6f38e 100644
>--- a/Documentation/networking/ethtool-netlink.rst
>+++ b/Documentation/networking/ethtool-netlink.rst
>@@ -878,24 +878,35 @@ Request contents:
>
> Kernel response contents:
>
>- ======================================= ====== ===========================
>- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
>- ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
>- ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring
>- ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring
>- ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
>- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
>- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
>- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
>- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
>- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
>- ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
>- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
>- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
>- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
>- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
>- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer
>- ======================================= ====== ===========================
>+ ================================================ ====== ====================
>+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
>+ ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
>+ ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini
>+ ring
>+ ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo
>+ ring
>+ ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
>+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
>+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
>+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo
>+ ring
>+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
>+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on
>+ the ring
>+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data
>+ split
>+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
>+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
>+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
>+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push
>+ buffer
>+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push
>+ buffer
>+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
>+ header / data split
>+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX`` u32 max threshold of
>+ header / data split
>+ ================================================ ====== ====================
>
> ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
> page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
>@@ -930,18 +941,22 @@ Sets ring sizes like ``ETHTOOL_SRINGPARAM`` ioctl request.
>
> Request contents:
>
>- ==================================== ====== ===========================
>- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
>- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
>- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
>- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
>- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
>- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
>- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
>- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
>- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
>- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
>- ==================================== ====== ===========================
>+ ============================================ ====== =======================
>+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
>+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
>+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
>+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
>+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
>+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the
>+ ring
>+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
>+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
>+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
>+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
>+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
>+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
>+ header / data split
>+ ============================================ ====== =======================
>
> Kernel checks that requested ring sizes do not exceed limits reported by
> driver. Driver may impose additional constraints and may not support all
>@@ -957,6 +972,10 @@ A bigger CQE can have more receive buffer pointers, and in turn the NIC can
> transfer a bigger frame from wire. Based on the NIC hardware, the overall
> completion queue size can be adjusted in the driver if CQE size is modified.
>
>+``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` specifies the threshold value of
>+header / data split feature. If a received packet size is larger than this
>+threshold value, header and data will be split.
>+
> CHANNELS_GET
> ============
>
>diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>index ecd52b99a63a..b4b6955d7ab9 100644
>--- a/include/linux/ethtool.h
>+++ b/include/linux/ethtool.h
>@@ -79,6 +79,8 @@ enum {
> * @cqe_size: Size of TX/RX completion queue event
> * @tx_push_buf_len: Size of TX push buffer
> * @tx_push_buf_max_len: Maximum allowed size of TX push buffer
>+ * @hds_thresh: Threshold value of header-data-split-thresh
>+ * @hds_thresh_max: Maximum allowed threshold of header-data-split-thresh
> */
> struct kernel_ethtool_ringparam {
> u32 rx_buf_len;
>@@ -89,6 +91,8 @@ struct kernel_ethtool_ringparam {
> u32 cqe_size;
> u32 tx_push_buf_len;
> u32 tx_push_buf_max_len;
>+ u32 hds_thresh;
>+ u32 hds_thresh_max;
> };
>
> /**
>@@ -99,6 +103,7 @@ struct kernel_ethtool_ringparam {
> * @ETHTOOL_RING_USE_RX_PUSH: capture for setting rx_push
> * @ETHTOOL_RING_USE_TX_PUSH_BUF_LEN: capture for setting tx_push_buf_len
> * @ETHTOOL_RING_USE_TCP_DATA_SPLIT: capture for setting tcp_data_split
>+ * @ETHTOOL_RING_USE_HDS_THRS: capture for setting header-data-split-thresh
> */
> enum ethtool_supported_ring_param {
> ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
>@@ -107,6 +112,7 @@ enum ethtool_supported_ring_param {
> ETHTOOL_RING_USE_RX_PUSH = BIT(3),
> ETHTOOL_RING_USE_TX_PUSH_BUF_LEN = BIT(4),
> ETHTOOL_RING_USE_TCP_DATA_SPLIT = BIT(5),
>+ ETHTOOL_RING_USE_HDS_THRS = BIT(6),
> };
>
> #define __ETH_RSS_HASH_BIT(bit) ((u32)1 << (bit))
>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>index 0aae346d919e..0c29068577c4 100644
>--- a/include/linux/netdevice.h
>+++ b/include/linux/netdevice.h
>@@ -4028,6 +4028,7 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>
> int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
> u8 dev_xdp_prog_count(struct net_device *dev);
>+u8 dev_xdp_sb_prog_count(struct net_device *dev);
> int dev_xdp_propagate(struct net_device *dev, struct netdev_bpf *bpf);
> u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode);
>
>diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
>index 283305f6b063..7087c5c51017 100644
>--- a/include/uapi/linux/ethtool_netlink.h
>+++ b/include/uapi/linux/ethtool_netlink.h
>@@ -364,6 +364,8 @@ enum {
> ETHTOOL_A_RINGS_RX_PUSH, /* u8 */
> ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, /* u32 */
> ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, /* u32 */
>+ ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH, /* u32 */
>+ ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX, /* u32 */
>
> /* add new constants above here */
> __ETHTOOL_A_RINGS_CNT,
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 13d00fc10f55..0321d7cbce0f 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -9474,6 +9474,19 @@ u8 dev_xdp_prog_count(struct net_device *dev)
> }
> EXPORT_SYMBOL_GPL(dev_xdp_prog_count);
>
>+u8 dev_xdp_sb_prog_count(struct net_device *dev)
>+{
>+ u8 count = 0;
>+ int i;
>+
>+ for (i = 0; i < __MAX_XDP_MODE; i++)
>+ if (dev->xdp_state[i].prog &&
>+ !dev->xdp_state[i].prog->aux->xdp_has_frags)
>+ count++;
>+ return count;
>+}
>+EXPORT_SYMBOL_GPL(dev_xdp_sb_prog_count);
>+
> int dev_xdp_propagate(struct net_device *dev, struct netdev_bpf *bpf)
> {
> if (!dev->netdev_ops->ndo_bpf)
>diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
>index 203b08eb6c6f..9f51a252ebe0 100644
>--- a/net/ethtool/netlink.h
>+++ b/net/ethtool/netlink.h
>@@ -455,7 +455,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT
> extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1];
> extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1];
> extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1];
>-extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1];
>+extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX + 1];
> extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
> extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
> extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
>diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
>index c12ebb61394d..ca836aad3fa9 100644
>--- a/net/ethtool/rings.c
>+++ b/net/ethtool/rings.c
>@@ -61,7 +61,11 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
> nla_total_size(sizeof(u8)) + /* _RINGS_TX_PUSH */
> nla_total_size(sizeof(u8))) + /* _RINGS_RX_PUSH */
> nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN */
>- nla_total_size(sizeof(u32)); /* _RINGS_TX_PUSH_BUF_LEN_MAX */
>+ nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN_MAX */
>+ nla_total_size(sizeof(u32)) +
>+ /* _RINGS_HEADER_DATA_SPLIT_THRESH */
>+ nla_total_size(sizeof(u32));
>+ /* _RINGS_HEADER_DATA_SPLIT_THRESH_MAX*/
> }
>
> static int rings_fill_reply(struct sk_buff *skb,
>@@ -108,7 +112,12 @@ static int rings_fill_reply(struct sk_buff *skb,
> (nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,
> kr->tx_push_buf_max_len) ||
> nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,
>- kr->tx_push_buf_len))))
>+ kr->tx_push_buf_len))) ||
>+ ((supported_ring_params & ETHTOOL_RING_USE_HDS_THRS) &&
>+ (nla_put_u32(skb, ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH,
>+ kr->hds_thresh) ||
>+ nla_put_u32(skb, ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX,
>+ kr->hds_thresh_max))))
> return -EMSGSIZE;
>
> return 0;
>@@ -130,6 +139,7 @@ const struct nla_policy ethnl_rings_set_policy[] = {
> [ETHTOOL_A_RINGS_TX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
> [ETHTOOL_A_RINGS_RX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
> [ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] = { .type = NLA_U32 },
>+ [ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH] = { .type = NLA_U32 },
> };
>
> static int
>@@ -155,6 +165,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info,
> return -EOPNOTSUPP;
> }
>
>+ if (tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH] &&
>+ !(ops->supported_ring_params & ETHTOOL_RING_USE_HDS_THRS)) {
>+ NL_SET_ERR_MSG_ATTR(info->extack,
>+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH],
>+ "setting header-data-split-thresh is not supported");
>+ return -EOPNOTSUPP;
>+ }
>+
> if (tb[ETHTOOL_A_RINGS_CQE_SIZE] &&
> !(ops->supported_ring_params & ETHTOOL_RING_USE_CQE_SIZE)) {
> NL_SET_ERR_MSG_ATTR(info->extack,
>@@ -222,9 +240,24 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
> tb[ETHTOOL_A_RINGS_RX_PUSH], &mod);
> ethnl_update_u32(&kernel_ringparam.tx_push_buf_len,
> tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod);
>+ ethnl_update_u32(&kernel_ringparam.hds_thresh,
>+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH], &mod);
> if (!mod)
> return 0;
>
>+ if (kernel_ringparam.tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
>+ dev_xdp_sb_prog_count(dev)) {
>+ NL_SET_ERR_MSG(info->extack,
>+ "tcp-data-split can not be enabled with single buffer XDP");
>+ return -EINVAL;
>+ }
>+
>+ if (kernel_ringparam.hds_thresh > kernel_ringparam.hds_thresh_max) {
>+ NL_SET_BAD_ATTR(info->extack,
>+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX]);
>+ return -ERANGE;
>+ }
>+
> /* ensure new ring parameters are within limits */
> if (ringparam.rx_pending > ringparam.rx_max_pending)
> err_attr = tb[ETHTOOL_A_RINGS_RX];
>--
>2.34.1
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh
2024-11-15 20:27 ` Saeed Mahameed
@ 2024-11-17 14:26 ` Taehee Yoo
0 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-17 14:26 UTC (permalink / raw)
To: Saeed Mahameed
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Sat, Nov 16, 2024 at 5:27 AM Saeed Mahameed <saeed@kernel.org> wrote:
>
Hi Saeed,
Thank you so much for the review!
> On 13 Nov 17:32, Taehee Yoo wrote:
> >The header-data-split-thresh option configures the threshold value of
> >the header-data-split.
> >If a received packet size is larger than this threshold value, a packet
> >will be split into header and payload.
> >The header indicates TCP and UDP header, but it depends on driver spec.
> >The bnxt_en driver supports HDS(Header-Data-Split) configuration at
> >FW level, affecting TCP and UDP too.
> >So, If header-data-split-thresh is set, it affects UDP and TCP packets.
> >
> >Example:
> > # ethtool -G <interface name> header-data-split-thresh <value>
> >
> > # ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
> > # ethtool -g enp14s0f0np0
> > Ring parameters for enp14s0f0np0:
> > Pre-set maximums:
> > ...
> > Header data split thresh: 256
> > Current hardware settings:
> > ...
> > TCP data split: on
> > Header data split thresh: 256
> >
> >The default/min/max values are not defined in the ethtool so the drivers
> >should define themself.
> >The 0 value means that all TCP/UDP packets' header and payload
> >will be split.
> >
>
> Users will need default/min/max so they know the capabilities of current
> device, otherwise it's a guessing game.. why not add it ? also we need an
> indication of when the driver doesn't support changing this config but
> still want to report default maybe when (min==max). And what is the default
> expected by drivers?
I defined ETHTOOL_A_RINGS_HDS_THRESH_MAX, which indicates the maximum
value of HDS-threshold from drivers. ethtool will show this value.
I think the description needs to be changed.
I'm sure that if a driver doesn't support changing this value,
it indicates that the driver only supports 0 as hds-threshold value.
If so, I think It's okay to set ETHTOOL_A_RINGS_HDS_THRESH_MAX to 0.
This is a GVE case, GVE doesn't support changing this value,
it support only 0.
I'm also sure that there is no case that NIC doesn't support changing
HDS-thresh but the default value is not 0.
I think the default value would be 0 if there are no special cases.
The bnxt_en driver's default value is 256 because it has been being used.
>
> >In general cases, HDS can increase the overhead of host memory and PCIe
> >bus because it copies data twice.
>
> what copy twice ? do you mean copy header into skb->data ?
> this is driver implementation, other dirvers don't copy at all..
Sorry for the ambiguous description,
This means that DMA transfers the header/payload separately.
>
> >So users should consider the overhead of HDS.
> >If the HDS threshold is 0 and then the copybreak is 256 and the packet's
> >payload is 8 bytes.
> >So, two pages are used, one for headers and one for payloads.
> >By the copybreak, only the headers page is copied and then it can be
> >reused immediately and then a payloads page is still used.
> >If the HDS threshold is larger than 8, both headers and payloads are
> >copied and then a page can be recycled immediately.
> >So, too low HDS threshold has larger disadvantages than advantages
> >aspect of performance in general cases.
> >Users should consider the overhead of this feature.
> >
>
> I really don't understand this example, rx-copybreak and hds shouldn't be
> mixed up and the performance analysis you are describing above is driver
> specific, some drivers build skbs around the whole frame, and in hds around
> the header only, meaning for non hds case rx-copybreak doesn't make lots of
> sense and has no advantage, and for hds it's only one copy..
>
> Maybe we should define what rx-copybreak should behave like when enabled
> with hds.. for many drivers they implement copybreak by copying the whole
> fresh into a fresh skb->data wihtout splitting the header,
> which would be wrong in case of hds enabled.
>
Sorry for the driver-specific description, I will remove this example.
I thought HDS with rx-copybreak is okay, but as per your concern,
it could change that user expects for rx-copybreak.
what do you think about it? should we disallow HDS + rx-copybreak?
Thanks a lot!
Taehee Yoo
> >Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> >Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> >---
> >
> >v5:
> > - No changes.
> >
> >v4:
> > - Fix 80 charactor wrap.
> > - Rename from tcp-data-split-thresh to header-data-split-thresh
> > - Add description about overhead of HDS.
> > - Add ETHTOOL_RING_USE_HDS_THRS flag.
> > - Add dev_xdp_sb_prog_count() helper.
> > - Add Test tag from Stanislav.
> >
> >v3:
> > - Fix documentation and ynl
> > - Update error messages
> > - Validate configuration of tcp-data-split and tcp-data-split-thresh
> >
> >v2:
> > - Patch added.
> >
> > Documentation/netlink/specs/ethtool.yaml | 8 ++
> > Documentation/networking/ethtool-netlink.rst | 79 ++++++++++++--------
> > include/linux/ethtool.h | 6 ++
> > include/linux/netdevice.h | 1 +
> > include/uapi/linux/ethtool_netlink.h | 2 +
> > net/core/dev.c | 13 ++++
> > net/ethtool/netlink.h | 2 +-
> > net/ethtool/rings.c | 37 ++++++++-
> > 8 files changed, 115 insertions(+), 33 deletions(-)
> >
> >diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
> >index 93369f0eb816..edc07cc290da 100644
> >--- a/Documentation/netlink/specs/ethtool.yaml
> >+++ b/Documentation/netlink/specs/ethtool.yaml
> >@@ -220,6 +220,12 @@ attribute-sets:
> > -
> > name: tx-push-buf-len-max
> > type: u32
> >+ -
> >+ name: header-data-split-thresh
> >+ type: u32
> >+ -
> >+ name: header-data-split-thresh-max
> >+ type: u32
> >
> > -
> > name: mm-stat
> >@@ -1398,6 +1404,8 @@ operations:
> > - rx-push
> > - tx-push-buf-len
> > - tx-push-buf-len-max
> >+ - header-data-split-thresh
> >+ - header-data-split-thresh-max
> > dump: *ring-get-op
> > -
> > name: rings-set
> >diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
> >index b25926071ece..1fdfeca6f38e 100644
> >--- a/Documentation/networking/ethtool-netlink.rst
> >+++ b/Documentation/networking/ethtool-netlink.rst
> >@@ -878,24 +878,35 @@ Request contents:
> >
> > Kernel response contents:
> >
> >- ======================================= ====== ===========================
> >- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
> >- ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
> >- ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring
> >- ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring
> >- ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
> >- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
> >- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
> >- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
> >- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
> >- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
> >- ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
> >- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
> >- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
> >- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
> >- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
> >- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer
> >- ======================================= ====== ===========================
> >+ ================================================ ====== ====================
> >+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
> >+ ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
> >+ ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini
> >+ ring
> >+ ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo
> >+ ring
> >+ ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
> >+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
> >+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
> >+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo
> >+ ring
> >+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
> >+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on
> >+ the ring
> >+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data
> >+ split
> >+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
> >+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
> >+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
> >+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push
> >+ buffer
> >+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push
> >+ buffer
> >+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
> >+ header / data split
> >+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX`` u32 max threshold of
> >+ header / data split
> >+ ================================================ ====== ====================
> >
> > ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
> > page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
> >@@ -930,18 +941,22 @@ Sets ring sizes like ``ETHTOOL_SRINGPARAM`` ioctl request.
> >
> > Request contents:
> >
> >- ==================================== ====== ===========================
> >- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
> >- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
> >- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
> >- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
> >- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
> >- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
> >- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
> >- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
> >- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
> >- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
> >- ==================================== ====== ===========================
> >+ ============================================ ====== =======================
> >+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
> >+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
> >+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
> >+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
> >+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
> >+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the
> >+ ring
> >+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
> >+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
> >+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
> >+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
> >+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
> >+ ``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` u32 threshold of
> >+ header / data split
> >+ ============================================ ====== =======================
> >
> > Kernel checks that requested ring sizes do not exceed limits reported by
> > driver. Driver may impose additional constraints and may not support all
> >@@ -957,6 +972,10 @@ A bigger CQE can have more receive buffer pointers, and in turn the NIC can
> > transfer a bigger frame from wire. Based on the NIC hardware, the overall
> > completion queue size can be adjusted in the driver if CQE size is modified.
> >
> >+``ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH`` specifies the threshold value of
> >+header / data split feature. If a received packet size is larger than this
> >+threshold value, header and data will be split.
> >+
> > CHANNELS_GET
> > ============
> >
> >diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> >index ecd52b99a63a..b4b6955d7ab9 100644
> >--- a/include/linux/ethtool.h
> >+++ b/include/linux/ethtool.h
> >@@ -79,6 +79,8 @@ enum {
> > * @cqe_size: Size of TX/RX completion queue event
> > * @tx_push_buf_len: Size of TX push buffer
> > * @tx_push_buf_max_len: Maximum allowed size of TX push buffer
> >+ * @hds_thresh: Threshold value of header-data-split-thresh
> >+ * @hds_thresh_max: Maximum allowed threshold of header-data-split-thresh
> > */
> > struct kernel_ethtool_ringparam {
> > u32 rx_buf_len;
> >@@ -89,6 +91,8 @@ struct kernel_ethtool_ringparam {
> > u32 cqe_size;
> > u32 tx_push_buf_len;
> > u32 tx_push_buf_max_len;
> >+ u32 hds_thresh;
> >+ u32 hds_thresh_max;
> > };
> >
> > /**
> >@@ -99,6 +103,7 @@ struct kernel_ethtool_ringparam {
> > * @ETHTOOL_RING_USE_RX_PUSH: capture for setting rx_push
> > * @ETHTOOL_RING_USE_TX_PUSH_BUF_LEN: capture for setting tx_push_buf_len
> > * @ETHTOOL_RING_USE_TCP_DATA_SPLIT: capture for setting tcp_data_split
> >+ * @ETHTOOL_RING_USE_HDS_THRS: capture for setting header-data-split-thresh
> > */
> > enum ethtool_supported_ring_param {
> > ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
> >@@ -107,6 +112,7 @@ enum ethtool_supported_ring_param {
> > ETHTOOL_RING_USE_RX_PUSH = BIT(3),
> > ETHTOOL_RING_USE_TX_PUSH_BUF_LEN = BIT(4),
> > ETHTOOL_RING_USE_TCP_DATA_SPLIT = BIT(5),
> >+ ETHTOOL_RING_USE_HDS_THRS = BIT(6),
> > };
> >
> > #define __ETH_RSS_HASH_BIT(bit) ((u32)1 << (bit))
> >diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >index 0aae346d919e..0c29068577c4 100644
> >--- a/include/linux/netdevice.h
> >+++ b/include/linux/netdevice.h
> >@@ -4028,6 +4028,7 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
> >
> > int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
> > u8 dev_xdp_prog_count(struct net_device *dev);
> >+u8 dev_xdp_sb_prog_count(struct net_device *dev);
> > int dev_xdp_propagate(struct net_device *dev, struct netdev_bpf *bpf);
> > u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode);
> >
> >diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
> >index 283305f6b063..7087c5c51017 100644
> >--- a/include/uapi/linux/ethtool_netlink.h
> >+++ b/include/uapi/linux/ethtool_netlink.h
> >@@ -364,6 +364,8 @@ enum {
> > ETHTOOL_A_RINGS_RX_PUSH, /* u8 */
> > ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, /* u32 */
> > ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, /* u32 */
> >+ ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH, /* u32 */
> >+ ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX, /* u32 */
> >
> > /* add new constants above here */
> > __ETHTOOL_A_RINGS_CNT,
> >diff --git a/net/core/dev.c b/net/core/dev.c
> >index 13d00fc10f55..0321d7cbce0f 100644
> >--- a/net/core/dev.c
> >+++ b/net/core/dev.c
> >@@ -9474,6 +9474,19 @@ u8 dev_xdp_prog_count(struct net_device *dev)
> > }
> > EXPORT_SYMBOL_GPL(dev_xdp_prog_count);
> >
> >+u8 dev_xdp_sb_prog_count(struct net_device *dev)
> >+{
> >+ u8 count = 0;
> >+ int i;
> >+
> >+ for (i = 0; i < __MAX_XDP_MODE; i++)
> >+ if (dev->xdp_state[i].prog &&
> >+ !dev->xdp_state[i].prog->aux->xdp_has_frags)
> >+ count++;
> >+ return count;
> >+}
> >+EXPORT_SYMBOL_GPL(dev_xdp_sb_prog_count);
> >+
> > int dev_xdp_propagate(struct net_device *dev, struct netdev_bpf *bpf)
> > {
> > if (!dev->netdev_ops->ndo_bpf)
> >diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
> >index 203b08eb6c6f..9f51a252ebe0 100644
> >--- a/net/ethtool/netlink.h
> >+++ b/net/ethtool/netlink.h
> >@@ -455,7 +455,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT
> > extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1];
> > extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1];
> > extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1];
> >-extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1];
> >+extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX + 1];
> > extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
> > extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
> > extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
> >diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
> >index c12ebb61394d..ca836aad3fa9 100644
> >--- a/net/ethtool/rings.c
> >+++ b/net/ethtool/rings.c
> >@@ -61,7 +61,11 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
> > nla_total_size(sizeof(u8)) + /* _RINGS_TX_PUSH */
> > nla_total_size(sizeof(u8))) + /* _RINGS_RX_PUSH */
> > nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN */
> >- nla_total_size(sizeof(u32)); /* _RINGS_TX_PUSH_BUF_LEN_MAX */
> >+ nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN_MAX */
> >+ nla_total_size(sizeof(u32)) +
> >+ /* _RINGS_HEADER_DATA_SPLIT_THRESH */
> >+ nla_total_size(sizeof(u32));
> >+ /* _RINGS_HEADER_DATA_SPLIT_THRESH_MAX*/
> > }
> >
> > static int rings_fill_reply(struct sk_buff *skb,
> >@@ -108,7 +112,12 @@ static int rings_fill_reply(struct sk_buff *skb,
> > (nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,
> > kr->tx_push_buf_max_len) ||
> > nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,
> >- kr->tx_push_buf_len))))
> >+ kr->tx_push_buf_len))) ||
> >+ ((supported_ring_params & ETHTOOL_RING_USE_HDS_THRS) &&
> >+ (nla_put_u32(skb, ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH,
> >+ kr->hds_thresh) ||
> >+ nla_put_u32(skb, ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX,
> >+ kr->hds_thresh_max))))
> > return -EMSGSIZE;
> >
> > return 0;
> >@@ -130,6 +139,7 @@ const struct nla_policy ethnl_rings_set_policy[] = {
> > [ETHTOOL_A_RINGS_TX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
> > [ETHTOOL_A_RINGS_RX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
> > [ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] = { .type = NLA_U32 },
> >+ [ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH] = { .type = NLA_U32 },
> > };
> >
> > static int
> >@@ -155,6 +165,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info,
> > return -EOPNOTSUPP;
> > }
> >
> >+ if (tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH] &&
> >+ !(ops->supported_ring_params & ETHTOOL_RING_USE_HDS_THRS)) {
> >+ NL_SET_ERR_MSG_ATTR(info->extack,
> >+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH],
> >+ "setting header-data-split-thresh is not supported");
> >+ return -EOPNOTSUPP;
> >+ }
> >+
> > if (tb[ETHTOOL_A_RINGS_CQE_SIZE] &&
> > !(ops->supported_ring_params & ETHTOOL_RING_USE_CQE_SIZE)) {
> > NL_SET_ERR_MSG_ATTR(info->extack,
> >@@ -222,9 +240,24 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
> > tb[ETHTOOL_A_RINGS_RX_PUSH], &mod);
> > ethnl_update_u32(&kernel_ringparam.tx_push_buf_len,
> > tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod);
> >+ ethnl_update_u32(&kernel_ringparam.hds_thresh,
> >+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH], &mod);
> > if (!mod)
> > return 0;
> >
> >+ if (kernel_ringparam.tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
> >+ dev_xdp_sb_prog_count(dev)) {
> >+ NL_SET_ERR_MSG(info->extack,
> >+ "tcp-data-split can not be enabled with single buffer XDP");
> >+ return -EINVAL;
> >+ }
> >+
> >+ if (kernel_ringparam.hds_thresh > kernel_ringparam.hds_thresh_max) {
> >+ NL_SET_BAD_ATTR(info->extack,
> >+ tb[ETHTOOL_A_RINGS_HEADER_DATA_SPLIT_THRESH_MAX]);
> >+ return -ERANGE;
> >+ }
> >+
> > /* ensure new ring parameters are within limits */
> > if (ringparam.rx_pending > ringparam.rx_max_pending)
> > err_attr = tb[ETHTOOL_A_RINGS_RX];
> >--
> >2.34.1
> >
> >
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
` (3 preceding siblings ...)
2024-11-13 17:32 ` [PATCH net-next v5 4/7] net: ethtool: add support for configuring header-data-split-thresh Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-14 22:54 ` Andy Gospodarek
2024-11-13 17:32 ` [PATCH net-next v5 6/7] net: devmem: add ring parameter filtering Taehee Yoo
` (3 subsequent siblings)
8 siblings, 1 reply; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
The bnxt_en driver has configured the hds_threshold value automatically
when TPA is enabled based on the rx-copybreak default value.
Now the header-data-split-thresh ethtool command is added, so it adds an
implementation of header-data-split-thresh option.
Configuration of the header-data-split-thresh is allowed only when
the header-data-split is enabled. The default value of
header-data-split-thresh is 256, which is the default value of
rx-copybreak, which used to be the hds_thresh value.
# Example:
# ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
# ethtool -g enp14s0f0np0
Ring parameters for enp14s0f0np0:
Pre-set maximums:
...
Header data split thresh: 256
Current hardware settings:
...
TCP data split: on
Header data split thresh: 256
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- No changes.
v4:
- Reduce hole in struct bnxt.
- Add ETHTOOL_RING_USE_HDS_THRS to indicate bnxt_en driver support
header-data-split-thresh option.
- Add Test tag from Stanislav.
v3:
- Drop validation logic tcp-data-split and tcp-data-split-thresh.
v2:
- Patch added.
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 ++
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 7 ++++++-
3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 608bcca71676..27d6bac3a69a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4454,6 +4454,7 @@ void bnxt_set_tpa_flags(struct bnxt *bp)
static void bnxt_init_ring_params(struct bnxt *bp)
{
bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK;
+ bp->hds_threshold = BNXT_DEFAULT_RX_COPYBREAK;
}
/* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must
@@ -6429,7 +6430,7 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic)
VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6);
req->enables |=
cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID);
- req->hds_threshold = cpu_to_le16(bp->rx_copybreak);
+ req->hds_threshold = cpu_to_le16(bp->hds_threshold);
}
req->vnic_id = cpu_to_le32(vnic->fw_vnic_id);
return hwrm_req_send(bp, req);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 3a7d2f3ebb2a..058db26fb255 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2362,6 +2362,8 @@ struct bnxt {
u8 q_ids[BNXT_MAX_QUEUE];
u8 max_q;
u8 num_tc;
+#define BNXT_HDS_THRESHOLD_MAX 256
+ u16 hds_threshold;
unsigned int current_interval;
#define BNXT_TIMER_INTERVAL HZ
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index b0054eef389b..a51a4cedecb9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -833,6 +833,9 @@ static void bnxt_get_ringparam(struct net_device *dev,
ering->rx_pending = bp->rx_ring_size;
ering->rx_jumbo_pending = bp->rx_agg_ring_size;
ering->tx_pending = bp->tx_ring_size;
+
+ kernel_ering->hds_thresh = bp->hds_threshold;
+ kernel_ering->hds_thresh_max = BNXT_HDS_THRESHOLD_MAX;
}
static int bnxt_set_ringparam(struct net_device *dev,
@@ -868,6 +871,7 @@ static int bnxt_set_ringparam(struct net_device *dev,
else if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_UNKNOWN)
bp->flags &= ~BNXT_FLAG_HDS;
+ bp->hds_threshold = (u16)kernel_ering->hds_thresh;
bp->rx_ring_size = ering->rx_pending;
bp->tx_ring_size = ering->tx_pending;
bnxt_set_ring_params(bp);
@@ -5363,7 +5367,8 @@ const struct ethtool_ops bnxt_ethtool_ops = {
ETHTOOL_COALESCE_STATS_BLOCK_USECS |
ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
ETHTOOL_COALESCE_USE_CQE,
- .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT,
+ .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT |
+ ETHTOOL_RING_USE_HDS_THRS,
.get_link_ksettings = bnxt_get_link_ksettings,
.set_link_ksettings = bnxt_set_link_ksettings,
.get_fec_stats = bnxt_get_fec_stats,
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command
2024-11-13 17:32 ` [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command Taehee Yoo
@ 2024-11-14 22:54 ` Andy Gospodarek
2024-11-15 0:27 ` Michael Chan
0 siblings, 1 reply; 28+ messages in thread
From: Andy Gospodarek @ 2024-11-14 22:54 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, Nov 13, 2024 at 05:32:19PM +0000, Taehee Yoo wrote:
> The bnxt_en driver has configured the hds_threshold value automatically
> when TPA is enabled based on the rx-copybreak default value.
> Now the header-data-split-thresh ethtool command is added, so it adds an
> implementation of header-data-split-thresh option.
>
> Configuration of the header-data-split-thresh is allowed only when
> the header-data-split is enabled. The default value of
> header-data-split-thresh is 256, which is the default value of
> rx-copybreak, which used to be the hds_thresh value.
>
> # Example:
> # ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
> # ethtool -g enp14s0f0np0
> Ring parameters for enp14s0f0np0:
> Pre-set maximums:
> ...
> Header data split thresh: 256
> Current hardware settings:
> ...
> TCP data split: on
> Header data split thresh: 256
>
> Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Tested-by: Andy Gospodarek <gospo@broadcom.com>
> ---
>
> v5:
> - No changes.
>
> v4:
> - Reduce hole in struct bnxt.
> - Add ETHTOOL_RING_USE_HDS_THRS to indicate bnxt_en driver support
> header-data-split-thresh option.
> - Add Test tag from Stanislav.
>
> v3:
> - Drop validation logic tcp-data-split and tcp-data-split-thresh.
>
> v2:
> - Patch added.
>
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 ++
> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 7 ++++++-
> 3 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 608bcca71676..27d6bac3a69a 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -4454,6 +4454,7 @@ void bnxt_set_tpa_flags(struct bnxt *bp)
> static void bnxt_init_ring_params(struct bnxt *bp)
> {
> bp->rx_copybreak = BNXT_DEFAULT_RX_COPYBREAK;
> + bp->hds_threshold = BNXT_DEFAULT_RX_COPYBREAK;
> }
>
> /* bp->rx_ring_size, bp->tx_ring_size, dev->mtu, BNXT_FLAG_{G|L}RO flags must
> @@ -6429,7 +6430,7 @@ static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, struct bnxt_vnic_info *vnic)
> VNIC_PLCMODES_CFG_REQ_FLAGS_HDS_IPV6);
> req->enables |=
> cpu_to_le32(VNIC_PLCMODES_CFG_REQ_ENABLES_HDS_THRESHOLD_VALID);
> - req->hds_threshold = cpu_to_le16(bp->rx_copybreak);
> + req->hds_threshold = cpu_to_le16(bp->hds_threshold);
> }
> req->vnic_id = cpu_to_le32(vnic->fw_vnic_id);
> return hwrm_req_send(bp, req);
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index 3a7d2f3ebb2a..058db26fb255 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -2362,6 +2362,8 @@ struct bnxt {
> u8 q_ids[BNXT_MAX_QUEUE];
> u8 max_q;
> u8 num_tc;
> +#define BNXT_HDS_THRESHOLD_MAX 256
> + u16 hds_threshold;
>
> unsigned int current_interval;
> #define BNXT_TIMER_INTERVAL HZ
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> index b0054eef389b..a51a4cedecb9 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> @@ -833,6 +833,9 @@ static void bnxt_get_ringparam(struct net_device *dev,
> ering->rx_pending = bp->rx_ring_size;
> ering->rx_jumbo_pending = bp->rx_agg_ring_size;
> ering->tx_pending = bp->tx_ring_size;
> +
> + kernel_ering->hds_thresh = bp->hds_threshold;
> + kernel_ering->hds_thresh_max = BNXT_HDS_THRESHOLD_MAX;
> }
>
> static int bnxt_set_ringparam(struct net_device *dev,
> @@ -868,6 +871,7 @@ static int bnxt_set_ringparam(struct net_device *dev,
> else if (tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_UNKNOWN)
> bp->flags &= ~BNXT_FLAG_HDS;
>
> + bp->hds_threshold = (u16)kernel_ering->hds_thresh;
> bp->rx_ring_size = ering->rx_pending;
> bp->tx_ring_size = ering->tx_pending;
> bnxt_set_ring_params(bp);
> @@ -5363,7 +5367,8 @@ const struct ethtool_ops bnxt_ethtool_ops = {
> ETHTOOL_COALESCE_STATS_BLOCK_USECS |
> ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
> ETHTOOL_COALESCE_USE_CQE,
> - .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT,
> + .supported_ring_params = ETHTOOL_RING_USE_TCP_DATA_SPLIT |
> + ETHTOOL_RING_USE_HDS_THRS,
> .get_link_ksettings = bnxt_get_link_ksettings,
> .set_link_ksettings = bnxt_set_link_ksettings,
> .get_fec_stats = bnxt_get_fec_stats,
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command
2024-11-14 22:54 ` Andy Gospodarek
@ 2024-11-15 0:27 ` Michael Chan
2024-11-15 16:18 ` Taehee Yoo
0 siblings, 1 reply; 28+ messages in thread
From: Michael Chan @ 2024-11-15 0:27 UTC (permalink / raw)
To: Andy Gospodarek
Cc: Taehee Yoo, davem, kuba, pabeni, edumazet, almasrymina,
donald.hunter, corbet, andrew+netdev, hawk, ilias.apalodimas, ast,
daniel, john.fastabend, dw, sdf, asml.silence, brett.creeley,
linux-doc, netdev, kory.maincent, maxime.chevallier, danieller,
hengqi, ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
[-- Attachment #1: Type: text/plain, Size: 1609 bytes --]
On Thu, Nov 14, 2024 at 2:54 PM Andy Gospodarek
<andrew.gospodarek@broadcom.com> wrote:
>
> On Wed, Nov 13, 2024 at 05:32:19PM +0000, Taehee Yoo wrote:
> > The bnxt_en driver has configured the hds_threshold value automatically
> > when TPA is enabled based on the rx-copybreak default value.
> > Now the header-data-split-thresh ethtool command is added, so it adds an
> > implementation of header-data-split-thresh option.
> >
> > Configuration of the header-data-split-thresh is allowed only when
> > the header-data-split is enabled. The default value of
> > header-data-split-thresh is 256, which is the default value of
> > rx-copybreak, which used to be the hds_thresh value.
> >
> > # Example:
> > # ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
> > # ethtool -g enp14s0f0np0
> > Ring parameters for enp14s0f0np0:
> > Pre-set maximums:
> > ...
> > Header data split thresh: 256
> > Current hardware settings:
> > ...
> > TCP data split: on
> > Header data split thresh: 256
> >
> > Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> > Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>
> Tested-by: Andy Gospodarek <gospo@broadcom.com>
>
> > @@ -2362,6 +2362,8 @@ struct bnxt {
> > u8 q_ids[BNXT_MAX_QUEUE];
> > u8 max_q;
> > u8 num_tc;
> > +#define BNXT_HDS_THRESHOLD_MAX 256
As mentioned in my comments for patch #1, our NIC can support HDS
threshold of up to 1023, so we can set this to 1023. Thanks.
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4209 bytes --]
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command
2024-11-15 0:27 ` Michael Chan
@ 2024-11-15 16:18 ` Taehee Yoo
0 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-15 16:18 UTC (permalink / raw)
To: Michael Chan
Cc: Andy Gospodarek, davem, kuba, pabeni, edumazet, almasrymina,
donald.hunter, corbet, andrew+netdev, hawk, ilias.apalodimas, ast,
daniel, john.fastabend, dw, sdf, asml.silence, brett.creeley,
linux-doc, netdev, kory.maincent, maxime.chevallier, danieller,
hengqi, ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Fri, Nov 15, 2024 at 9:27 AM Michael Chan <michael.chan@broadcom.com> wrote:
>
Hi Michael,
Thank you so much for the review!
> On Thu, Nov 14, 2024 at 2:54 PM Andy Gospodarek
> <andrew.gospodarek@broadcom.com> wrote:
> >
> > On Wed, Nov 13, 2024 at 05:32:19PM +0000, Taehee Yoo wrote:
> > > The bnxt_en driver has configured the hds_threshold value automatically
> > > when TPA is enabled based on the rx-copybreak default value.
> > > Now the header-data-split-thresh ethtool command is added, so it adds an
> > > implementation of header-data-split-thresh option.
> > >
> > > Configuration of the header-data-split-thresh is allowed only when
> > > the header-data-split is enabled. The default value of
> > > header-data-split-thresh is 256, which is the default value of
> > > rx-copybreak, which used to be the hds_thresh value.
> > >
> > > # Example:
> > > # ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
> > > # ethtool -g enp14s0f0np0
> > > Ring parameters for enp14s0f0np0:
> > > Pre-set maximums:
> > > ...
> > > Header data split thresh: 256
> > > Current hardware settings:
> > > ...
> > > TCP data split: on
> > > Header data split thresh: 256
> > >
> > > Tested-by: Stanislav Fomichev <sdf@fomichev.me>
> > > Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> >
> > Tested-by: Andy Gospodarek <gospo@broadcom.com>
> >
>
> > > @@ -2362,6 +2362,8 @@ struct bnxt {
> > > u8 q_ids[BNXT_MAX_QUEUE];
> > > u8 max_q;
> > > u8 num_tc;
> > > +#define BNXT_HDS_THRESHOLD_MAX 256
>
> As mentioned in my comments for patch #1, our NIC can support HDS
> threshold of up to 1023, so we can set this to 1023. Thanks.
Thanks for checking, I will change it to 1023.
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH net-next v5 6/7] net: devmem: add ring parameter filtering
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
` (4 preceding siblings ...)
2024-11-13 17:32 ` [PATCH net-next v5 5/7] bnxt_en: add support for header-data-split-thresh ethtool command Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-13 17:32 ` [PATCH net-next v5 7/7] net: ethtool: " Taehee Yoo
` (2 subsequent siblings)
8 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
If driver doesn't support ring parameter or tcp-data-split configuration
is not sufficient, the devmem should not be set up.
Before setup the devmem, tcp-data-split should be ON and
header-data-split-thresh value should be 0.
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- Add Review tag from Mina.
v4:
- Check condition before __netif_get_rx_queue().
- Separate condition check.
- Add Test tag from Stanislav.
v3:
- Patch added.
net/core/devmem.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 11b91c12ee11..3425e872e87a 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -8,6 +8,8 @@
*/
#include <linux/dma-buf.h>
+#include <linux/ethtool.h>
+#include <linux/ethtool_netlink.h>
#include <linux/genalloc.h>
#include <linux/mm.h>
#include <linux/netdevice.h>
@@ -131,6 +133,8 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
struct net_devmem_dmabuf_binding *binding,
struct netlink_ext_ack *extack)
{
+ struct kernel_ethtool_ringparam kernel_ringparam = {};
+ struct ethtool_ringparam ringparam = {};
struct netdev_rx_queue *rxq;
u32 xa_idx;
int err;
@@ -140,6 +144,20 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
return -ERANGE;
}
+ if (!dev->ethtool_ops->get_ringparam)
+ return -EOPNOTSUPP;
+
+ dev->ethtool_ops->get_ringparam(dev, &ringparam, &kernel_ringparam,
+ extack);
+ if (kernel_ringparam.tcp_data_split != ETHTOOL_TCP_DATA_SPLIT_ENABLED) {
+ NL_SET_ERR_MSG(extack, "tcp-data-split is disabled");
+ return -EINVAL;
+ }
+ if (kernel_ringparam.hds_thresh) {
+ NL_SET_ERR_MSG(extack, "header-data-split-thresh is not zero");
+ return -EINVAL;
+ }
+
rxq = __netif_get_rx_queue(dev, rxq_idx);
if (rxq->mp_params.mp_priv) {
NL_SET_ERR_MSG(extack, "designated queue already memory provider bound");
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* [PATCH net-next v5 7/7] net: ethtool: add ring parameter filtering
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
` (5 preceding siblings ...)
2024-11-13 17:32 ` [PATCH net-next v5 6/7] net: devmem: add ring parameter filtering Taehee Yoo
@ 2024-11-13 17:32 ` Taehee Yoo
2024-11-14 22:55 ` [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Andy Gospodarek
2024-12-17 16:30 ` Jakub Kicinski
8 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-11-13 17:32 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev
Cc: kory.maincent, maxime.chevallier, danieller, hengqi, ecree.xilinx,
przemyslaw.kitszel, hkallweit1, ahmed.zaki, rrameshbabu, idosch,
jiri, bigeasy, lorenzo, jdamato, aleksander.lobakin, kaiyuanz,
willemb, daniel.zahka, ap420073
While the devmem is running, the tcp-data-split and
header-data-split-thresh configuration should not be changed.
If user tries to change tcp-data-split and threshold value while the
devmem is running, it fails and shows extack message.
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
v5:
- Change extack message.
- Add Review tag from Mina.
v4:
- Check condition before __netif_get_rx_queue().
- Separate condition check.
- Add Test tag from Stanislav.
v3:
- Patch added.
net/ethtool/rings.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
index ca836aad3fa9..3f3be806c6c8 100644
--- a/net/ethtool/rings.c
+++ b/net/ethtool/rings.c
@@ -258,6 +258,19 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
return -ERANGE;
}
+ if (dev_get_min_mp_channel_count(dev)) {
+ if (kernel_ringparam.tcp_data_split !=
+ ETHTOOL_TCP_DATA_SPLIT_ENABLED) {
+ NL_SET_ERR_MSG(info->extack,
+ "can't disable tcp-data-split while device has memory provider enabled");
+ return -EINVAL;
+ } else if (kernel_ringparam.hds_thresh) {
+ NL_SET_ERR_MSG(info->extack,
+ "can't set non-zero hds_thresh while device is memory provider enabled");
+ return -EINVAL;
+ }
+ }
+
/* ensure new ring parameters are within limits */
if (ringparam.rx_pending > ringparam.rx_max_pending)
err_attr = tb[ETHTOOL_A_RINGS_RX];
--
2.34.1
^ permalink raw reply related [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
` (6 preceding siblings ...)
2024-11-13 17:32 ` [PATCH net-next v5 7/7] net: ethtool: " Taehee Yoo
@ 2024-11-14 22:55 ` Andy Gospodarek
2024-12-17 16:30 ` Jakub Kicinski
8 siblings, 0 replies; 28+ messages in thread
From: Andy Gospodarek @ 2024-11-14 22:55 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, kuba, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, Nov 13, 2024 at 05:32:14PM +0000, Taehee Yoo wrote:
> This series implements header-data-split-thresh ethtool command.
> This series also implements backend of tcp-data-split and
> header-data-split-thresh ethtool command for bnxt_en driver.
> These ethtool commands are mandatory options for device memory TCP.
>
> NICs that use the bnxt_en driver support tcp-data-split feature named
> HDS(header-data-split).
> But there is no implementation for the HDS to enable by ethtool.
> Only getting the current HDS status is implemented and the HDS is just
> automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled.
> The hds_threshold follows the rx-copybreak value but it wasn't
> changeable.
>
> Currently, bnxt_en driver enables tcp-data-split by default but not
> always work.
> There is hds_threshold value, which indicates that a packet size is
> larger than this value, a packet will be split into header and data.
> hds_threshold value has been 256, which is a default value of
> rx-copybreak value too.
> The rx-copybreak value hasn't been allowed to change so the
> hds_threshold too.
>
> This patchset decouples hds_threshold and rx-copybreak first.
> and make tcp-data-split, rx-copybreak, and
> header-data-split-thresh configurable independently.
>
> But the default configuration is the same.
> The default value of rx-copybreak is 256 and default
> header-data-split-thresh is also 256.
>
> There are several related options.
> TPA(HW-GRO, LRO), JUMBO, jumbo_thresh(firmware command), and Aggregation
> Ring.
>
> The aggregation ring is fundamental to these all features.
> When gro/lro/jumbo packets are received, NIC receives the first packet
> from the normal ring.
> follow packets come from the aggregation ring.
>
> These features are working regardless of HDS.
> If HDS is enabled, the first packet contains the header only, and the
> following packets contain only payload.
> So, HW-GRO/LRO is working regardless of HDS.
>
> There is another threshold value, which is jumbo_thresh.
> This is very similar to hds_thresh, but jumbo thresh doesn't split
> header and data.
> It just split the first and following data based on length.
> When NIC receives 1500 sized packet, and jumbo_thresh is 256(default, but
> follows rx-copybreak),
> the first data is 256 and the following packet size is 1500-256.
>
> Before this patch, at least if one of GRO, LRO, and JUMBO flags is
> enabled, the Aggregation ring will be enabled.
> If the Aggregation ring is enabled, both hds_threshold and
> jumbo_thresh are set to the default value of rx-copybreak.
>
> So, GRO, LRO, JUMBO frames, they larger than 256 bytes, they will
> be split into header and data if the protocol is TCP or UDP.
> for the other protocol, jumbo_thresh works instead of hds_thresh.
>
> This means that tcp-data-split relies on the GRO, LRO, and JUMBO flags.
> But by this patch, tcp-data-split no longer relies on these flags.
> If the tcp-data-split is enabled, the Aggregation ring will be
> enabled.
> Also, hds_threshold no longer follows rx-copybreak value, it will
> be set to the header-data-split-thresh value by user-space, but the
> default value is still 256.
>
> If the protocol is TCP or UDP and the HDS is disabled and Aggregation
> ring is enabled, a packet will be split into several pieces due to
> jumbo_thresh.
>
> When single buffer XDP is attached, tcp-data-split is automatically
> disabled.
>
> LRO, GRO, and JUMBO are tested with BCM57414, BCM57504 and the firmware
> version is 230.0.157.0.
> I couldn't find any specification about minimum and maximum value
> of hds_threshold, but from my test result, it was about 0 ~ 1023.
> It means, over 1023 sized packets will be split into header and data if
> tcp-data-split is enabled regardless of hds_treshold value.
> When hds_threshold is 1500 and received packet size is 1400, HDS should
> not be activated, but it is activated.
> The maximum value of header-data-split-thresh value is 256 because it
> has been working. It was decided very conservatively.
>
> I checked out the tcp-data-split(HDS) works independently of GRO, LRO,
> JUMBO.
> Also, I checked out tcp-data-split should be disabled automatically
> when XDP is attached and disallowed to enable it again while XDP is
> attached. I tested ranged values from min to max for
> header-data-split-thresh and rx-copybreak, and it works.
> header-data-split-thresh from 0 to 256, and rx-copybreak 0 to 256.
> When testing this patchset, I checked skb->data, skb->data_len, and
> nr_frags values.
>
> By this patchset, bnxt_en driver supports a force enable tcp-data-split,
> but it doesn't support for disable tcp-data-split.
> When tcp-data-split is explicitly enabled, HDS works always.
> When tcp-data-split is unknown, it depends on the current
> configuration of LRO/GRO/JUMBO.
>
> 1/7 patch implements .{set, get}_tunable() in the bnxt_en.
> The bnxt_en driver has been supporting the rx-copybreak feature but is
> not configurable, Only the default rx-copybreak value has been working.
> So, it changes the bnxt_en driver to be able to configure
> the rx-copybreak value.
>
> 2/7 patch adds a new tcp_data_split_mod member in the
> kernel_ethtool_ringparam
> It indicates that user is explicitly set the tcp-data-split.
> So the driver can distinguish a passed tcp-data-split value is
> came from user or driver itself.
>
> 3/7 patch adds an implementation of tcp-data-split ethtool
> command.
> The HDS relies on the Aggregation ring, which is automatically enabled
> when either LRO, GRO, or large mtu is configured.
> So, if the Aggregation ring is enabled, HDS is automatically enabled by
> it.
>
> 4/7 patch adds header-data-split-thresh command in the ethtool.
> This threshold value indicates if a received packet size is larger
> than this threshold, the packet's header and payload will be split.
> Example:
> # ethtool -G <interface name> header-data-split-thresh <value>
> This option can not be used when tcp-data-split is disabled or not
> supported.
> # ethtool -G enp14s0f0np0 tcp-data-split on header-data-split-thresh 256
> # ethtool -g enp14s0f0np0
> Ring parameters for enp14s0f0np0:
> Pre-set maximums:
> ...
> Current hardware settings:
> ...
> TCP data split: on
> Header data split thresh: 256
>
> 5/7 patch adds the implementation of header-data-split-thresh logic
> in the bnxt_en driver.
> The default value is 256, which used to be the default rx-copybreak
> value.
>
> 6/7, 7/7 add condition checks for devmem and ethtool.
> If tcp-data-split is disabled or threshold value is not zero, setup of
> devmem will be failed.
> Also, tcp-data-split and header-data-split-thresh will not be changed
> while devmem is running.
>
> This series is tested with BCM57504.
>
> All necessary configuration validations exist at the core API level.
>
> v5:
> - Drop implementation of device memory TCP for bnxt_en.
> - Remove netdev_devmem_enabled() and use dev_get_min_mp_channel_count()
> instead.
> - change extack messages
> - Add Review tags from Mina.
>
> v4:
> - Remove min rx-copybreak value.
> - Do not support a disable of tcp-data-split by bnxt_en driver.
> - Rename from tcp-data-split-thresh to header-data-split-thresh.
> - Add ETHTOOL_RING_USE_HDS_THRS flag.
> - Add dev_xdp_sb_prog_count() helper.
> - Reduce hole in struct bnxt.
> - Use ETHTOOL_RING_USE_HDS_THRS in bnxt_en driver.
> - Improve condition check.
> - Add netdev_devmem_enabled() helper.
> - Add netmem_is_pfmemalloc() helper.
> - Do not select NET_DEVMEM in Kconfig for bnxt_en driver.
> - Pass PP_FLAG_ALLOW_UNREADABLE_NETMEM flag unconditionally.
> - Use gfp flag in __bnxt_alloc_rx_netmem() in the last patch.
> - Do not add *offset in the __bnxt_alloc_rx_netmem() in the last patch.
> - Do not pass queue_idx to bnxt_alloc_rx_page_pool() in the last patch.
> - Add Test tag from Stanislav.
> - Add Review tag from Brett.
> - Add page_pool_recycle_direct_netmem() helper
>
> v3:
> - Change headline
> - Add condition checks for ethtool and devmem
> - Fix documentation
> - Move validation of tcp-data-split and thresh from dirver to core API
> - Add implementation of device memory TCP for bnxt_en driver
>
> v2:
> - Add tcp-data-split-thresh ethtool command
> - Implement tcp-data-split-threh in the bnxt_en driver
> - Define min/max rx-copybreak value
> - Update commit message
>
> Taehee Yoo (7):
> bnxt_en: add support for rx-copybreak ethtool command
> net: ethtool: add tcp_data_split_mod member in
> kernel_ethtool_ringparam
> bnxt_en: add support for tcp-data-split ethtool command
> net: ethtool: add support for configuring header-data-split-thresh
> bnxt_en: add support for header-data-split-thresh ethtool command
> net: devmem: add ring parameter filtering
> net: ethtool: add ring parameter filtering
>
> Documentation/netlink/specs/ethtool.yaml | 8 ++
> Documentation/networking/ethtool-netlink.rst | 79 ++++++++++++-------
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 +++++---
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 12 ++-
> .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 73 ++++++++++++++++-
> include/linux/ethtool.h | 8 ++
> include/linux/netdevice.h | 1 +
> include/uapi/linux/ethtool_netlink.h | 2 +
> net/core/dev.c | 13 +++
> net/core/devmem.c | 18 +++++
> net/ethtool/netlink.h | 2 +-
> net/ethtool/rings.c | 53 ++++++++++++-
> 12 files changed, 250 insertions(+), 50 deletions(-)
>
Series looks good to me and testing also looks good. Thanks for doing this!
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option
2024-11-13 17:32 [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Taehee Yoo
` (7 preceding siblings ...)
2024-11-14 22:55 ` [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option Andy Gospodarek
@ 2024-12-17 16:30 ` Jakub Kicinski
2024-12-18 13:34 ` Taehee Yoo
8 siblings, 1 reply; 28+ messages in thread
From: Jakub Kicinski @ 2024-12-17 16:30 UTC (permalink / raw)
To: Taehee Yoo
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, 13 Nov 2024 17:32:14 +0000 Taehee Yoo wrote:
> This series implements header-data-split-thresh ethtool command.
> This series also implements backend of tcp-data-split and
> header-data-split-thresh ethtool command for bnxt_en driver.
> These ethtool commands are mandatory options for device memory TCP.
Hi Taehee! Any progress on this series?
Being able to increase HDS threshold is highly beneficial for workloads
sending small RPCs, it'd be great if the changes were part of v6.14.
^ permalink raw reply [flat|nested] 28+ messages in thread* Re: [PATCH net-next v5 0/7] bnxt_en: implement tcp-data-split and thresh option
2024-12-17 16:30 ` Jakub Kicinski
@ 2024-12-18 13:34 ` Taehee Yoo
0 siblings, 0 replies; 28+ messages in thread
From: Taehee Yoo @ 2024-12-18 13:34 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, pabeni, edumazet, almasrymina, donald.hunter, corbet,
michael.chan, andrew+netdev, hawk, ilias.apalodimas, ast, daniel,
john.fastabend, dw, sdf, asml.silence, brett.creeley, linux-doc,
netdev, kory.maincent, maxime.chevallier, danieller, hengqi,
ecree.xilinx, przemyslaw.kitszel, hkallweit1, ahmed.zaki,
rrameshbabu, idosch, jiri, bigeasy, lorenzo, jdamato,
aleksander.lobakin, kaiyuanz, willemb, daniel.zahka
On Wed, Dec 18, 2024 at 1:30 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 13 Nov 2024 17:32:14 +0000 Taehee Yoo wrote:
> > This series implements header-data-split-thresh ethtool command.
> > This series also implements backend of tcp-data-split and
> > header-data-split-thresh ethtool command for bnxt_en driver.
> > These ethtool commands are mandatory options for device memory TCP.
>
> Hi Taehee! Any progress on this series?
> Being able to increase HDS threshold is highly beneficial for workloads
> sending small RPCs, it'd be great if the changes were part of v6.14.
Hi Jakub,
Sorry for the late response.
I'm going to send v6 patch today :)
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 28+ messages in thread