* [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets
@ 2025-01-02 22:24 John Daley
2025-01-02 22:24 ` [PATCH net-next v4 1/6] enic: Refactor RX path common code into helper functions John Daley
` (5 more replies)
0 siblings, 6 replies; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley
When the MTU is less than PAGE_SIZE, use the Page Pool API for RX.
The Page Pool API improves bandwidth and reduces CPU overhead by
recycling pages instead of allocating new buffers in the driver.
Page pool fragment allocation is used for smaller MTUs so that
multiple packets can share a page.
Since older hardware does not support receiving packets into
multiple discontiguous pages, the original RX path, which allocates
buffers with netdev_alloc_skb_ip_align(), is kept for MTU > PAGE_SIZE.
Function pointers are used to select the RX functions based on the
MTU. Some refactoring was done so that common code can be shared.
The refactored functions and the new page pool functions are in a
new file called enic_rq.c.
Also fixed a bug in the RX adaptive coalescing and did a minor cleanup.
Signed-off-by: John Daley <johndale@cisco.com>
---
Changes in v2:
- Fixed a valid warning found by build_clang where a variable was used
before it was initialized. The warnings in include/linux/mm.h were not
addressed since they do not appear to be related to this patchset.
Changes in v3:
- Moved a function to before where it was called and removed the forward
declaration. Reworded a commit message. No functional changes.
Changes in v4:
- Replaced call to page_pool_put_page() with page_pool_put_full_page()
since page_pool_dev_alloc() API is used and page_pool is created with
PP_FLAG_DMA_SYNC_DEV flag.
- Reworded the final commit message one more time to make it clear
that there is just one fix in the commit.
John Daley (6):
enic: Refactor RX path common code into helper functions
enic: Remove an unnecessary parameter from function enic_queue_rq_desc
enic: Use function pointers for buf alloc, free and RQ service
enic: Use the Page Pool API for RX when MTU is less than page size
enic: Move RX coalescing set function
enic: Obtain the Link speed only after the link comes up
drivers/net/ethernet/cisco/enic/Makefile | 2 +-
drivers/net/ethernet/cisco/enic/enic.h | 15 +
.../net/ethernet/cisco/enic/enic_ethtool.c | 1 +
drivers/net/ethernet/cisco/enic/enic_main.c | 229 ++++++---------
drivers/net/ethernet/cisco/enic/enic_res.h | 10 +-
drivers/net/ethernet/cisco/enic/enic_rq.c | 260 ++++++++++++++++++
drivers/net/ethernet/cisco/enic/enic_rq.h | 27 ++
drivers/net/ethernet/cisco/enic/vnic_rq.h | 2 +
8 files changed, 400 insertions(+), 146 deletions(-)
create mode 100644 drivers/net/ethernet/cisco/enic/enic_rq.c
create mode 100644 drivers/net/ethernet/cisco/enic/enic_rq.h
--
2.35.2
* [PATCH net-next v4 1/6] enic: Refactor RX path common code into helper functions
2025-01-02 22:24 [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets John Daley
@ 2025-01-02 22:24 ` John Daley
2025-01-02 22:24 ` [PATCH net-next v4 2/6] enic: Remove an unnecessary parameter from function enic_queue_rq_desc John Daley
` (4 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley, Nelson Escobar
In order to more easily support future packet receive processing schemes
and to simplify the receive path, put common code into helper functions
in a separate file.
Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
---
drivers/net/ethernet/cisco/enic/Makefile | 2 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 100 ++--------------
drivers/net/ethernet/cisco/enic/enic_rq.c | 120 ++++++++++++++++++++
drivers/net/ethernet/cisco/enic/enic_rq.h | 22 ++++
4 files changed, 150 insertions(+), 94 deletions(-)
create mode 100644 drivers/net/ethernet/cisco/enic/enic_rq.c
create mode 100644 drivers/net/ethernet/cisco/enic/enic_rq.h
diff --git a/drivers/net/ethernet/cisco/enic/Makefile b/drivers/net/ethernet/cisco/enic/Makefile
index c3b6febfdbe4..b3b5196b2dfc 100644
--- a/drivers/net/ethernet/cisco/enic/Makefile
+++ b/drivers/net/ethernet/cisco/enic/Makefile
@@ -3,5 +3,5 @@ obj-$(CONFIG_ENIC) := enic.o
enic-y := enic_main.o vnic_cq.o vnic_intr.o vnic_wq.o \
enic_res.o enic_dev.o enic_pp.o vnic_dev.o vnic_rq.o vnic_vic.o \
- enic_ethtool.o enic_api.o enic_clsf.o
+ enic_ethtool.o enic_api.o enic_clsf.o enic_rq.o
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 9913952ccb42..33890e26b8e5 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -55,6 +55,7 @@
#include "vnic_vic.h"
#include "enic_res.h"
#include "enic.h"
+#include "enic_rq.h"
#include "enic_dev.h"
#include "enic_pp.h"
#include "enic_clsf.h"
@@ -83,7 +84,6 @@ MODULE_AUTHOR("Scott Feldman <scofeldm@cisco.com>");
MODULE_LICENSE("GPL");
MODULE_DEVICE_TABLE(pci, enic_id_table);
-#define ENIC_LARGE_PKT_THRESHOLD 1000
#define ENIC_MAX_COALESCE_TIMERS 10
/* Interrupt moderation table, which will be used to decide the
* coalescing timer values
@@ -1330,15 +1330,6 @@ static int enic_rq_alloc_buf(struct vnic_rq *rq)
return 0;
}
-static void enic_intr_update_pkt_size(struct vnic_rx_bytes_counter *pkt_size,
- u32 pkt_len)
-{
- if (ENIC_LARGE_PKT_THRESHOLD <= pkt_len)
- pkt_size->large_pkt_bytes_cnt += pkt_len;
- else
- pkt_size->small_pkt_bytes_cnt += pkt_len;
-}
-
static bool enic_rxcopybreak(struct net_device *netdev, struct sk_buff **skb,
struct vnic_rq_buf *buf, u16 len)
{
@@ -1358,9 +1349,8 @@ static bool enic_rxcopybreak(struct net_device *netdev, struct sk_buff **skb,
return true;
}
-static void enic_rq_indicate_buf(struct vnic_rq *rq,
- struct cq_desc *cq_desc, struct vnic_rq_buf *buf,
- int skipped, void *opaque)
+void enic_rq_indicate_buf(struct vnic_rq *rq, struct cq_desc *cq_desc,
+ struct vnic_rq_buf *buf, int skipped, void *opaque)
{
struct enic *enic = vnic_dev_priv(rq->vdev);
struct net_device *netdev = enic->netdev;
@@ -1375,7 +1365,6 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
u8 packet_error;
u16 q_number, completed_index, bytes_written, vlan_tci, checksum;
u32 rss_hash;
- bool outer_csum_ok = true, encap = false;
rqstats->packets++;
if (skipped) {
@@ -1395,15 +1384,7 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
&ipv4_csum_ok, &ipv6, &ipv4, &ipv4_fragment,
&fcs_ok);
- if (packet_error) {
-
- if (!fcs_ok) {
- if (bytes_written > 0)
- rqstats->bad_fcs++;
- else if (bytes_written == 0)
- rqstats->pkt_truncated++;
- }
-
+ if (enic_rq_pkt_error(rq, packet_error, fcs_ok, bytes_written)) {
dma_unmap_single(&enic->pdev->dev, buf->dma_addr, buf->len,
DMA_FROM_DEVICE);
dev_kfree_skb_any(skb);
@@ -1427,66 +1408,11 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
skb_put(skb, bytes_written);
skb->protocol = eth_type_trans(skb, netdev);
skb_record_rx_queue(skb, q_number);
- if ((netdev->features & NETIF_F_RXHASH) && rss_hash &&
- (type == 3)) {
- switch (rss_type) {
- case CQ_ENET_RQ_DESC_RSS_TYPE_TCP_IPv4:
- case CQ_ENET_RQ_DESC_RSS_TYPE_TCP_IPv6:
- case CQ_ENET_RQ_DESC_RSS_TYPE_TCP_IPv6_EX:
- skb_set_hash(skb, rss_hash, PKT_HASH_TYPE_L4);
- rqstats->l4_rss_hash++;
- break;
- case CQ_ENET_RQ_DESC_RSS_TYPE_IPv4:
- case CQ_ENET_RQ_DESC_RSS_TYPE_IPv6:
- case CQ_ENET_RQ_DESC_RSS_TYPE_IPv6_EX:
- skb_set_hash(skb, rss_hash, PKT_HASH_TYPE_L3);
- rqstats->l3_rss_hash++;
- break;
- }
- }
- if (enic->vxlan.vxlan_udp_port_number) {
- switch (enic->vxlan.patch_level) {
- case 0:
- if (fcoe) {
- encap = true;
- outer_csum_ok = fcoe_fc_crc_ok;
- }
- break;
- case 2:
- if ((type == 7) &&
- (rss_hash & BIT(0))) {
- encap = true;
- outer_csum_ok = (rss_hash & BIT(1)) &&
- (rss_hash & BIT(2));
- }
- break;
- }
- }
- /* Hardware does not provide whole packet checksum. It only
- * provides pseudo checksum. Since hw validates the packet
- * checksum but not provide us the checksum value. use
- * CHECSUM_UNNECESSARY.
- *
- * In case of encap pkt tcp_udp_csum_ok/tcp_udp_csum_ok is
- * inner csum_ok. outer_csum_ok is set by hw when outer udp
- * csum is correct or is zero.
- */
- if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc &&
- tcp_udp_csum_ok && outer_csum_ok &&
- (ipv4_csum_ok || ipv6)) {
- skb->ip_summed = CHECKSUM_UNNECESSARY;
- skb->csum_level = encap;
- if (encap)
- rqstats->csum_unnecessary_encap++;
- else
- rqstats->csum_unnecessary++;
- }
+ enic_rq_set_skb_flags(rq, type, rss_hash, rss_type, fcoe, fcoe_fc_crc_ok,
+ vlan_stripped, csum_not_calc, tcp_udp_csum_ok, ipv6,
+ ipv4_csum_ok, vlan_tci, skb);
- if (vlan_stripped) {
- __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tci);
- rqstats->vlan_stripped++;
- }
skb_mark_napi_id(skb, &enic->napi[rq->index]);
if (!(netdev->features & NETIF_F_GRO))
netif_receive_skb(skb);
@@ -1507,18 +1433,6 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
}
}
-static int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
- u8 type, u16 q_number, u16 completed_index, void *opaque)
-{
- struct enic *enic = vnic_dev_priv(vdev);
-
- vnic_rq_service(&enic->rq[q_number].vrq, cq_desc,
- completed_index, VNIC_RQ_RETURN_DESC,
- enic_rq_indicate_buf, opaque);
-
- return 0;
-}
-
static void enic_set_int_moderation(struct enic *enic, struct vnic_rq *rq)
{
unsigned int intr = enic_msix_rq_intr(enic, rq->index);
diff --git a/drivers/net/ethernet/cisco/enic/enic_rq.c b/drivers/net/ethernet/cisco/enic/enic_rq.c
new file mode 100644
index 000000000000..571af8f31470
--- /dev/null
+++ b/drivers/net/ethernet/cisco/enic/enic_rq.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright 2024 Cisco Systems, Inc. All rights reserved.
+
+#include <linux/skbuff.h>
+#include <linux/if_vlan.h>
+#include "enic.h"
+#include "enic_rq.h"
+#include "vnic_rq.h"
+#include "cq_enet_desc.h"
+
+#define ENIC_LARGE_PKT_THRESHOLD 1000
+
+void enic_intr_update_pkt_size(struct vnic_rx_bytes_counter *pkt_size,
+ u32 pkt_len)
+{
+ if (pkt_len >= ENIC_LARGE_PKT_THRESHOLD)
+ pkt_size->large_pkt_bytes_cnt += pkt_len;
+ else
+ pkt_size->small_pkt_bytes_cnt += pkt_len;
+}
+
+void enic_rq_set_skb_flags(struct vnic_rq *vrq, u8 type, u32 rss_hash, u8 rss_type, u8 fcoe,
+ u8 fcoe_fc_crc_ok, u8 vlan_stripped, u8 csum_not_calc,
+ u8 tcp_udp_csum_ok, u8 ipv6, u8 ipv4_csum_ok, u16 vlan_tci,
+ struct sk_buff *skb)
+{
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct net_device *netdev = enic->netdev;
+ struct enic_rq_stats *rqstats = &enic->rq[vrq->index].stats;
+ bool outer_csum_ok = true, encap = false;
+
+ if ((netdev->features & NETIF_F_RXHASH) && rss_hash && type == 3) {
+ switch (rss_type) {
+ case CQ_ENET_RQ_DESC_RSS_TYPE_TCP_IPv4:
+ case CQ_ENET_RQ_DESC_RSS_TYPE_TCP_IPv6:
+ case CQ_ENET_RQ_DESC_RSS_TYPE_TCP_IPv6_EX:
+ skb_set_hash(skb, rss_hash, PKT_HASH_TYPE_L4);
+ rqstats->l4_rss_hash++;
+ break;
+ case CQ_ENET_RQ_DESC_RSS_TYPE_IPv4:
+ case CQ_ENET_RQ_DESC_RSS_TYPE_IPv6:
+ case CQ_ENET_RQ_DESC_RSS_TYPE_IPv6_EX:
+ skb_set_hash(skb, rss_hash, PKT_HASH_TYPE_L3);
+ rqstats->l3_rss_hash++;
+ break;
+ }
+ }
+ if (enic->vxlan.vxlan_udp_port_number) {
+ switch (enic->vxlan.patch_level) {
+ case 0:
+ if (fcoe) {
+ encap = true;
+ outer_csum_ok = fcoe_fc_crc_ok;
+ }
+ break;
+ case 2:
+ if (type == 7 && (rss_hash & BIT(0))) {
+ encap = true;
+ outer_csum_ok = (rss_hash & BIT(1)) &&
+ (rss_hash & BIT(2));
+ }
+ break;
+ }
+ }
+
+ /* Hardware does not provide whole packet checksum. It only
+ * provides pseudo checksum. Since hw validates the packet
+ * checksum but not provide us the checksum value. use
+ * CHECSUM_UNNECESSARY.
+ *
+ * In case of encap pkt tcp_udp_csum_ok/tcp_udp_csum_ok is
+ * inner csum_ok. outer_csum_ok is set by hw when outer udp
+ * csum is correct or is zero.
+ */
+ if ((netdev->features & NETIF_F_RXCSUM) && !csum_not_calc &&
+ tcp_udp_csum_ok && outer_csum_ok && (ipv4_csum_ok || ipv6)) {
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ skb->csum_level = encap;
+ if (encap)
+ rqstats->csum_unnecessary_encap++;
+ else
+ rqstats->csum_unnecessary++;
+ }
+
+ if (vlan_stripped) {
+ __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tci);
+ rqstats->vlan_stripped++;
+ }
+}
+
+int enic_rq_pkt_error(struct vnic_rq *vrq, u8 packet_error, u8 fcs_ok, u16 bytes_written)
+{
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct enic_rq_stats *rqstats = &enic->rq[vrq->index].stats;
+ int ret = 0;
+
+ if (packet_error) {
+ if (!fcs_ok) {
+ if (bytes_written > 0) {
+ rqstats->bad_fcs++;
+ ret = 1;
+ } else if (bytes_written == 0) {
+ rqstats->pkt_truncated++;
+ ret = 2;
+ }
+ }
+ }
+ return ret;
+}
+
+int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
+ u8 type, u16 q_number, u16 completed_index, void *opaque)
+{
+ struct enic *enic = vnic_dev_priv(vdev);
+
+ vnic_rq_service(&enic->rq[q_number].vrq, cq_desc, completed_index,
+ VNIC_RQ_RETURN_DESC, enic_rq_indicate_buf, opaque);
+
+ return 0;
+}
diff --git a/drivers/net/ethernet/cisco/enic/enic_rq.h b/drivers/net/ethernet/cisco/enic/enic_rq.h
new file mode 100644
index 000000000000..46ab75fd74a0
--- /dev/null
+++ b/drivers/net/ethernet/cisco/enic/enic_rq.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright 2008-2010 Cisco Systems, Inc. All rights reserved.
+ * Copyright 2007 Nuova Systems, Inc. All rights reserved.
+ */
+
+#ifndef _ENIC_RQ_H_
+#define _ENIC_RQ_H_
+
+void enic_intr_update_pkt_size(struct vnic_rx_bytes_counter *pkt_size,
+ u32 pkt_len);
+void enic_rq_set_skb_flags(struct vnic_rq *rq, u8 type, u32 rss_hash, u8 rss_type,
+ u8 fcoe, u8 fcoe_fc_crc_ok, u8 vlan_stripped,
+ u8 csum_not_calc, u8 tcp_udp_csum_ok, u8 ipv6,
+ u8 ipv4_csum_ok, u16 vlan_tci, struct sk_buff *skb);
+int enic_rq_pkt_error(struct vnic_rq *rq, u8 packet_error, u8 fcs_ok,
+ u16 bytes_written);
+int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
+ u8 type, u16 q_number, u16 completed_index, void *opaque);
+void enic_rq_indicate_buf(struct vnic_rq *rq, struct cq_desc *cq_desc,
+ struct vnic_rq_buf *buf, int skipped, void *opaque);
+#endif /* _ENIC_RQ_H_ */
--
2.35.2
* [PATCH net-next v4 2/6] enic: Remove an unnecessary parameter from function enic_queue_rq_desc
2025-01-02 22:24 [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets John Daley
2025-01-02 22:24 ` [PATCH net-next v4 1/6] enic: Refactor RX path common code into helper functions John Daley
@ 2025-01-02 22:24 ` John Daley
2025-01-02 22:24 ` [PATCH net-next v4 3/6] enic: Use function pointers for buf alloc, free and RQ service John Daley
` (3 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley, Nelson Escobar
The function enic_queue_rq_desc() has a parameter, os_buf_index, which
is only ever passed a hard-coded 0. Remove it.
Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 8 ++------
drivers/net/ethernet/cisco/enic/enic_res.h | 10 +++-------
2 files changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 33890e26b8e5..f8d0011486d7 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1301,14 +1301,11 @@ static int enic_rq_alloc_buf(struct vnic_rq *rq)
struct net_device *netdev = enic->netdev;
struct sk_buff *skb;
unsigned int len = netdev->mtu + VLAN_ETH_HLEN;
- unsigned int os_buf_index = 0;
dma_addr_t dma_addr;
struct vnic_rq_buf *buf = rq->to_use;
if (buf->os_buf) {
- enic_queue_rq_desc(rq, buf->os_buf, os_buf_index, buf->dma_addr,
- buf->len);
-
+ enic_queue_rq_desc(rq, buf->os_buf, buf->dma_addr, buf->len);
return 0;
}
skb = netdev_alloc_skb_ip_align(netdev, len);
@@ -1324,8 +1321,7 @@ static int enic_rq_alloc_buf(struct vnic_rq *rq)
return -ENOMEM;
}
- enic_queue_rq_desc(rq, skb, os_buf_index,
- dma_addr, len);
+ enic_queue_rq_desc(rq, skb, dma_addr, len);
return 0;
}
diff --git a/drivers/net/ethernet/cisco/enic/enic_res.h b/drivers/net/ethernet/cisco/enic/enic_res.h
index b8ee42d297aa..dad5c45b684a 100644
--- a/drivers/net/ethernet/cisco/enic/enic_res.h
+++ b/drivers/net/ethernet/cisco/enic/enic_res.h
@@ -107,19 +107,15 @@ static inline void enic_queue_wq_desc_tso(struct vnic_wq *wq,
}
static inline void enic_queue_rq_desc(struct vnic_rq *rq,
- void *os_buf, unsigned int os_buf_index,
- dma_addr_t dma_addr, unsigned int len)
+ void *os_buf, dma_addr_t dma_addr, unsigned int len)
{
struct rq_enet_desc *desc = vnic_rq_next_desc(rq);
- u64 wrid = 0;
- u8 type = os_buf_index ?
- RQ_ENET_TYPE_NOT_SOP : RQ_ENET_TYPE_ONLY_SOP;
rq_enet_desc_enc(desc,
(u64)dma_addr | VNIC_PADDR_TARGET,
- type, (u16)len);
+ RQ_ENET_TYPE_ONLY_SOP, (u16)len);
- vnic_rq_post(rq, os_buf, os_buf_index, dma_addr, len, wrid);
+ vnic_rq_post(rq, os_buf, 0, dma_addr, len, 0);
}
struct enic;
--
2.35.2
* [PATCH net-next v4 3/6] enic: Use function pointers for buf alloc, free and RQ service
2025-01-02 22:24 [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets John Daley
2025-01-02 22:24 ` [PATCH net-next v4 1/6] enic: Refactor RX path common code into helper functions John Daley
2025-01-02 22:24 ` [PATCH net-next v4 2/6] enic: Remove an unnecessary parameter from function enic_queue_rq_desc John Daley
@ 2025-01-02 22:24 ` John Daley
2025-01-02 22:24 ` [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size John Daley
` (2 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley, Nelson Escobar
In order to support more than one packet receive processing scheme, use
function pointers for the buffer allocate, buffer free and RQ descriptor
processing functions.
Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
---
drivers/net/ethernet/cisco/enic/enic.h | 5 +++++
drivers/net/ethernet/cisco/enic/enic_main.c | 14 +++++++++-----
drivers/net/ethernet/cisco/enic/enic_rq.c | 2 +-
3 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic.h b/drivers/net/ethernet/cisco/enic/enic.h
index 10b7e02ba4d0..51f80378d928 100644
--- a/drivers/net/ethernet/cisco/enic/enic.h
+++ b/drivers/net/ethernet/cisco/enic/enic.h
@@ -226,6 +226,11 @@ struct enic {
u32 rx_copybreak;
u8 rss_key[ENIC_RSS_LEN];
struct vnic_gen_stats gen_stats;
+ void (*rq_buf_service)(struct vnic_rq *rq, struct cq_desc *cq_desc,
+ struct vnic_rq_buf *buf, int skipped,
+ void *opaque);
+ int (*rq_alloc_buf)(struct vnic_rq *rq);
+ void (*rq_free_buf)(struct vnic_rq *rq, struct vnic_rq_buf *buf);
};
static inline struct net_device *vnic_get_netdev(struct vnic_dev *vdev)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index f8d0011486d7..45ab6b670563 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1519,7 +1519,7 @@ static int enic_poll(struct napi_struct *napi, int budget)
0 /* don't unmask intr */,
0 /* don't reset intr timer */);
- err = vnic_rq_fill(&enic->rq[0].vrq, enic_rq_alloc_buf);
+ err = vnic_rq_fill(&enic->rq[0].vrq, enic->rq_alloc_buf);
/* Buffer allocation failed. Stay in polling
* mode so we can try to fill the ring again.
@@ -1647,7 +1647,7 @@ static int enic_poll_msix_rq(struct napi_struct *napi, int budget)
0 /* don't unmask intr */,
0 /* don't reset intr timer */);
- err = vnic_rq_fill(&enic->rq[rq].vrq, enic_rq_alloc_buf);
+ err = vnic_rq_fill(&enic->rq[rq].vrq, enic->rq_alloc_buf);
/* Buffer allocation failed. Stay in polling mode
* so we can try to fill the ring again.
@@ -1882,6 +1882,10 @@ static int enic_open(struct net_device *netdev)
unsigned int i;
int err, ret;
+ enic->rq_buf_service = enic_rq_indicate_buf;
+ enic->rq_alloc_buf = enic_rq_alloc_buf;
+ enic->rq_free_buf = enic_free_rq_buf;
+
err = enic_request_intr(enic);
if (err) {
netdev_err(netdev, "Unable to request irq.\n");
@@ -1900,7 +1904,7 @@ static int enic_open(struct net_device *netdev)
for (i = 0; i < enic->rq_count; i++) {
/* enable rq before updating rq desc */
vnic_rq_enable(&enic->rq[i].vrq);
- vnic_rq_fill(&enic->rq[i].vrq, enic_rq_alloc_buf);
+ vnic_rq_fill(&enic->rq[i].vrq, enic->rq_alloc_buf);
/* Need at least one buffer on ring to get going */
if (vnic_rq_desc_used(&enic->rq[i].vrq) == 0) {
netdev_err(netdev, "Unable to alloc receive buffers\n");
@@ -1939,7 +1943,7 @@ static int enic_open(struct net_device *netdev)
for (i = 0; i < enic->rq_count; i++) {
ret = vnic_rq_disable(&enic->rq[i].vrq);
if (!ret)
- vnic_rq_clean(&enic->rq[i].vrq, enic_free_rq_buf);
+ vnic_rq_clean(&enic->rq[i].vrq, enic->rq_free_buf);
}
enic_dev_notify_unset(enic);
err_out_free_intr:
@@ -1998,7 +2002,7 @@ static int enic_stop(struct net_device *netdev)
for (i = 0; i < enic->wq_count; i++)
vnic_wq_clean(&enic->wq[i].vwq, enic_free_wq_buf);
for (i = 0; i < enic->rq_count; i++)
- vnic_rq_clean(&enic->rq[i].vrq, enic_free_rq_buf);
+ vnic_rq_clean(&enic->rq[i].vrq, enic->rq_free_buf);
for (i = 0; i < enic->cq_count; i++)
vnic_cq_clean(&enic->cq[i]);
for (i = 0; i < enic->intr_count; i++)
diff --git a/drivers/net/ethernet/cisco/enic/enic_rq.c b/drivers/net/ethernet/cisco/enic/enic_rq.c
index 571af8f31470..ae2ab5af87e9 100644
--- a/drivers/net/ethernet/cisco/enic/enic_rq.c
+++ b/drivers/net/ethernet/cisco/enic/enic_rq.c
@@ -114,7 +114,7 @@ int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
struct enic *enic = vnic_dev_priv(vdev);
vnic_rq_service(&enic->rq[q_number].vrq, cq_desc, completed_index,
- VNIC_RQ_RETURN_DESC, enic_rq_indicate_buf, opaque);
+ VNIC_RQ_RETURN_DESC, enic->rq_buf_service, opaque);
return 0;
}
--
2.35.2
* [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-02 22:24 [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets John Daley
` (2 preceding siblings ...)
2025-01-02 22:24 ` [PATCH net-next v4 3/6] enic: Use function pointers for buf alloc, free and RQ service John Daley
@ 2025-01-02 22:24 ` John Daley
2025-01-05 1:41 ` Jakub Kicinski
2025-01-02 22:24 ` [PATCH net-next v4 5/6] enic: Move RX coalescing set function John Daley
2025-01-02 22:24 ` [PATCH net-next v4 6/6] enic: Obtain the Link speed only after the link comes up John Daley
5 siblings, 1 reply; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley, Nelson Escobar
The Page Pool API improves bandwidth and reduces CPU overhead by
recycling pages instead of allocating new buffers in the driver. Make
use of page pool fragment allocation for smaller MTUs so that multiple
packets can share a page.
Add a 'pp_alloc_error' per-RQ ethtool statistic to count
page_pool_dev_alloc() failures.
Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
---
drivers/net/ethernet/cisco/enic/enic.h | 10 ++
.../net/ethernet/cisco/enic/enic_ethtool.c | 1 +
drivers/net/ethernet/cisco/enic/enic_main.c | 51 ++++++-
drivers/net/ethernet/cisco/enic/enic_rq.c | 140 ++++++++++++++++++
drivers/net/ethernet/cisco/enic/enic_rq.h | 5 +
drivers/net/ethernet/cisco/enic/vnic_rq.h | 2 +
6 files changed, 203 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic.h b/drivers/net/ethernet/cisco/enic/enic.h
index 51f80378d928..19e22aba71a8 100644
--- a/drivers/net/ethernet/cisco/enic/enic.h
+++ b/drivers/net/ethernet/cisco/enic/enic.h
@@ -17,6 +17,8 @@
#include "vnic_nic.h"
#include "vnic_rss.h"
#include <linux/irq.h>
+#include <linux/if_vlan.h>
+#include <net/page_pool/helpers.h>
#define DRV_NAME "enic"
#define DRV_DESCRIPTION "Cisco VIC Ethernet NIC Driver"
@@ -158,6 +160,7 @@ struct enic_rq_stats {
u64 pkt_truncated; /* truncated pkts */
u64 no_skb; /* out of skbs */
u64 desc_skip; /* Rx pkt went into later buffer */
+ u64 pp_alloc_error; /* page alloc error */
};
struct enic_wq {
@@ -169,6 +172,7 @@ struct enic_wq {
struct enic_rq {
struct vnic_rq vrq;
struct enic_rq_stats stats;
+ struct page_pool *pool;
} ____cacheline_aligned;
/* Per-instance private data structure */
@@ -231,8 +235,14 @@ struct enic {
void *opaque);
int (*rq_alloc_buf)(struct vnic_rq *rq);
void (*rq_free_buf)(struct vnic_rq *rq, struct vnic_rq_buf *buf);
+ void (*rq_cleanup)(struct enic_rq *rq);
};
+static inline unsigned int get_max_pkt_len(struct enic *enic)
+{
+ return enic->netdev->mtu + VLAN_ETH_HLEN;
+}
+
static inline struct net_device *vnic_get_netdev(struct vnic_dev *vdev)
{
struct enic *enic = vdev->priv;
diff --git a/drivers/net/ethernet/cisco/enic/enic_ethtool.c b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
index d607b4f0542c..799f44b95bfc 100644
--- a/drivers/net/ethernet/cisco/enic/enic_ethtool.c
+++ b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
@@ -51,6 +51,7 @@ static const struct enic_stat enic_per_rq_stats[] = {
ENIC_PER_RQ_STAT(napi_repoll),
ENIC_PER_RQ_STAT(no_skb),
ENIC_PER_RQ_STAT(desc_skip),
+ ENIC_PER_RQ_STAT(pp_alloc_error),
};
#define NUM_ENIC_PER_RQ_STATS ARRAY_SIZE(enic_per_rq_stats)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 45ab6b670563..5bfd89749237 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1282,6 +1282,11 @@ static int enic_get_vf_port(struct net_device *netdev, int vf,
return -EMSGSIZE;
}
+/* nothing to do for buffers based allocation */
+static void enic_rq_buf_cleanup(struct enic_rq *rq)
+{
+}
+
static void enic_free_rq_buf(struct vnic_rq *rq, struct vnic_rq_buf *buf)
{
struct enic *enic = vnic_dev_priv(rq->vdev);
@@ -1881,10 +1886,33 @@ static int enic_open(struct net_device *netdev)
struct enic *enic = netdev_priv(netdev);
unsigned int i;
int err, ret;
-
- enic->rq_buf_service = enic_rq_indicate_buf;
- enic->rq_alloc_buf = enic_rq_alloc_buf;
- enic->rq_free_buf = enic_free_rq_buf;
+ bool use_page_pool;
+ struct page_pool_params pp_params = { 0 };
+
+ /* Use the Page Pool API for MTUs <= PAGE_SIZE */
+ use_page_pool = (get_max_pkt_len(enic) <= PAGE_SIZE);
+
+ if (use_page_pool) {
+ /* use the page pool API */
+ pp_params.order = 0;
+ pp_params.pool_size = enic->config.rq_desc_count;
+ pp_params.nid = dev_to_node(&enic->pdev->dev);
+ pp_params.dev = &enic->pdev->dev;
+ pp_params.dma_dir = DMA_FROM_DEVICE;
+ pp_params.max_len = PAGE_SIZE;
+ pp_params.netdev = netdev;
+ pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV;
+
+ enic->rq_buf_service = enic_rq_indicate_page;
+ enic->rq_alloc_buf = enic_rq_alloc_page;
+ enic->rq_free_buf = enic_rq_free_page;
+ enic->rq_cleanup = enic_rq_page_cleanup;
+ } else {
+ enic->rq_buf_service = enic_rq_indicate_buf;
+ enic->rq_alloc_buf = enic_rq_alloc_buf;
+ enic->rq_free_buf = enic_free_rq_buf;
+ enic->rq_cleanup = enic_rq_buf_cleanup;
+ }
err = enic_request_intr(enic);
if (err) {
@@ -1902,6 +1930,13 @@ static int enic_open(struct net_device *netdev)
}
for (i = 0; i < enic->rq_count; i++) {
+ /* create a page pool for each RQ */
+ if (use_page_pool) {
+ pp_params.napi = &enic->napi[i];
+ pp_params.queue_idx = i;
+ enic->rq[i].pool = page_pool_create(&pp_params);
+ }
+
/* enable rq before updating rq desc */
vnic_rq_enable(&enic->rq[i].vrq);
vnic_rq_fill(&enic->rq[i].vrq, enic->rq_alloc_buf);
@@ -1942,8 +1977,10 @@ static int enic_open(struct net_device *netdev)
err_out_free_rq:
for (i = 0; i < enic->rq_count; i++) {
ret = vnic_rq_disable(&enic->rq[i].vrq);
- if (!ret)
+ if (!ret) {
vnic_rq_clean(&enic->rq[i].vrq, enic->rq_free_buf);
+ enic->rq_cleanup(&enic->rq[i]);
+ }
}
enic_dev_notify_unset(enic);
err_out_free_intr:
@@ -2001,8 +2038,10 @@ static int enic_stop(struct net_device *netdev)
for (i = 0; i < enic->wq_count; i++)
vnic_wq_clean(&enic->wq[i].vwq, enic_free_wq_buf);
- for (i = 0; i < enic->rq_count; i++)
+ for (i = 0; i < enic->rq_count; i++) {
vnic_rq_clean(&enic->rq[i].vrq, enic->rq_free_buf);
+ enic->rq_cleanup(&enic->rq[i]);
+ }
for (i = 0; i < enic->cq_count; i++)
vnic_cq_clean(&enic->cq[i]);
for (i = 0; i < enic->intr_count; i++)
diff --git a/drivers/net/ethernet/cisco/enic/enic_rq.c b/drivers/net/ethernet/cisco/enic/enic_rq.c
index ae2ab5af87e9..d450d2f4f694 100644
--- a/drivers/net/ethernet/cisco/enic/enic_rq.c
+++ b/drivers/net/ethernet/cisco/enic/enic_rq.c
@@ -7,6 +7,7 @@
#include "enic_rq.h"
#include "vnic_rq.h"
#include "cq_enet_desc.h"
+#include "enic_res.h"
#define ENIC_LARGE_PKT_THRESHOLD 1000
@@ -118,3 +119,142 @@ int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
return 0;
}
+
+void enic_rq_page_cleanup(struct enic_rq *rq)
+{
+ struct vnic_rq *vrq = &rq->vrq;
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct napi_struct *napi = &enic->napi[vrq->index];
+
+ napi_free_frags(napi);
+ page_pool_destroy(rq->pool);
+}
+
+void enic_rq_free_page(struct vnic_rq *vrq, struct vnic_rq_buf *buf)
+{
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct enic_rq *rq = &enic->rq[vrq->index];
+
+ if (!buf->os_buf)
+ return;
+
+ page_pool_put_full_page(rq->pool, (struct page *)buf->os_buf, true);
+ buf->os_buf = NULL;
+}
+
+int enic_rq_alloc_page(struct vnic_rq *vrq)
+{
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct enic_rq *rq = &enic->rq[vrq->index];
+ struct enic_rq_stats *rqstats = &rq->stats;
+ struct vnic_rq_buf *buf = vrq->to_use;
+ dma_addr_t dma_addr;
+ struct page *page;
+ unsigned int offset = 0;
+ unsigned int len;
+ unsigned int truesize;
+
+ len = get_max_pkt_len(enic);
+ truesize = len;
+
+ if (buf->os_buf) {
+ dma_addr = buf->dma_addr;
+ } else {
+ page = page_pool_dev_alloc(rq->pool, &offset, &truesize);
+ if (unlikely(!page)) {
+ rqstats->pp_alloc_error++;
+ return -ENOMEM;
+ }
+ buf->os_buf = (void *)page;
+ buf->offset = offset;
+ buf->truesize = truesize;
+ dma_addr = page_pool_get_dma_addr(page) + offset;
+ }
+
+ enic_queue_rq_desc(vrq, buf->os_buf, dma_addr, len);
+
+ return 0;
+}
+
+/* Unmap and free pages fragments making up the error packet.
+ */
+static void enic_rq_error_reset(struct vnic_rq *vrq)
+{
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct napi_struct *napi = &enic->napi[vrq->index];
+
+ napi_free_frags(napi);
+}
+
+void enic_rq_indicate_page(struct vnic_rq *vrq, struct cq_desc *cq_desc,
+ struct vnic_rq_buf *buf, int skipped, void *opaque)
+{
+ struct enic *enic = vnic_dev_priv(vrq->vdev);
+ struct sk_buff *skb;
+ struct enic_rq *rq = &enic->rq[vrq->index];
+ struct enic_rq_stats *rqstats = &rq->stats;
+ struct vnic_cq *cq = &enic->cq[enic_cq_rq(enic, vrq->index)];
+ struct napi_struct *napi;
+ u8 type, color, eop, sop, ingress_port, vlan_stripped;
+ u8 fcoe, fcoe_sof, fcoe_fc_crc_ok, fcoe_enc_error, fcoe_eof;
+ u8 tcp_udp_csum_ok, udp, tcp, ipv4_csum_ok;
+ u8 ipv6, ipv4, ipv4_fragment, fcs_ok, rss_type, csum_not_calc;
+ u8 packet_error;
+ u16 q_number, completed_index, bytes_written, vlan_tci, checksum;
+ u32 rss_hash;
+
+ if (skipped) {
+ rqstats->desc_skip++;
+ return;
+ }
+
+ if (!buf || !buf->dma_addr) {
+ net_warn_ratelimited("%s: !buf || !buf->dma_addr!!\n",
+ enic->netdev->name);
+ return;
+ }
+
+ cq_enet_rq_desc_dec((struct cq_enet_rq_desc *)cq_desc,
+ &type, &color, &q_number, &completed_index,
+ &ingress_port, &fcoe, &eop, &sop, &rss_type,
+ &csum_not_calc, &rss_hash, &bytes_written,
+ &packet_error, &vlan_stripped, &vlan_tci, &checksum,
+ &fcoe_sof, &fcoe_fc_crc_ok, &fcoe_enc_error,
+ &fcoe_eof, &tcp_udp_csum_ok, &udp, &tcp,
+ &ipv4_csum_ok, &ipv6, &ipv4, &ipv4_fragment,
+ &fcs_ok);
+
+ if (enic_rq_pkt_error(vrq, packet_error, fcs_ok, bytes_written)) {
+ enic_rq_error_reset(vrq);
+ return;
+ }
+
+ napi = &enic->napi[vrq->index];
+ skb = napi_get_frags(napi);
+ if (unlikely(!skb)) {
+ net_warn_ratelimited("%s: skb alloc error rq[%d], desc[%d]\n",
+ enic->netdev->name, vrq->index,
+ completed_index);
+ rqstats->no_skb++;
+ return;
+ }
+
+ dma_sync_single_for_cpu(&enic->pdev->dev, buf->dma_addr, bytes_written,
+ DMA_FROM_DEVICE);
+ skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, (struct page *)buf->os_buf,
+ buf->offset, bytes_written, buf->truesize);
+
+ buf->os_buf = NULL;
+ buf->dma_addr = 0;
+ buf = buf->next;
+
+ enic_rq_set_skb_flags(vrq, type, rss_hash, rss_type, fcoe, fcoe_fc_crc_ok,
+ vlan_stripped, csum_not_calc, tcp_udp_csum_ok, ipv6,
+ ipv4_csum_ok, vlan_tci, skb);
+ if (enic->rx_coalesce_setting.use_adaptive_rx_coalesce)
+ enic_intr_update_pkt_size(&cq->pkt_size_counter, skb->len);
+ skb_mark_for_recycle(skb);
+ skb_record_rx_queue(skb, vrq->index);
+ napi_gro_frags(napi);
+ rqstats->packets++;
+}
diff --git a/drivers/net/ethernet/cisco/enic/enic_rq.h b/drivers/net/ethernet/cisco/enic/enic_rq.h
index 46ab75fd74a0..f429f31b6172 100644
--- a/drivers/net/ethernet/cisco/enic/enic_rq.h
+++ b/drivers/net/ethernet/cisco/enic/enic_rq.h
@@ -19,4 +19,9 @@ int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
u8 type, u16 q_number, u16 completed_index, void *opaque);
void enic_rq_indicate_buf(struct vnic_rq *rq, struct cq_desc *cq_desc,
struct vnic_rq_buf *buf, int skipped, void *opaque);
+void enic_rq_indicate_page(struct vnic_rq *rq, struct cq_desc *cq_desc,
+ struct vnic_rq_buf *buf, int skipped, void *opaque);
+int enic_rq_alloc_page(struct vnic_rq *rq);
+void enic_rq_free_page(struct vnic_rq *rq, struct vnic_rq_buf *buf);
+void enic_rq_page_cleanup(struct enic_rq *rq);
#endif /* _ENIC_RQ_H_ */
diff --git a/drivers/net/ethernet/cisco/enic/vnic_rq.h b/drivers/net/ethernet/cisco/enic/vnic_rq.h
index 0bc595abc03b..2ee4be2b9a34 100644
--- a/drivers/net/ethernet/cisco/enic/vnic_rq.h
+++ b/drivers/net/ethernet/cisco/enic/vnic_rq.h
@@ -61,6 +61,8 @@ struct vnic_rq_buf {
unsigned int index;
void *desc;
uint64_t wr_id;
+ unsigned int offset;
+ unsigned int truesize;
};
enum enic_poll_state {
--
2.35.2
* [PATCH net-next v4 5/6] enic: Move RX coalescing set function
2025-01-02 22:24 [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets John Daley
` (3 preceding siblings ...)
2025-01-02 22:24 ` [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size John Daley
@ 2025-01-02 22:24 ` John Daley
2025-01-02 22:24 ` [PATCH net-next v4 6/6] enic: Obtain the Link speed only after the link comes up John Daley
5 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley, Nelson Escobar
Move the function used for setting the RX coalescing range to before
the function that checks the link status. It needs to be called from
there instead of from the probe function.
There is no functional change.
Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 60 ++++++++++-----------
1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 5bfd89749237..21cbd7ed7bda 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -428,6 +428,36 @@ static void enic_mtu_check(struct enic *enic)
}
}
+static void enic_set_rx_coal_setting(struct enic *enic)
+{
+ unsigned int speed;
+ int index = -1;
+ struct enic_rx_coal *rx_coal = &enic->rx_coalesce_setting;
+
+ /* 1. Read the link speed from fw
+ * 2. Pick the default range for the speed
+ * 3. Update it in enic->rx_coalesce_setting
+ */
+ speed = vnic_dev_port_speed(enic->vdev);
+ if (speed > ENIC_LINK_SPEED_10G)
+ index = ENIC_LINK_40G_INDEX;
+ else if (speed > ENIC_LINK_SPEED_4G)
+ index = ENIC_LINK_10G_INDEX;
+ else
+ index = ENIC_LINK_4G_INDEX;
+
+ rx_coal->small_pkt_range_start = mod_range[index].small_pkt_range_start;
+ rx_coal->large_pkt_range_start = mod_range[index].large_pkt_range_start;
+ rx_coal->range_end = ENIC_RX_COALESCE_RANGE_END;
+
+ /* Start with the value provided by UCSM */
+ for (index = 0; index < enic->rq_count; index++)
+ enic->cq[index].cur_rx_coal_timeval =
+ enic->config.intr_timer_usec;
+
+ rx_coal->use_adaptive_rx_coalesce = 1;
+}
+
static void enic_link_check(struct enic *enic)
{
int link_status = vnic_dev_link_status(enic->vdev);
@@ -1816,36 +1846,6 @@ static void enic_synchronize_irqs(struct enic *enic)
}
}
-static void enic_set_rx_coal_setting(struct enic *enic)
-{
- unsigned int speed;
- int index = -1;
- struct enic_rx_coal *rx_coal = &enic->rx_coalesce_setting;
-
- /* 1. Read the link speed from fw
- * 2. Pick the default range for the speed
- * 3. Update it in enic->rx_coalesce_setting
- */
- speed = vnic_dev_port_speed(enic->vdev);
- if (ENIC_LINK_SPEED_10G < speed)
- index = ENIC_LINK_40G_INDEX;
- else if (ENIC_LINK_SPEED_4G < speed)
- index = ENIC_LINK_10G_INDEX;
- else
- index = ENIC_LINK_4G_INDEX;
-
- rx_coal->small_pkt_range_start = mod_range[index].small_pkt_range_start;
- rx_coal->large_pkt_range_start = mod_range[index].large_pkt_range_start;
- rx_coal->range_end = ENIC_RX_COALESCE_RANGE_END;
-
- /* Start with the value provided by UCSM */
- for (index = 0; index < enic->rq_count; index++)
- enic->cq[index].cur_rx_coal_timeval =
- enic->config.intr_timer_usec;
-
- rx_coal->use_adaptive_rx_coalesce = 1;
-}
-
static int enic_dev_notify_set(struct enic *enic)
{
int err;
--
2.35.2
* [PATCH net-next v4 6/6] enic: Obtain the Link speed only after the link comes up
2025-01-02 22:24 [PATCH net-next v4 0/6] enic: Use Page Pool API for receiving packets John Daley
` (4 preceding siblings ...)
2025-01-02 22:24 ` [PATCH net-next v4 5/6] enic: Move RX coalescing set function John Daley
@ 2025-01-02 22:24 ` John Daley
5 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-02 22:24 UTC (permalink / raw)
To: benve, satishkh, andrew+netdev, davem, edumazet, kuba, pabeni,
netdev
Cc: John Daley, Nelson Escobar
The link speed is obtained in the RX adaptive coalescing function, which
was being called at probe time when the link may not be up yet. Change
the call to run after the link comes up.
The impact of not getting the correct link speed was that the low end of
the adaptive interrupt range was always being set to 0, which could have
caused a slight increase in the number of RX interrupts.
Co-developed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 21cbd7ed7bda..12678bcf96a6 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -109,7 +109,7 @@ static struct enic_intr_mod_table mod_table[ENIC_MAX_COALESCE_TIMERS + 1] = {
static struct enic_intr_mod_range mod_range[ENIC_MAX_LINK_SPEEDS] = {
{0, 0}, /* 0 - 4 Gbps */
{0, 3}, /* 4 - 10 Gbps */
- {3, 6}, /* 10 - 40 Gbps */
+ {3, 6}, /* 10+ Gbps */
};
static void enic_init_affinity_hint(struct enic *enic)
@@ -466,6 +466,7 @@ static void enic_link_check(struct enic *enic)
if (link_status && !carrier_ok) {
netdev_info(enic->netdev, "Link UP\n");
netif_carrier_on(enic->netdev);
+ enic_set_rx_coal_setting(enic);
} else if (!link_status && carrier_ok) {
netdev_info(enic->netdev, "Link DOWN\n");
netif_carrier_off(enic->netdev);
@@ -3016,7 +3017,6 @@ static int enic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
timer_setup(&enic->notify_timer, enic_notify_timer, 0);
enic_rfs_flw_tbl_init(enic);
- enic_set_rx_coal_setting(enic);
INIT_WORK(&enic->reset, enic_reset);
INIT_WORK(&enic->tx_hang_reset, enic_tx_hang_reset);
INIT_WORK(&enic->change_mtu_work, enic_change_mtu_work);
--
2.35.2
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-02 22:24 ` [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size John Daley
@ 2025-01-05 1:41 ` Jakub Kicinski
2025-01-06 21:54 ` John Daley
2025-01-10 23:52 ` John Daley
0 siblings, 2 replies; 17+ messages in thread
From: Jakub Kicinski @ 2025-01-05 1:41 UTC (permalink / raw)
To: John Daley
Cc: benve, satishkh, andrew+netdev, davem, edumazet, pabeni, netdev,
Nelson Escobar
On Thu, 2 Jan 2025 14:24:25 -0800 John Daley wrote:
> The Page Pool API improves bandwidth and CPU overhead by recycling
> pages instead of allocating new buffers in the driver. Make use of
> page pool fragment allocation for smaller MTUs so that multiple
> packets can share a page.
Why the MTU limitation? You can set page_pool_params.order
to an appropriate value and always use the page pool.
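Something along these lines, perhaps (rough, untested sketch; it reuses
the get_max_pkt_len() helper this series adds plus the kernel's
get_order()):
	/* size the page order from the MTU instead of keeping the
	 * netdev_alloc_skb_ip_align() path around for large MTUs
	 */
	pp_params.order = get_order(get_max_pkt_len(enic));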
> Added 'pp_alloc_error' per RQ ethtool statistic to count
> page_pool_dev_alloc() failures.
SG, but please don't report it via ethtool. Add it in
enic_get_queue_stats_rx() as alloc_fail (and enic_get_base_stats()).
As one of the benefits you'll be able to use
tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
to test this stat and error handling in the driver.
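Something like this, untested sketch with only the new counter shown
(assumes the pp_alloc_error counter from this patch and the alloc_fail
field of struct netdev_queue_stats_rx):
	static void enic_get_queue_stats_rx(struct net_device *dev, int idx,
					    struct netdev_queue_stats_rx *rxs)
	{
		struct enic *enic = netdev_priv(dev);
		struct enic_rq_stats *rqstats = &enic->rq[idx].stats;

		/* ... existing bytes/packets accounting ... */
		rxs->alloc_fail = rqstats->pp_alloc_error;
	}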
> +void enic_rq_page_cleanup(struct enic_rq *rq)
> +{
> + struct vnic_rq *vrq = &rq->vrq;
> + struct enic *enic = vnic_dev_priv(vrq->vdev);
> + struct napi_struct *napi = &enic->napi[vrq->index];
> +
> + napi_free_frags(napi);
why?
> + page_pool_destroy(rq->pool);
> +}
--
pw-bot: cr
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-05 1:41 ` Jakub Kicinski
@ 2025-01-06 21:54 ` John Daley
2025-01-07 0:19 ` Jakub Kicinski
2025-01-10 23:52 ` John Daley
1 sibling, 1 reply; 17+ messages in thread
From: John Daley @ 2025-01-06 21:54 UTC (permalink / raw)
To: kuba
Cc: andrew+netdev, benve, davem, edumazet, johndale, neescoba, netdev,
pabeni, satishkh
>> The Page Pool API improves bandwidth and CPU overhead by recycling
>> pages instead of allocating new buffers in the driver. Make use of
>> page pool fragment allocation for smaller MTUs so that multiple
>> packets can share a page.
>
>Why the MTU limitation? You can set page_pool_params.order
>to appropriate value always use the page pool.
I thought it might waste memory, e.g. allocating 16K for a 9000 MTU.
But now that you mention it, I see that the added code complexity is
probably not worth it. I am unclear on what to set pp_params.max_len
to when MTU > PAGE_SIZE. Order * PAGE_SIZE or the MTU size? In this
case the pages won't be fragmented, so isn't it only necessary for the
MTU-sized area to be DMA synced?
>
>> Added 'pp_alloc_error' per RQ ethtool statistic to count
>> page_pool_dev_alloc() failures.
>
>SG, but please don't report it via ethtool. Add it in
>enic_get_queue_stats_rx() as alloc_fail (and enic_get_base_stats()).
>As one of the benefits you'll be able to use
>tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
>to test this stat and error handling in the driver.
ok, will do.
>
>> +void enic_rq_page_cleanup(struct enic_rq *rq)
>> +{
>> + struct vnic_rq *vrq = &rq->vrq;
>> + struct enic *enic = vnic_dev_priv(vrq->vdev);
>> + struct napi_struct *napi = &enic->napi[vrq->index];
>> +
>> + napi_free_frags(napi);
>
>why?
Mistake, left over from a previous patch. Also, I will remove enic_rq_error_reset(),
which calls napi_free_frags() at a time when napi->skb is not owned by the driver.
>
>> + page_pool_destroy(rq->pool);
>> +}
I will make a v5 shortly. Would you recommend I split the patchset into 2 parts
as I think @andrew+netdev was suggesting? The last 2 patches are kind of unrelated
to the first 4.
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-06 21:54 ` John Daley
@ 2025-01-07 0:19 ` Jakub Kicinski
2025-01-07 3:00 ` John Daley
0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2025-01-07 0:19 UTC (permalink / raw)
To: John Daley
Cc: andrew+netdev, benve, davem, edumazet, neescoba, netdev, pabeni,
satishkh
On Mon, 6 Jan 2025 13:54:25 -0800 John Daley wrote:
> >> The Page Pool API improves bandwidth and CPU overhead by recycling
> >> pages instead of allocating new buffers in the driver. Make use of
> >> page pool fragment allocation for smaller MTUs so that multiple
> >> packets can share a page.
> >
> >Why the MTU limitation? You can set page_pool_params.order
> >to appropriate value always use the page pool.
>
> I thought it might waste memory, e.g. allocating 16K for 9000 mtu.
> But now that you mention it, I see that the added code complexity is
> probably not worth it. I am unclear on what to set pp_params.max_len
> to when MTU > PAGE_SIZE. Order * PAGE_SIZE or MTU size? In this case
> the pages won't be fragmented so isn't only necessary for the MTU sized
> area to be DMA SYNC'ed?
Good point. Once fragmentation is no longer possible you can
set .max_len to the size of the fragment HW may clobber,
and .offset to the reserved headroom.
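i.e. roughly (untested sketch; AFAICT enic doesn't reserve any headroom
today, so .offset would stay 0):
	pp_params.order = get_order(get_max_pkt_len(enic));
	pp_params.offset = 0;				/* reserved headroom */
	pp_params.max_len = get_max_pkt_len(enic);	/* all HW may write */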
> >
> >> + page_pool_destroy(rq->pool);
> >> +}
>
> I will make a v5 shortly. Would you recommend I split the patchset into 2 parts
> as I think @andrew+netdev was suggesting? The last 2 patches are kind of unrelated
> to the first 4.
Yes, seems like a good idea, patches 5 and 6 would probably have been
merged a while back if they were separate.
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-07 0:19 ` Jakub Kicinski
@ 2025-01-07 3:00 ` John Daley
2025-01-10 4:03 ` John Daley
0 siblings, 1 reply; 17+ messages in thread
From: John Daley @ 2025-01-07 3:00 UTC (permalink / raw)
To: kuba
Cc: andrew+netdev, benve, davem, edumazet, johndale, neescoba, netdev,
pabeni, satishkh
>> >> The Page Pool API improves bandwidth and CPU overhead by recycling
>> >> pages instead of allocating new buffers in the driver. Make use of
>> >> page pool fragment allocation for smaller MTUs so that multiple
>> >> packets can share a page.
>> >
>> >Why the MTU limitation? You can set page_pool_params.order
>> >to appropriate value always use the page pool.
>>
>> I thought it might waste memory, e.g. allocating 16K for 9000 mtu.
>> But now that you mention it, I see that the added code complexity is
>> probably not worth it. I am unclear on what to set pp_params.max_len
>> to when MTU > PAGE_SIZE. Order * PAGE_SIZE or MTU size? In this case
>> the pages won't be fragmented so isn't only necessary for the MTU sized
>> area to be DMA SYNC'ed?
>
>Good point, once fragmentation is no longer possible you can
>set .max_len to the size of the fragment HW may clobber,
>and .offset to the reserved headroom.
OK, testing is going well so far, but I need another day.
>
>> >
>> >> + page_pool_destroy(rq->pool);
>> >> +}
>>
>> I will make a v5 shortly. Would you recommend I split the patchset into 2 parts
>> as I think @andrew+netdev was suggesting? The last 2 patches are kind of unrelated
>> to the first 4.
>
>Yes, seems like a good idea, patches 5 and 6 would probably have been
>merged a while back if they were separate.
OK, I submitted patches 5 and 6 as a separate trivial patchset. Hopefully it will be
merged by the time my testing of the page_pool changes is complete so I can submit
on top of them. Thanks!
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-07 3:00 ` John Daley
@ 2025-01-10 4:03 ` John Daley
2025-01-11 0:38 ` Jakub Kicinski
0 siblings, 1 reply; 17+ messages in thread
From: John Daley @ 2025-01-10 4:03 UTC (permalink / raw)
To: johndale
Cc: andrew+netdev, benve, davem, edumazet, kuba, neescoba, netdev,
pabeni, satishkh
On 1/6/25, 7:00 PM, "John Daley" <johndale@cisco.com> wrote:
>
>>> >> The Page Pool API improves bandwidth and CPU overhead by recycling
>>> >> pages instead of allocating new buffers in the driver. Make use of
>>> >> page pool fragment allocation for smaller MTUs so that multiple
>>> >> packets can share a page.
>>> >
>>> >Why the MTU limitation? You can set page_pool_params.order
>>> >to appropriate value always use the page pool.
>>>
>>> I thought it might waste memory, e.g. allocating 16K for 9000 mtu.
>>> But now that you mention it, I see that the added code complexity is
>>> probably not worth it. I am unclear on what to set pp_params.max_len
>>> to when MTU > PAGE_SIZE. Order * PAGE_SIZE or MTU size? In this case
>>> the pages won't be fragmented so isn't only necessary for the MTU sized
>>> area to be DMA SYNC'ed?
>>
>>Good point, once fragmentation is no longer possible you can
>>set .max_len to the size of the fragment HW may clobber,
>>and .offset to the reserved headroom.
>
>Ok, testing going good so far, but need another day.
Testing is OK, but we are concerned about the extra memory usage when
the order is greater than 0, especially for a 9000 MTU, where order 2
would mean allocating an extra unused page per buffer. This could
impact scaled-up installations with memory constraints. For this
reason we would like to limit the use of the page pool to
MTU <= PAGE_SIZE for now, so that the order is 0.
Our newer hardware supports using multiple order-0 pages for large
MTUs, and we will submit a patch for that in the future.
I will make a v5 patchset with the napi_free_frags and pp_alloc_error
changes already discussed. Thanks, John
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-05 1:41 ` Jakub Kicinski
2025-01-06 21:54 ` John Daley
@ 2025-01-10 23:52 ` John Daley
2025-01-11 0:41 ` Jakub Kicinski
1 sibling, 1 reply; 17+ messages in thread
From: John Daley @ 2025-01-10 23:52 UTC (permalink / raw)
To: kuba
Cc: andrew+netdev, benve, davem, edumazet, johndale, neescoba, netdev,
pabeni, satishkh
On 1/4/25, 5:42 PM, "Jakub Kicinski" kuba@kernel.org wrote:
>On Thu, 2 Jan 2025 14:24:25 -0800 John Daley wrote:
>> The Page Pool API improves bandwidth and CPU overhead by recycling
>> pages instead of allocating new buffers in the driver. Make use of
>> page pool fragment allocation for smaller MTUs so that multiple
>> packets can share a page.
>
>Why the MTU limitation? You can set page_pool_params.order
>to appropriate value always use the page pool.
>
>> Added 'pp_alloc_error' per RQ ethtool statistic to count
>> page_pool_dev_alloc() failures.
>
>SG, but please don't report it via ethtool. Add it in
>enic_get_queue_stats_rx() as alloc_fail (and enic_get_base_stats()).
>As one of the benefits you'll be able to use
>tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
>to test this stat and error handling in the driver.
FYI, after making the suggested change I ran pp_alloc_fail.py but no
errors were injected. I think the path from page_pool_dev_alloc()
does not call page_pool_alloc_pages()?
Here is what I believe the call path is:
page_pool_dev_alloc(rq->pool, &offset, &truesize)
page_pool_alloc(pool, offset, size, gfp)
netmem_to_page(page_pool_alloc_netmem(pool, offset, size, gfp));
page_pool_alloc_frag_netmem(pool, offset, *size, gfp);
page_pool_alloc_netmems(pool, gfp);
__page_pool_alloc_pages_slow(pool, gfp);
If I change the call from page_pool_dev_alloc() to
page_pool_dev_alloc_pages() in the driver I do see the errors injected.
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-10 4:03 ` John Daley
@ 2025-01-11 0:38 ` Jakub Kicinski
2025-01-14 21:23 ` John Daley
0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2025-01-11 0:38 UTC (permalink / raw)
To: John Daley
Cc: andrew+netdev, benve, davem, edumazet, neescoba, netdev, pabeni,
satishkh
On Thu, 9 Jan 2025 20:03:02 -0800 John Daley wrote:
> >>Good point, once fragmentation is no longer possible you can
> >>set .max_len to the size of the fragment HW may clobber,
> >>and .offset to the reserved headroom.
> >
> >Ok, testing going good so far, but need another day.
>
> Testing is OK, but we are concerned about extra memory usage when order
> is greater than 0. Especially for 9000 MTU where order 2 would mean
> allocating an extra unused page per buffer. This could impact scaled up
> installations with memory constraints. For this reason we would like to
> limit the use of page pool to MTU <= PAGE_SIZE for now so that order is
> 0.
And if you don't use the page pool, what would be the allocation size
for a 9k MTU if you don't have scatter? I think you're allocating
linear skbs, which IIRC will round up to the next power of 2...
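Rough numbers, assuming 4K pages:
	linear skb: 9000 (MTU) + 18 (VLAN_ETH_HLEN) + NET_SKB_PAD
	            + sizeof(struct skb_shared_info) -> a bit over 9K,
	            rounded up to a 16K allocation
	page pool:  order 2 -> 16K per buffer
so the skb path isn't actually saving memory at 9k MTU.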
> Our newer hardware supports using multiple 0 order pages for large MTUs
> and we will submit a patch for that in the future.
>
> I will make a v5 patchset with the napi_free_frags and pp_alloc_error
> changes already discussed. Thanks, John
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-10 23:52 ` John Daley
@ 2025-01-11 0:41 ` Jakub Kicinski
2025-01-14 21:13 ` John Daley
0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2025-01-11 0:41 UTC (permalink / raw)
To: John Daley
Cc: andrew+netdev, benve, davem, edumazet, neescoba, netdev, pabeni,
satishkh
On Fri, 10 Jan 2025 15:52:04 -0800 John Daley wrote:
> >SG, but please don't report it via ethtool. Add it in
> >enic_get_queue_stats_rx() as alloc_fail (and enic_get_base_stats()).
> >As one of the benefits you'll be able to use
> >tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
> >to test this stat and error handling in the driver.
>
> Fyi, after making suggested change I used pp_alloc_fail.py but no
> errors were injected. I think the path from page_pool_dev_alloc()
> does not call page_pool_alloc_pages()?
>
> Here is what I beleive the call path is:
> page_pool_dev_alloc(rq->pool, &offset, &truesize)
> page_pool_alloc(pool, offset, size, gfp)
> netmem_to_page(page_pool_alloc_netmem(pool, offset, size, gfp));
> page_pool_alloc_frag_netmem(pool, offset, *size, gfp);
> page_pool_alloc_netmems(pool, gfp);
> __page_pool_alloc_pages_slow(pool, gfp);
>
> If I change the call from page_pool_dev_alloc() to
> page_pool_dev_alloc_pages() in the driver I do see the errors injected.
Ah, good point. I think the netmems conversion broke it :(
If we moved the error injection to happen on page_pool_alloc_netmems
it would work, right? Would I be able to convince you to test that
and send a patch? :)
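Untested sketch of what I mean (assuming the injection hook is the
ALLOW_ERROR_INJECTION() annotation in net/core/page_pool.c, and that
pp_alloc_fail.py gets updated to inject on the new symbol):
	-ALLOW_ERROR_INJECTION(page_pool_alloc_pages, NULL);
	+ALLOW_ERROR_INJECTION(page_pool_alloc_netmems, NULL);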
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-11 0:41 ` Jakub Kicinski
@ 2025-01-14 21:13 ` John Daley
0 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-14 21:13 UTC (permalink / raw)
To: kuba
Cc: andrew+netdev, benve, davem, edumazet, johndale, neescoba, netdev,
pabeni, satishkh
>On 1/10/25, 4:42 PM, "Jakub Kicinski" kuba@kernel.org wrote:
>
>On Fri, 10 Jan 2025 15:52:04 -0800 John Daley wrote:
>> >SG, but please don't report it via ethtool. Add it in
>> >enic_get_queue_stats_rx() as alloc_fail (and enic_get_base_stats()).
>> >As one of the benefits you'll be able to use
>> >tools/testing/selftests/drivers/net/hw/pp_alloc_fail.py
>> >to test this stat and error handling in the driver.
>>
>> Fyi, after making suggested change I used pp_alloc_fail.py but no
>> errors were injected. I think the path from page_pool_dev_alloc()
>> does not call page_pool_alloc_pages()?
>>
>> Here is what I beleive the call path is:
>> page_pool_dev_alloc(rq->pool, &offset, &truesize)
>> page_pool_alloc(pool, offset, size, gfp)
>> netmem_to_page(page_pool_alloc_netmem(pool, offset, size, gfp));
>> page_pool_alloc_frag_netmem(pool, offset, *size, gfp);
>> page_pool_alloc_netmems(pool, gfp);
>> __page_pool_alloc_pages_slow(pool, gfp);
>>
>> If I change the call from page_pool_dev_alloc() to
>> page_pool_dev_alloc_pages() in the driver I do see the errors injected.
>
>Ah, good point. I think the netmems conversion broke it :(
>If we moved the error injection to happen on page_pool_alloc_netmems
>it would work, right? Would I be able to convince you to test that
>and send a patch? :)
Sure, I'll give it a shot.
* Re: [PATCH net-next v4 4/6] enic: Use the Page Pool API for RX when MTU is less than page size
2025-01-11 0:38 ` Jakub Kicinski
@ 2025-01-14 21:23 ` John Daley
0 siblings, 0 replies; 17+ messages in thread
From: John Daley @ 2025-01-14 21:23 UTC (permalink / raw)
To: kuba
Cc: andrew+netdev, benve, davem, edumazet, johndale, neescoba, netdev,
pabeni, satishkh
>On 1/10/25, 4:38 PM, "Jakub Kicinski" kuba@kernel.org wrote:
>
>On Thu, 9 Jan 2025 20:03:02 -0800 John Daley wrote:
>> >>Good point, once fragmentation is no longer possible you can
>> >>set .max_len to the size of the fragment HW may clobber,
>> >>and .offset to the reserved headroom.
>> >
>> >Ok, testing going good so far, but need another day.
>>
>> Testing is OK, but we are concerned about extra memory usage when order
>> is greater than 0. Especially for 9000 MTU where order 2 would mean
>> allocating an extra unused page per buffer. This could impact scaled up
>> installations with memory constraints. For this reason we would like to
>> limit the use of page pool to MTU <= PAGE_SIZE for now so that order is
>> 0.
>
>And if you don't use the page pool what would be the allocation size
>for 9k MTU if you don't have scatter? I think you're allocating linear
>skbs, which IIRC will round up to the next power of 2...
Right, I now realize the linear skb allocation does round up, so it is
using the same amount of memory as the page pool for a 9000 MTU. I am
spinning a new patch set where only the page pool is used, since the
code will be less complicated. Thanks!
>
>> Our newer hardware supports using multiple 0 order pages for large MTUs
>> and we will submit a patch for that in the future.
>>
>> I will make a v5 patchset with the napi_free_frags and pp_alloc_error
>> changes already discussed. Thanks, John