* [PATCH v6 RESEND net-next 0/5] Add more features for ENETC v4 - round 1
@ 2024-12-04 5:29 Wei Fang
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC Wei Fang
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: Wei Fang @ 2024-12-04 5:29 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li
Cc: netdev, linux-kernel, imx
Compared to ENETC v1 (LS1028A), ENETC v4 (i.MX95) adds more features, and
some features are configured completely differently from v1. To support
ENETC v4 more fully, these features will be added over several rounds of
patch sets. This round adds Rx and Tx checksum offload, raises the maximum
number of chained Tx BDs, and adds large send offload (LSO).
---
v1 Link: https://lore.kernel.org/imx/20241107033817.1654163-1-wei.fang@nxp.com/
v2 Link: https://lore.kernel.org/imx/20241111015216.1804534-1-wei.fang@nxp.com/
v3 Link: https://lore.kernel.org/imx/20241112091447.1850899-1-wei.fang@nxp.com/
v4 Link: https://lore.kernel.org/imx/20241115024744.1903377-1-wei.fang@nxp.com/
v5 Link: https://lore.kernel.org/imx/20241118060630.1956134-1-wei.fang@nxp.com/
v6 Link: https://lore.kernel.org/imx/20241119082344.2022830-1-wei.fang@nxp.com/
---
Wei Fang (5):
net: enetc: add Rx checksum offload for i.MX95 ENETC
net: enetc: add Tx checksum offload for i.MX95 ENETC
net: enetc: update max chained Tx BD number for i.MX95 ENETC
net: enetc: add LSO support for i.MX95 ENETC PF
net: enetc: add UDP segmentation offload support
drivers/net/ethernet/freescale/enetc/enetc.c | 333 ++++++++++++++++--
drivers/net/ethernet/freescale/enetc/enetc.h | 32 +-
.../net/ethernet/freescale/enetc/enetc4_hw.h | 22 ++
.../net/ethernet/freescale/enetc/enetc_hw.h | 31 +-
.../freescale/enetc/enetc_pf_common.c | 16 +-
.../net/ethernet/freescale/enetc/enetc_vf.c | 7 +-
6 files changed, 407 insertions(+), 34 deletions(-)
--
2.34.1
* [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-04 5:29 [PATCH v6 RESEND net-next 0/5] Add more features for ENETC v4 - round 1 Wei Fang
@ 2024-12-04 5:29 ` Wei Fang
2024-12-06 9:23 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx " Wei Fang
` (3 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-04 5:29 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li
Cc: netdev, linux-kernel, imx
ENETC rev 4.1 supports TCP and UDP checksum offload on receive: bit 108
of the Rx BD is set if the TCP/UDP checksum is correct. Since this
capability is not advertised in any register, the rx_csum bit is added to
struct enetc_drvdata to indicate whether the device supports Rx checksum
offload.
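A minimal standalone sketch of the decision this adds to the receive path.
The demo_* names are hypothetical stand-ins for the kernel types, and the
flag position assumes the 16-bit flags word starts at bit 96 of the Rx BD,
so that bit 108 of the descriptor is bit 12 of the flags (108 - 96 = 12),
which agrees with the BIT(12) definition in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 12 of the Rx BD flags word: L4 (TCP/UDP) checksum verified by HW. */
#define DEMO_RXBD_FLAG_L4_CSUM_OK (1u << 12)

/* Hypothetical stand-ins for the kernel's skb->ip_summed values. */
enum demo_ip_summed {
	DEMO_CHECKSUM_COMPLETE,     /* stack verifies using the HW sum */
	DEMO_CHECKSUM_UNNECESSARY,  /* HW already verified the L4 checksum */
};

/*
 * hw_rx_csum: the device supports Rx checksum offload (rx_csum drvdata bit).
 * bd_flags:   CPU-endian copy of the Rx BD flags word.
 */
static enum demo_ip_summed demo_rx_csum_mode(bool hw_rx_csum, uint16_t bd_flags)
{
	if (hw_rx_csum && (bd_flags & DEMO_RXBD_FLAG_L4_CSUM_OK))
		return DEMO_CHECKSUM_UNNECESSARY;

	/* Older hardware, or checksum not confirmed: fall back. */
	return DEMO_CHECKSUM_COMPLETE;
}
```

When the flag is clear, or the device lacks the capability, the sketch falls
back to the pre-existing behaviour of handing the hardware-computed sum to
the stack.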
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
v2: no changes
v3: no changes
v4: no changes
v5: no changes
v6: no changes
---
drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
.../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
4 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 35634c516e26..3137b6ee62d3 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
/* TODO: hashing */
if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
- u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
-
- skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
- skb->ip_summed = CHECKSUM_COMPLETE;
+ if (priv->active_offloads & ENETC_F_RXCSUM &&
+ le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK) {
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ } else {
+ u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
+
+ skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
+ skb->ip_summed = CHECKSUM_COMPLETE;
+ }
}
if (le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_VLAN) {
@@ -3281,6 +3286,7 @@ static const struct enetc_drvdata enetc_pf_data = {
static const struct enetc_drvdata enetc4_pf_data = {
.sysclk_freq = ENETC_CLK_333M,
.pmac_offset = ENETC4_PMAC_OFFSET,
+ .rx_csum = 1,
.eth_ops = &enetc4_pf_ethtool_ops,
};
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 72fa03dbc2dd..5b65f79e05be 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -234,6 +234,7 @@ enum enetc_errata {
struct enetc_drvdata {
u32 pmac_offset; /* Only valid for PSI which supports 802.1Qbu */
+ u8 rx_csum:1;
u64 sysclk_freq;
const struct ethtool_ops *eth_ops;
};
@@ -341,6 +342,7 @@ enum enetc_active_offloads {
ENETC_F_QBV = BIT(9),
ENETC_F_QCI = BIT(10),
ENETC_F_QBU = BIT(11),
+ ENETC_F_RXCSUM = BIT(12),
};
enum enetc_flags_bit {
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index 7c3285584f8a..4b8fd1879005 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -645,6 +645,8 @@ union enetc_rx_bd {
#define ENETC_RXBD_LSTATUS(flags) ((flags) << 16)
#define ENETC_RXBD_FLAG_VLAN BIT(9)
#define ENETC_RXBD_FLAG_TSTMP BIT(10)
+/* UDP and TCP checksum offload, for ENETC 4.1 and later */
+#define ENETC_RXBD_FLAG_L4_CSUM_OK BIT(12)
#define ENETC_RXBD_FLAG_TPID GENMASK(1, 0)
#define ENETC_MAC_ADDR_FILT_CNT 8 /* # of supported entries per port */
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
index 0eecfc833164..91e79582a541 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
@@ -119,6 +119,9 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
ndev->priv_flags |= IFF_UNICAST_FLT;
+ if (si->drvdata->rx_csum)
+ priv->active_offloads |= ENETC_F_RXCSUM;
+
/* TODO: currently, i.MX95 ENETC driver does not support advanced features */
if (!is_enetc_rev1(si)) {
ndev->hw_features &= ~(NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_LOOPBACK);
--
2.34.1
* [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum offload for i.MX95 ENETC
2024-12-04 5:29 [PATCH v6 RESEND net-next 0/5] Add more features for ENETC v4 - round 1 Wei Fang
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC Wei Fang
@ 2024-12-04 5:29 ` Wei Fang
2024-12-06 9:37 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 3/5] net: enetc: update max chained Tx BD number " Wei Fang
` (2 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-04 5:29 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li
Cc: netdev, linux-kernel, imx
In addition to supporting Rx checksum offload, i.MX95 ENETC also supports
Tx checksum offload. The transmit checksum offload is implemented through
the Tx BD. To support Tx checksum offload, software needs to fill in some
auxiliary information in the Tx BD, such as the IP version, the IP header
offset and size, whether L4 is UDP or TCP, etc.
As with Rx checksum offload, the Tx checksum offload capability isn't
advertised in any register, so a tx_csum bit is added to struct
enetc_drvdata to indicate whether the device supports Tx checksum offload.
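A rough standalone sketch of how the auxiliary fields can be derived from
the checksum offset alone. The demo_* structs are hypothetical and only
mirror the on-wire UDP and TCP header layouts (checksum at byte 6 and byte
16 respectively); the L4 type values mirror ENETC_TXBD_L4T_UDP and
ENETC_TXBD_L4T_TCP from the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the on-wire L4 headers (no padding needed). */
struct demo_udphdr {
	uint16_t source, dest, len, check;
};

struct demo_tcphdr {
	uint16_t source, dest;
	uint32_t seq, ack_seq;
	uint16_t flags, window, check, urg_ptr;
};

enum demo_l4t {
	DEMO_L4T_NONE = 0,  /* not offloadable: compute checksum in software */
	DEMO_L4T_UDP  = 1,  /* BIT(0) */
	DEMO_L4T_TCP  = 2,  /* BIT(1) */
};

/* Classify the L4 protocol from the offset of its checksum field. */
static enum demo_l4t demo_l4_type(uint16_t csum_offset)
{
	switch (csum_offset) {
	case offsetof(struct demo_tcphdr, check):
		return DEMO_L4T_TCP;
	case offsetof(struct demo_udphdr, check):
		return DEMO_L4T_UDP;
	default:
		return DEMO_L4T_NONE;
	}
}

/* The BD stores the L3 header size in 32-bit words, not bytes. */
static uint8_t demo_l3_hdr_words(uint16_t l3_hdr_bytes)
{
	return (uint8_t)(l3_hdr_bytes / 4);
}
```

Keying on the checksum offset avoids parsing packet data: a plain IPv4
header of 20 bytes becomes 5 words, and an IPv6 header of 40 bytes becomes
10.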
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
v2: refine enetc_tx_csum_offload_check().
v3:
1. refine enetc_tx_csum_offload_check() and enetc_skb_is_tcp() through
skb->csum_offset instead of touching skb->data.
2. add enetc_skb_is_ipv6() helper function
v4: no changes
v5:
1. remove 'inline' from enetc_skb_is_ipv6() and enetc_skb_is_tcp().
2. temp_bd.ipcs does not need to be set because Linux always calculates
the IPv4 checksum, so remove it.
3. simplify the setting of temp_bd.l3t.
4. remove the error log from the datapath
v6: no changes
---
drivers/net/ethernet/freescale/enetc/enetc.c | 47 ++++++++++++++++---
drivers/net/ethernet/freescale/enetc/enetc.h | 2 +
.../net/ethernet/freescale/enetc/enetc_hw.h | 14 ++++--
.../freescale/enetc/enetc_pf_common.c | 3 ++
4 files changed, 56 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 3137b6ee62d3..94a78dca86e1 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -143,6 +143,27 @@ static int enetc_ptp_parse(struct sk_buff *skb, u8 *udp,
return 0;
}
+static bool enetc_tx_csum_offload_check(struct sk_buff *skb)
+{
+ switch (skb->csum_offset) {
+ case offsetof(struct tcphdr, check):
+ case offsetof(struct udphdr, check):
+ return true;
+ default:
+ return false;
+ }
+}
+
+static bool enetc_skb_is_ipv6(struct sk_buff *skb)
+{
+ return vlan_get_protocol(skb) == htons(ETH_P_IPV6);
+}
+
+static bool enetc_skb_is_tcp(struct sk_buff *skb)
+{
+ return skb->csum_offset == offsetof(struct tcphdr, check);
+}
+
static int enetc_map_tx_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb)
{
bool do_vlan, do_onestep_tstamp = false, do_twostep_tstamp = false;
@@ -160,6 +181,23 @@ static int enetc_map_tx_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb)
dma_addr_t dma;
u8 flags = 0;
+ enetc_clear_tx_bd(&temp_bd);
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ /* Can not support TSD and checksum offload at the same time */
+ if (priv->active_offloads & ENETC_F_TXCSUM &&
+ enetc_tx_csum_offload_check(skb) && !tx_ring->tsd_enable) {
+ temp_bd.l3_start = skb_network_offset(skb);
+ temp_bd.l3_hdr_size = skb_network_header_len(skb) / 4;
+ temp_bd.l3t = enetc_skb_is_ipv6(skb);
+ temp_bd.l4t = enetc_skb_is_tcp(skb) ? ENETC_TXBD_L4T_TCP :
+ ENETC_TXBD_L4T_UDP;
+ flags |= ENETC_TXBD_FLAGS_CSUM_LSO | ENETC_TXBD_FLAGS_L4CS;
+ } else {
+ if (skb_checksum_help(skb))
+ return 0;
+ }
+ }
+
i = tx_ring->next_to_use;
txbd = ENETC_TXBD(*tx_ring, i);
prefetchw(txbd);
@@ -170,7 +208,6 @@ static int enetc_map_tx_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb)
temp_bd.addr = cpu_to_le64(dma);
temp_bd.buf_len = cpu_to_le16(len);
- temp_bd.lstatus = 0;
tx_swbd = &tx_ring->tx_swbd[i];
tx_swbd->dma = dma;
@@ -591,7 +628,7 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
{
struct enetc_ndev_priv *priv = netdev_priv(ndev);
struct enetc_bdr *tx_ring;
- int count, err;
+ int count;
/* Queue one-step Sync packet if already locked */
if (skb->cb[0] & ENETC_F_TX_ONESTEP_SYNC_TSTAMP) {
@@ -624,11 +661,6 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
return NETDEV_TX_BUSY;
}
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
- err = skb_checksum_help(skb);
- if (err)
- goto drop_packet_err;
- }
enetc_lock_mdio();
count = enetc_map_tx_buffs(tx_ring, skb);
enetc_unlock_mdio();
@@ -3287,6 +3319,7 @@ static const struct enetc_drvdata enetc4_pf_data = {
.sysclk_freq = ENETC_CLK_333M,
.pmac_offset = ENETC4_PMAC_OFFSET,
.rx_csum = 1,
+ .tx_csum = 1,
.eth_ops = &enetc4_pf_ethtool_ops,
};
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 5b65f79e05be..ee11ff97e9ed 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -235,6 +235,7 @@ enum enetc_errata {
struct enetc_drvdata {
u32 pmac_offset; /* Only valid for PSI which supports 802.1Qbu */
u8 rx_csum:1;
+ u8 tx_csum:1;
u64 sysclk_freq;
const struct ethtool_ops *eth_ops;
};
@@ -343,6 +344,7 @@ enum enetc_active_offloads {
ENETC_F_QCI = BIT(10),
ENETC_F_QBU = BIT(11),
ENETC_F_RXCSUM = BIT(12),
+ ENETC_F_TXCSUM = BIT(13),
};
enum enetc_flags_bit {
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index 4b8fd1879005..590b1412fadf 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -558,7 +558,12 @@ union enetc_tx_bd {
__le16 frm_len;
union {
struct {
- u8 reserved[3];
+ u8 l3_start:7;
+ u8 ipcs:1;
+ u8 l3_hdr_size:7;
+ u8 l3t:1;
+ u8 resv:5;
+ u8 l4t:3;
u8 flags;
}; /* default layout */
__le32 txstart;
@@ -582,10 +587,10 @@ union enetc_tx_bd {
};
enum enetc_txbd_flags {
- ENETC_TXBD_FLAGS_RES0 = BIT(0), /* reserved */
+ ENETC_TXBD_FLAGS_L4CS = BIT(0), /* For ENETC 4.1 and later */
ENETC_TXBD_FLAGS_TSE = BIT(1),
ENETC_TXBD_FLAGS_W = BIT(2),
- ENETC_TXBD_FLAGS_RES3 = BIT(3), /* reserved */
+ ENETC_TXBD_FLAGS_CSUM_LSO = BIT(3), /* For ENETC 4.1 and later */
ENETC_TXBD_FLAGS_TXSTART = BIT(4),
ENETC_TXBD_FLAGS_EX = BIT(6),
ENETC_TXBD_FLAGS_F = BIT(7)
@@ -594,6 +599,9 @@ enum enetc_txbd_flags {
#define ENETC_TXBD_TXSTART_MASK GENMASK(24, 0)
#define ENETC_TXBD_FLAGS_OFFSET 24
+#define ENETC_TXBD_L4T_UDP BIT(0)
+#define ENETC_TXBD_L4T_TCP BIT(1)
+
static inline __le32 enetc_txbd_set_tx_start(u64 tx_start, u8 flags)
{
u32 temp;
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
index 91e79582a541..3a8a5b6d8c26 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
@@ -122,6 +122,9 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
if (si->drvdata->rx_csum)
priv->active_offloads |= ENETC_F_RXCSUM;
+ if (si->drvdata->tx_csum)
+ priv->active_offloads |= ENETC_F_TXCSUM;
+
/* TODO: currently, i.MX95 ENETC driver does not support advanced features */
if (!is_enetc_rev1(si)) {
ndev->hw_features &= ~(NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_LOOPBACK);
--
2.34.1
* [PATCH v6 RESEND net-next 3/5] net: enetc: update max chained Tx BD number for i.MX95 ENETC
2024-12-04 5:29 [PATCH v6 RESEND net-next 0/5] Add more features for ENETC v4 - round 1 Wei Fang
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC Wei Fang
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx " Wei Fang
@ 2024-12-04 5:29 ` Wei Fang
2024-12-06 10:11 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF Wei Fang
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support Wei Fang
4 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-04 5:29 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li
Cc: netdev, linux-kernel, imx
The maximum number of chained Tx BDs on the latest ENETC (i.MX95 ENETC,
rev 4.1) has been increased to 63. Since MAX_SKB_FRAGS ranges from 17 to
45, it is better to set ENETC4_MAX_SKB_FRAGS to MAX_SKB_FRAGS for i.MX95
ENETC and later revisions.
In addition, add max_frags to struct enetc_drvdata to indicate the maximum
number of chained BDs supported by the device, because that number differs
between LS1028A and i.MX95 ENETC.
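To make the effect on the queue stop/wake threshold concrete, here is a
small standalone sketch. The DEMO_* macros copy the arithmetic of
ENETC_TXBDS_NEEDED() and the parameterized ENETC_TXBDS_MAX_NEEDED() from
the diff; demo_should_stop_queue() is a hypothetical helper, not a driver
function.

```c
#include <stdbool.h>

/* Overhead per frame: optional extension BD + 1 BD gap. */
#define DEMO_TXBDS_NEEDED(val)     ((val) + 2)
/* Worst case: one BD for the linear area plus max_frags fragment BDs. */
#define DEMO_TXBDS_MAX_NEEDED(x)   DEMO_TXBDS_NEEDED((x) + 1)

/* Stop the Tx queue once a worst-case frame might no longer fit. */
static bool demo_should_stop_queue(int bd_unused, int max_frags)
{
	return bd_unused < DEMO_TXBDS_MAX_NEEDED(max_frags);
}
```

With max_frags = 13 (LS1028A) the threshold is 16 free BDs. With
MAX_SKB_FRAGS at its minimum of 17 the threshold rises to 20, and at the
maximum of 45 it is 48, still within the 63 chained BDs quoted above.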
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
v2:
1. Refine the commit message
2. Add Reviewed-by tag
v3: no changes
v4: no changes
v5: no changes
v6: no changes
---
drivers/net/ethernet/freescale/enetc/enetc.c | 13 +++++++++----
drivers/net/ethernet/freescale/enetc/enetc.h | 13 +++++++++++--
.../net/ethernet/freescale/enetc/enetc_pf_common.c | 1 +
drivers/net/ethernet/freescale/enetc/enetc_vf.c | 1 +
4 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 94a78dca86e1..dafe7aeac26b 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -525,6 +525,7 @@ static void enetc_tso_complete_csum(struct enetc_bdr *tx_ring, struct tso_t *tso
static int enetc_map_tx_tso_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb)
{
+ struct enetc_ndev_priv *priv = netdev_priv(tx_ring->ndev);
int hdr_len, total_len, data_len;
struct enetc_tx_swbd *tx_swbd;
union enetc_tx_bd *txbd;
@@ -590,7 +591,7 @@ static int enetc_map_tx_tso_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb
bd_data_num++;
tso_build_data(skb, &tso, size);
- if (unlikely(bd_data_num >= ENETC_MAX_SKB_FRAGS && data_len))
+ if (unlikely(bd_data_num >= priv->max_frags && data_len))
goto err_chained_bd;
}
@@ -651,7 +652,7 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
count = enetc_map_tx_tso_buffs(tx_ring, skb);
enetc_unlock_mdio();
} else {
- if (unlikely(skb_shinfo(skb)->nr_frags > ENETC_MAX_SKB_FRAGS))
+ if (unlikely(skb_shinfo(skb)->nr_frags > priv->max_frags))
if (unlikely(skb_linearize(skb)))
goto drop_packet_err;
@@ -669,7 +670,7 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
if (unlikely(!count))
goto drop_packet_err;
- if (enetc_bd_unused(tx_ring) < ENETC_TXBDS_MAX_NEEDED)
+ if (enetc_bd_unused(tx_ring) < ENETC_TXBDS_MAX_NEEDED(priv->max_frags))
netif_stop_subqueue(ndev, tx_ring->index);
return NETDEV_TX_OK;
@@ -937,7 +938,8 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)
if (unlikely(tx_frm_cnt && netif_carrier_ok(ndev) &&
__netif_subqueue_stopped(ndev, tx_ring->index) &&
!test_bit(ENETC_TX_DOWN, &priv->flags) &&
- (enetc_bd_unused(tx_ring) >= ENETC_TXBDS_MAX_NEEDED))) {
+ (enetc_bd_unused(tx_ring) >=
+ ENETC_TXBDS_MAX_NEEDED(priv->max_frags)))) {
netif_wake_subqueue(ndev, tx_ring->index);
}
@@ -3312,6 +3314,7 @@ EXPORT_SYMBOL_GPL(enetc_pci_remove);
static const struct enetc_drvdata enetc_pf_data = {
.sysclk_freq = ENETC_CLK_400M,
.pmac_offset = ENETC_PMAC_OFFSET,
+ .max_frags = ENETC_MAX_SKB_FRAGS,
.eth_ops = &enetc_pf_ethtool_ops,
};
@@ -3320,11 +3323,13 @@ static const struct enetc_drvdata enetc4_pf_data = {
.pmac_offset = ENETC4_PMAC_OFFSET,
.rx_csum = 1,
.tx_csum = 1,
+ .max_frags = ENETC4_MAX_SKB_FRAGS,
.eth_ops = &enetc4_pf_ethtool_ops,
};
static const struct enetc_drvdata enetc_vf_data = {
.sysclk_freq = ENETC_CLK_400M,
+ .max_frags = ENETC_MAX_SKB_FRAGS,
.eth_ops = &enetc_vf_ethtool_ops,
};
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index ee11ff97e9ed..a78af4f624e0 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -59,9 +59,16 @@ struct enetc_rx_swbd {
/* ENETC overhead: optional extension BD + 1 BD gap */
#define ENETC_TXBDS_NEEDED(val) ((val) + 2)
-/* max # of chained Tx BDs is 15, including head and extension BD */
+/* For LS1028A, max # of chained Tx BDs is 15, including head and
+ * extension BD.
+ */
#define ENETC_MAX_SKB_FRAGS 13
-#define ENETC_TXBDS_MAX_NEEDED ENETC_TXBDS_NEEDED(ENETC_MAX_SKB_FRAGS + 1)
+/* For ENETC v4 and later versions, max # of chained Tx BDs is 63,
+ * including head and extension BD, but the range of MAX_SKB_FRAGS
+ * is 17 ~ 45, so set ENETC4_MAX_SKB_FRAGS to MAX_SKB_FRAGS.
+ */
+#define ENETC4_MAX_SKB_FRAGS MAX_SKB_FRAGS
+#define ENETC_TXBDS_MAX_NEEDED(x) ENETC_TXBDS_NEEDED((x) + 1)
struct enetc_ring_stats {
unsigned int packets;
@@ -236,6 +243,7 @@ struct enetc_drvdata {
u32 pmac_offset; /* Only valid for PSI which supports 802.1Qbu */
u8 rx_csum:1;
u8 tx_csum:1;
+ u8 max_frags;
u64 sysclk_freq;
const struct ethtool_ops *eth_ops;
};
@@ -379,6 +387,7 @@ struct enetc_ndev_priv {
u16 msg_enable;
u8 preemptible_tcs;
+ u8 max_frags; /* The maximum number of BDs for fragments */
enum enetc_active_offloads active_offloads;
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
index 3a8a5b6d8c26..2c4c6af672e7 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
@@ -101,6 +101,7 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
priv->msg_enable = (NETIF_MSG_WOL << 1) - 1;
priv->sysclk_freq = si->drvdata->sysclk_freq;
+ priv->max_frags = si->drvdata->max_frags;
ndev->netdev_ops = ndev_ops;
enetc_set_ethtool_ops(ndev);
ndev->watchdog_timeo = 5 * HZ;
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_vf.c b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
index a5f8ce576b6e..63d78b2b8670 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
@@ -136,6 +136,7 @@ static void enetc_vf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
priv->msg_enable = (NETIF_MSG_IFUP << 1) - 1;
priv->sysclk_freq = si->drvdata->sysclk_freq;
+ priv->max_frags = si->drvdata->max_frags;
ndev->netdev_ops = ndev_ops;
enetc_set_ethtool_ops(ndev);
ndev->watchdog_timeo = 5 * HZ;
--
2.34.1
* [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF
2024-12-04 5:29 [PATCH v6 RESEND net-next 0/5] Add more features for ENETC v4 - round 1 Wei Fang
` (2 preceding siblings ...)
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 3/5] net: enetc: update max chained Tx BD number " Wei Fang
@ 2024-12-04 5:29 ` Wei Fang
2024-12-06 9:59 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support Wei Fang
4 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-04 5:29 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li
Cc: netdev, linux-kernel, imx
ENETC rev 4.1 supports large send offload (LSO), segmenting large TCP
and UDP transmit units into multiple Ethernet frames. To support LSO,
software needs to fill in some auxiliary information in the Tx BD, such
as the LSO header length, frame length, LSO maximum segment size, etc.
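As an illustration of the frame length encoding, the sketch below splits a
payload length into the 16-bit field of the first BD and the 4-bit
extension carried in the extended BD, using the same masks as the diff.
The demo_* names are hypothetical, and demo_lso_join_len() only reflects an
assumption about how the two fields recombine.

```c
#include <assert.h>
#include <stdint.h>

/* LSO data units of up to 256KB are supported. */
#define DEMO_LSO_MAX_DATA_LEN (256 * 1024)

struct demo_lso_len {
	uint16_t frm_len;      /* low 16 bits, goes in the first BD */
	uint16_t frm_len_ext;  /* next 4 bits, goes in the extended BD */
};

/* total_len is the data length, excluding the LSO header. */
static struct demo_lso_len demo_lso_split_len(uint32_t total_len)
{
	struct demo_lso_len out;

	assert(total_len <= DEMO_LSO_MAX_DATA_LEN);
	out.frm_len = total_len & 0xffff;
	out.frm_len_ext = (total_len >> 16) & 0xf;
	return out;
}

/* Assumed inverse: recombine the two fields into a 20-bit length. */
static uint32_t demo_lso_join_len(struct demo_lso_len l)
{
	return ((uint32_t)l.frm_len_ext << 16) | l.frm_len;
}
```

The 256KB limit (0x40000) needs bit 18, which is why 16 bits alone are not
enough and the extension field exists.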
At 1Gbps link rate, TCP segmentation was tested using iperf3, and the
CPU performance before and after applying the patch was compared through
the top command. It can be seen that LSO saves a significant amount of
CPU cycles compared to software TSO.
Before applying the patch:
%Cpu(s): 0.1 us, 4.1 sy, 0.0 ni, 85.7 id, 0.0 wa, 0.5 hi, 9.7 si
After applying the patch:
%Cpu(s): 0.1 us, 2.3 sy, 0.0 ni, 94.5 id, 0.0 wa, 0.4 hi, 2.6 si
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
v2: no changes
v3: use enetc_skb_is_ipv6() helper function which is added in patch 2
v4: fix a typo
v5: no changes
v6: remove error logs from the datapath
---
drivers/net/ethernet/freescale/enetc/enetc.c | 259 +++++++++++++++++-
drivers/net/ethernet/freescale/enetc/enetc.h | 15 +
.../net/ethernet/freescale/enetc/enetc4_hw.h | 22 ++
.../net/ethernet/freescale/enetc/enetc_hw.h | 15 +-
.../freescale/enetc/enetc_pf_common.c | 3 +
5 files changed, 304 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index dafe7aeac26b..82a7932725f9 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -523,6 +523,226 @@ static void enetc_tso_complete_csum(struct enetc_bdr *tx_ring, struct tso_t *tso
}
}
+static inline int enetc_lso_count_descs(const struct sk_buff *skb)
+{
+ /* 4 BDs: 1 BD for LSO header + 1 BD for extended BD + 1 BD
+ * for linear area data but not include LSO header, namely
+ * skb_headlen(skb) - lso_hdr_len. And 1 BD for gap.
+ */
+ return skb_shinfo(skb)->nr_frags + 4;
+}
+
+static int enetc_lso_get_hdr_len(const struct sk_buff *skb)
+{
+ int hdr_len, tlen;
+
+ tlen = skb_is_gso_tcp(skb) ? tcp_hdrlen(skb) : sizeof(struct udphdr);
+ hdr_len = skb_transport_offset(skb) + tlen;
+
+ return hdr_len;
+}
+
+static void enetc_lso_start(struct sk_buff *skb, struct enetc_lso_t *lso)
+{
+ lso->lso_seg_size = skb_shinfo(skb)->gso_size;
+ lso->ipv6 = enetc_skb_is_ipv6(skb);
+ lso->tcp = skb_is_gso_tcp(skb);
+ lso->l3_hdr_len = skb_network_header_len(skb);
+ lso->l3_start = skb_network_offset(skb);
+ lso->hdr_len = enetc_lso_get_hdr_len(skb);
+ lso->total_len = skb->len - lso->hdr_len;
+}
+
+static void enetc_lso_map_hdr(struct enetc_bdr *tx_ring, struct sk_buff *skb,
+ int *i, struct enetc_lso_t *lso)
+{
+ union enetc_tx_bd txbd_tmp, *txbd;
+ struct enetc_tx_swbd *tx_swbd;
+ u16 frm_len, frm_len_ext;
+ u8 flags, e_flags = 0;
+ dma_addr_t addr;
+ char *hdr;
+
+ /* Get the first BD of the LSO BDs chain */
+ txbd = ENETC_TXBD(*tx_ring, *i);
+ tx_swbd = &tx_ring->tx_swbd[*i];
+ prefetchw(txbd);
+
+ /* Prepare LSO header: MAC + IP + TCP/UDP */
+ hdr = tx_ring->tso_headers + *i * TSO_HEADER_SIZE;
+ memcpy(hdr, skb->data, lso->hdr_len);
+ addr = tx_ring->tso_headers_dma + *i * TSO_HEADER_SIZE;
+
+ frm_len = lso->total_len & 0xffff;
+ frm_len_ext = (lso->total_len >> 16) & 0xf;
+
+ /* Set the flags of the first BD */
+ flags = ENETC_TXBD_FLAGS_EX | ENETC_TXBD_FLAGS_CSUM_LSO |
+ ENETC_TXBD_FLAGS_LSO | ENETC_TXBD_FLAGS_L4CS;
+
+ enetc_clear_tx_bd(&txbd_tmp);
+ txbd_tmp.addr = cpu_to_le64(addr);
+ txbd_tmp.hdr_len = cpu_to_le16(lso->hdr_len);
+
+ /* first BD needs frm_len and offload flags set */
+ txbd_tmp.frm_len = cpu_to_le16(frm_len);
+ txbd_tmp.flags = flags;
+
+ if (lso->tcp)
+ txbd_tmp.l4t = ENETC_TXBD_L4T_TCP;
+ else
+ txbd_tmp.l4t = ENETC_TXBD_L4T_UDP;
+
+ if (lso->ipv6)
+ txbd_tmp.l3t = 1;
+ else
+ txbd_tmp.ipcs = 1;
+
+ /* l3_hdr_size in 32-bits (4 bytes) */
+ txbd_tmp.l3_hdr_size = lso->l3_hdr_len / 4;
+ txbd_tmp.l3_start = lso->l3_start;
+
+ /* For the LSO header we do not set the dma address since
+ * we do not want it unmapped when we do cleanup. We still
+ * set len so that we count the bytes sent.
+ */
+ tx_swbd->len = lso->hdr_len;
+ tx_swbd->do_twostep_tstamp = false;
+ tx_swbd->check_wb = false;
+
+ /* Actually write the header in the BD */
+ *txbd = txbd_tmp;
+
+ /* Get the next BD, and the next BD is extended BD */
+ enetc_bdr_idx_inc(tx_ring, i);
+ txbd = ENETC_TXBD(*tx_ring, *i);
+ tx_swbd = &tx_ring->tx_swbd[*i];
+ prefetchw(txbd);
+
+ enetc_clear_tx_bd(&txbd_tmp);
+ if (skb_vlan_tag_present(skb)) {
+ /* Setup the VLAN fields */
+ txbd_tmp.ext.vid = cpu_to_le16(skb_vlan_tag_get(skb));
+ txbd_tmp.ext.tpid = 0; /* < C-TAG */
+ e_flags = ENETC_TXBD_E_FLAGS_VLAN_INS;
+ }
+
+ /* Write the BD */
+ txbd_tmp.ext.e_flags = e_flags;
+ txbd_tmp.ext.lso_sg_size = cpu_to_le16(lso->lso_seg_size);
+ txbd_tmp.ext.frm_len_ext = cpu_to_le16(frm_len_ext);
+ *txbd = txbd_tmp;
+}
+
+static int enetc_lso_map_data(struct enetc_bdr *tx_ring, struct sk_buff *skb,
+ int *i, struct enetc_lso_t *lso, int *count)
+{
+ union enetc_tx_bd txbd_tmp, *txbd = NULL;
+ struct enetc_tx_swbd *tx_swbd;
+ skb_frag_t *frag;
+ dma_addr_t dma;
+ u8 flags = 0;
+ int len, f;
+
+ len = skb_headlen(skb) - lso->hdr_len;
+ if (len > 0) {
+ dma = dma_map_single(tx_ring->dev, skb->data + lso->hdr_len,
+ len, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx_ring->dev, dma)))
+ return -ENOMEM;
+
+ enetc_bdr_idx_inc(tx_ring, i);
+ txbd = ENETC_TXBD(*tx_ring, *i);
+ tx_swbd = &tx_ring->tx_swbd[*i];
+ prefetchw(txbd);
+ *count += 1;
+
+ enetc_clear_tx_bd(&txbd_tmp);
+ txbd_tmp.addr = cpu_to_le64(dma);
+ txbd_tmp.buf_len = cpu_to_le16(len);
+
+ tx_swbd->dma = dma;
+ tx_swbd->len = len;
+ tx_swbd->is_dma_page = 0;
+ tx_swbd->dir = DMA_TO_DEVICE;
+ }
+
+ frag = &skb_shinfo(skb)->frags[0];
+ for (f = 0; f < skb_shinfo(skb)->nr_frags; f++, frag++) {
+ if (txbd)
+ *txbd = txbd_tmp;
+
+ len = skb_frag_size(frag);
+ dma = skb_frag_dma_map(tx_ring->dev, frag, 0, len,
+ DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx_ring->dev, dma)))
+ return -ENOMEM;
+
+ /* Get the next BD */
+ enetc_bdr_idx_inc(tx_ring, i);
+ txbd = ENETC_TXBD(*tx_ring, *i);
+ tx_swbd = &tx_ring->tx_swbd[*i];
+ prefetchw(txbd);
+ *count += 1;
+
+ enetc_clear_tx_bd(&txbd_tmp);
+ txbd_tmp.addr = cpu_to_le64(dma);
+ txbd_tmp.buf_len = cpu_to_le16(len);
+
+ tx_swbd->dma = dma;
+ tx_swbd->len = len;
+ tx_swbd->is_dma_page = 1;
+ tx_swbd->dir = DMA_TO_DEVICE;
+ }
+
+ /* Last BD needs 'F' bit set */
+ flags |= ENETC_TXBD_FLAGS_F;
+ txbd_tmp.flags = flags;
+ *txbd = txbd_tmp;
+
+ tx_swbd->is_eof = 1;
+ tx_swbd->skb = skb;
+
+ return 0;
+}
+
+static int enetc_lso_hw_offload(struct enetc_bdr *tx_ring, struct sk_buff *skb)
+{
+ struct enetc_tx_swbd *tx_swbd;
+ struct enetc_lso_t lso = {0};
+ int err, i, count = 0;
+
+ /* Initialize the LSO handler */
+ enetc_lso_start(skb, &lso);
+ i = tx_ring->next_to_use;
+
+ enetc_lso_map_hdr(tx_ring, skb, &i, &lso);
+ /* First BD and an extend BD */
+ count += 2;
+
+ err = enetc_lso_map_data(tx_ring, skb, &i, &lso, &count);
+ if (err)
+ goto dma_err;
+
+ /* Go to the next BD */
+ enetc_bdr_idx_inc(tx_ring, &i);
+ tx_ring->next_to_use = i;
+ enetc_update_tx_ring_tail(tx_ring);
+
+ return count;
+
+dma_err:
+ do {
+ tx_swbd = &tx_ring->tx_swbd[i];
+ enetc_free_tx_frame(tx_ring, tx_swbd);
+ if (i == 0)
+ i = tx_ring->bd_count;
+ i--;
+ } while (count--);
+
+ return 0;
+}
+
static int enetc_map_tx_tso_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb)
{
struct enetc_ndev_priv *priv = netdev_priv(tx_ring->ndev);
@@ -643,14 +863,26 @@ static netdev_tx_t enetc_start_xmit(struct sk_buff *skb,
tx_ring = priv->tx_ring[skb->queue_mapping];
if (skb_is_gso(skb)) {
- if (enetc_bd_unused(tx_ring) < tso_count_descs(skb)) {
- netif_stop_subqueue(ndev, tx_ring->index);
- return NETDEV_TX_BUSY;
- }
+ /* LSO data unit lengths of up to 256KB are supported */
+ if (priv->active_offloads & ENETC_F_LSO &&
+ (skb->len - enetc_lso_get_hdr_len(skb)) <=
+ ENETC_LSO_MAX_DATA_LEN) {
+ if (enetc_bd_unused(tx_ring) < enetc_lso_count_descs(skb)) {
+ netif_stop_subqueue(ndev, tx_ring->index);
+ return NETDEV_TX_BUSY;
+ }
- enetc_lock_mdio();
- count = enetc_map_tx_tso_buffs(tx_ring, skb);
- enetc_unlock_mdio();
+ count = enetc_lso_hw_offload(tx_ring, skb);
+ } else {
+ if (enetc_bd_unused(tx_ring) < tso_count_descs(skb)) {
+ netif_stop_subqueue(ndev, tx_ring->index);
+ return NETDEV_TX_BUSY;
+ }
+
+ enetc_lock_mdio();
+ count = enetc_map_tx_tso_buffs(tx_ring, skb);
+ enetc_unlock_mdio();
+ }
} else {
if (unlikely(skb_shinfo(skb)->nr_frags > priv->max_frags))
if (unlikely(skb_linearize(skb)))
@@ -1796,6 +2028,9 @@ void enetc_get_si_caps(struct enetc_si *si)
si->num_rss = ENETC_SIRSSCAPR_GET_NUM_RSS(rss);
}
+ if (val & ENETC_SIPCAPR0_LSO)
+ si->hw_features |= ENETC_SI_F_LSO;
+
if (val & ENETC_SIPCAPR0_QBV)
si->hw_features |= ENETC_SI_F_QBV;
@@ -2100,6 +2335,13 @@ static int enetc_setup_default_rss_table(struct enetc_si *si, int num_groups)
return 0;
}
+static void enetc_set_lso_flags_mask(struct enetc_hw *hw)
+{
+ enetc_wr(hw, ENETC4_SILSOSFMR0,
+ SILSOSFMR0_VAL_SET(TCP_NL_SEG_FLAGS_DMASK, TCP_NL_SEG_FLAGS_DMASK));
+ enetc_wr(hw, ENETC4_SILSOSFMR1, 0);
+}
+
int enetc_configure_si(struct enetc_ndev_priv *priv)
{
struct enetc_si *si = priv->si;
@@ -2113,6 +2355,9 @@ int enetc_configure_si(struct enetc_ndev_priv *priv)
/* enable SI */
enetc_wr(hw, ENETC_SIMR, ENETC_SIMR_EN);
+ if (si->hw_features & ENETC_SI_F_LSO)
+ enetc_set_lso_flags_mask(hw);
+
/* TODO: RSS support for i.MX95 will be supported later, and the
* is_enetc_rev1() condition will be removed
*/
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index a78af4f624e0..0a69f72fe8ec 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -41,6 +41,19 @@ struct enetc_tx_swbd {
u8 qbv_en:1;
};
+struct enetc_lso_t {
+ bool ipv6;
+ bool tcp;
+ u8 l3_hdr_len;
+ u8 hdr_len; /* LSO header length */
+ u8 l3_start;
+ u16 lso_seg_size;
+ int total_len; /* total data length, excluding the LSO header */
+};
+
+#define ENETC_1KB_SIZE 1024
+#define ENETC_LSO_MAX_DATA_LEN (256 * ENETC_1KB_SIZE)
+
#define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE
#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */
#define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */
@@ -238,6 +251,7 @@ enum enetc_errata {
#define ENETC_SI_F_PSFP BIT(0)
#define ENETC_SI_F_QBV BIT(1)
#define ENETC_SI_F_QBU BIT(2)
+#define ENETC_SI_F_LSO BIT(3)
struct enetc_drvdata {
u32 pmac_offset; /* Only valid for PSI which supports 802.1Qbu */
@@ -353,6 +367,7 @@ enum enetc_active_offloads {
ENETC_F_QBU = BIT(11),
ENETC_F_RXCSUM = BIT(12),
ENETC_F_TXCSUM = BIT(13),
+ ENETC_F_LSO = BIT(14),
};
enum enetc_flags_bit {
diff --git a/drivers/net/ethernet/freescale/enetc/enetc4_hw.h b/drivers/net/ethernet/freescale/enetc/enetc4_hw.h
index 26b220677448..cdde8e93a73c 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc4_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc4_hw.h
@@ -12,6 +12,28 @@
#define NXP_ENETC_VENDOR_ID 0x1131
#define NXP_ENETC_PF_DEV_ID 0xe101
+/**********************Station interface registers************************/
+/* Station interface LSO segmentation flag mask register 0/1 */
+#define ENETC4_SILSOSFMR0 0x1300
+#define SILSOSFMR0_TCP_MID_SEG GENMASK(27, 16)
+#define SILSOSFMR0_TCP_1ST_SEG GENMASK(11, 0)
+#define SILSOSFMR0_VAL_SET(first, mid) ((((mid) << 16) & SILSOSFMR0_TCP_MID_SEG) | \
+ ((first) & SILSOSFMR0_TCP_1ST_SEG))
+
+#define ENETC4_SILSOSFMR1 0x1304
+#define SILSOSFMR1_TCP_LAST_SEG GENMASK(11, 0)
+#define TCP_FLAGS_FIN BIT(0)
+#define TCP_FLAGS_SYN BIT(1)
+#define TCP_FLAGS_RST BIT(2)
+#define TCP_FLAGS_PSH BIT(3)
+#define TCP_FLAGS_ACK BIT(4)
+#define TCP_FLAGS_URG BIT(5)
+#define TCP_FLAGS_ECE BIT(6)
+#define TCP_FLAGS_CWR BIT(7)
+#define TCP_FLAGS_NS BIT(8)
+/* According to tso_build_hdr(), clear all special flags for every packet except the last. */
+#define TCP_NL_SEG_FLAGS_DMASK (TCP_FLAGS_FIN | TCP_FLAGS_RST | TCP_FLAGS_PSH)
+
/***************************ENETC port registers**************************/
#define ENETC4_ECAPR0 0x0
#define ECAPR0_RFS BIT(2)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index 590b1412fadf..34a3e8f1496e 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -28,6 +28,8 @@
#define ENETC_SIPCAPR0_QBV BIT(4)
#define ENETC_SIPCAPR0_QBU BIT(3)
#define ENETC_SIPCAPR0_RFS BIT(2)
+#define ENETC_SIPCAPR0_LSO BIT(1)
+#define ENETC_SIPCAPR0_RSC BIT(0)
#define ENETC_SIPCAPR1 0x24
#define ENETC_SITGTGR 0x30
#define ENETC_SIRBGCR 0x38
@@ -554,7 +556,10 @@ static inline u64 _enetc_rd_reg64_wa(void __iomem *reg)
union enetc_tx_bd {
struct {
__le64 addr;
- __le16 buf_len;
+ union {
+ __le16 buf_len;
+ __le16 hdr_len; /* For LSO, ENETC 4.1 and later */
+ };
__le16 frm_len;
union {
struct {
@@ -574,13 +579,16 @@ union enetc_tx_bd {
__le32 tstamp;
__le16 tpid;
__le16 vid;
- u8 reserved[6];
+ __le16 lso_sg_size; /* For ENETC 4.1 and later */
+ __le16 frm_len_ext; /* For ENETC 4.1 and later */
+ u8 reserved[2];
u8 e_flags;
u8 flags;
} ext; /* Tx BD extension */
struct {
__le32 tstamp;
- u8 reserved[10];
+ u8 reserved[8];
+ __le16 lso_err_count; /* For ENETC 4.1 and later */
u8 status;
u8 flags;
} wb; /* writeback descriptor */
@@ -589,6 +597,7 @@ union enetc_tx_bd {
enum enetc_txbd_flags {
ENETC_TXBD_FLAGS_L4CS = BIT(0), /* For ENETC 4.1 and later */
ENETC_TXBD_FLAGS_TSE = BIT(1),
+ ENETC_TXBD_FLAGS_LSO = BIT(1), /* For ENETC 4.1 and later */
ENETC_TXBD_FLAGS_W = BIT(2),
ENETC_TXBD_FLAGS_CSUM_LSO = BIT(3), /* For ENETC 4.1 and later */
ENETC_TXBD_FLAGS_TXSTART = BIT(4),
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
index 2c4c6af672e7..82a67356abe4 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
@@ -126,6 +126,9 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
if (si->drvdata->tx_csum)
priv->active_offloads |= ENETC_F_TXCSUM;
+ if (si->hw_features & ENETC_SI_F_LSO)
+ priv->active_offloads |= ENETC_F_LSO;
+
/* TODO: currently, i.MX95 ENETC driver does not support advanced features */
if (!is_enetc_rev1(si)) {
ndev->hw_features &= ~(NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_LOOPBACK);
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
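For reference, the value that enetc_set_lso_flags_mask() writes to ENETC4_SILSOSFMR0 can be worked out from the definitions in the patch above. The following is a minimal userspace sketch, not driver code: BIT_U32(), GENMASK_U32(), silsosfmr0_val_set() and lso_flags_mask_reg0() are hypothetical stand-ins for the kernel's BIT(), GENMASK(), the SILSOSFMR0_VAL_SET() macro and the register write.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT()/GENMASK(), 32-bit only. */
#define BIT_U32(n)		((uint32_t)1 << (n))
#define GENMASK_U32(h, l)	((~(uint32_t)0 >> (31 - (h))) & (~(uint32_t)0 << (l)))

/* Field positions copied from the enetc4_hw.h hunk. */
#define SILSOSFMR0_TCP_MID_SEG	GENMASK_U32(27, 16)
#define SILSOSFMR0_TCP_1ST_SEG	GENMASK_U32(11, 0)

#define TCP_FLAGS_FIN		BIT_U32(0)
#define TCP_FLAGS_RST		BIT_U32(2)
#define TCP_FLAGS_PSH		BIT_U32(3)
#define TCP_NL_SEG_FLAGS_DMASK	(TCP_FLAGS_FIN | TCP_FLAGS_RST | TCP_FLAGS_PSH)

/* Function form of SILSOSFMR0_VAL_SET(first, mid). */
static inline uint32_t silsosfmr0_val_set(uint32_t first, uint32_t mid)
{
	return ((mid << 16) & SILSOSFMR0_TCP_MID_SEG) |
	       (first & SILSOSFMR0_TCP_1ST_SEG);
}

/* Value passed to enetc_wr(hw, ENETC4_SILSOSFMR0, ...) in the patch. */
static inline uint32_t lso_flags_mask_reg0(void)
{
	return silsosfmr0_val_set(TCP_NL_SEG_FLAGS_DMASK,
				  TCP_NL_SEG_FLAGS_DMASK);
}
```

With FIN, RST and PSH placed in both the first-segment and middle-segment fields, the helper evaluates to 0x000D000D; the patch writes ENETC4_SILSOSFMR1 as 0.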
* [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support
2024-12-04 5:29 [PATCH v6 RESEND net-next 0/5] Add more feautues for ENETC v4 - round 1 Wei Fang
` (3 preceding siblings ...)
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF Wei Fang
@ 2024-12-04 5:29 ` Wei Fang
2024-12-08 15:09 ` Ido Schimmel
4 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-04 5:29 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li
Cc: netdev, linux-kernel, imx
Set the NETIF_F_GSO_UDP_L4 bit in hw_features and features, because both
the i.MX95 and LS1028A ENETC drivers implement UDP segmentation.
- i.MX95 ENETC supports UDP segmentation via LSO.
- LS1028A ENETC supports UDP segmentation since commit 3d5b459ba0e3
("net: tso: add UDP segmentation support").
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
v2: rephrase the commit message
v3: no changes
v4: fix typo in commit message
v5: no changes
v6: no changes
---
drivers/net/ethernet/freescale/enetc/enetc_pf_common.c | 6 ++++--
drivers/net/ethernet/freescale/enetc/enetc_vf.c | 6 ++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
index 82a67356abe4..76fc3c6fdec1 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
@@ -110,11 +110,13 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
ndev->hw_features = NETIF_F_SG | NETIF_F_RXCSUM |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_LOOPBACK |
- NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
+ NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
+ NETIF_F_GSO_UDP_L4;
ndev->features = NETIF_F_HIGHDMA | NETIF_F_SG | NETIF_F_RXCSUM |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_RX |
- NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
+ NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
+ NETIF_F_GSO_UDP_L4;
ndev->vlan_features = NETIF_F_SG | NETIF_F_HW_CSUM |
NETIF_F_TSO | NETIF_F_TSO6;
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_vf.c b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
index 63d78b2b8670..3768752b6008 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
@@ -145,11 +145,13 @@ static void enetc_vf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
ndev->hw_features = NETIF_F_SG | NETIF_F_RXCSUM |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_RX |
- NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
+ NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
+ NETIF_F_GSO_UDP_L4;
ndev->features = NETIF_F_HIGHDMA | NETIF_F_SG | NETIF_F_RXCSUM |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_RX |
- NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
+ NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
+ NETIF_F_GSO_UDP_L4;
ndev->vlan_features = NETIF_F_SG | NETIF_F_HW_CSUM |
NETIF_F_TSO | NETIF_F_TSO6;
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC Wei Fang
@ 2024-12-06 9:23 ` Simon Horman
2024-12-06 10:33 ` Wei Fang
0 siblings, 1 reply; 23+ messages in thread
From: Simon Horman @ 2024-12-06 9:23 UTC (permalink / raw)
To: Wei Fang
Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li, netdev, linux-kernel,
imx
On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> this capability is not defined in register, the rx_csum bit is added to
> struct enetc_drvdata to indicate whether the device supports Rx checksum
> offload.
>
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> ---
> v2: no changes
> v3: no changes
> v4: no changes
> v5: no changes
> v6: no changes
> ---
> drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> 4 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> index 35634c516e26..3137b6ee62d3 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
>
> /* TODO: hashing */
> if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> - u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> -
> - skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> - skb->ip_summed = CHECKSUM_COMPLETE;
> + if (priv->active_offloads & ENETC_F_RXCSUM &&
> + le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK) {
> + skb->ip_summed = CHECKSUM_UNNECESSARY;
> + } else {
> + u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> +
> + skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> + skb->ip_summed = CHECKSUM_COMPLETE;
> + }
> }
Hi Wei,
I am wondering about the relationship between the above and
hardware support for CHECKSUM_COMPLETE.
Prior to this patch CHECKSUM_COMPLETE was always used, which seems
desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
I'm unsure why this change is necessary or desirable. It's my understanding
that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
CHECKSUM_UNNECESSARY.
...
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum offload for i.MX95 ENETC
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx " Wei Fang
@ 2024-12-06 9:37 ` Simon Horman
2024-12-06 10:46 ` Wei Fang
0 siblings, 1 reply; 23+ messages in thread
From: Simon Horman @ 2024-12-06 9:37 UTC (permalink / raw)
To: Wei Fang
Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li, netdev, linux-kernel,
imx
On Wed, Dec 04, 2024 at 01:29:29PM +0800, Wei Fang wrote:
> In addition to supporting Rx checksum offload, i.MX95 ENETC also supports
> Tx checksum offload. The transmit checksum offload is implemented through
> the Tx BD. To support Tx checksum offload, software needs to fill some
> auxiliary information in Tx BD, such as IP version, IP header offset and
> size, whether L4 is UDP or TCP, etc.
>
> Same as Rx checksum offload, Tx checksum offload capability isn't defined
> in register, so tx_csum bit is added to struct enetc_drvdata to indicate
> whether the device supports Tx checksum offload.
>
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
...
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> index 4b8fd1879005..590b1412fadf 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> @@ -558,7 +558,12 @@ union enetc_tx_bd {
> __le16 frm_len;
> union {
> struct {
> - u8 reserved[3];
> + u8 l3_start:7;
> + u8 ipcs:1;
> + u8 l3_hdr_size:7;
> + u8 l3t:1;
> + u8 resv:5;
> + u8 l4t:3;
> u8 flags;
> }; /* default layout */
Hi Wei,
Given that little-endian types are used elsewhere in this structure
I am guessing that the layout above works for little-endian hosts
but will not work on big-endian hosts.
If so, I would suggest an alternate approach of using a single 32-bit
word and accessing it using a combination of FIELD_PREP() and FIELD_GET()
using masks created using GENMASK() and BIT().
Or, less desirably IMHO, by providing an alternate layout for
the embedded struct for big endian systems.
...
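A minimal userspace sketch of the single-word approach suggested above. The mask values are assumptions: they presume the bit-fields in the quoted hunk are allocated LSB-first, as on a little-endian host, and none of the names below come from the driver. field_prep() and field_get() are simplified stand-ins for the kernel's FIELD_PREP() and FIELD_GET().

```c
#include <stdint.h>

/* Hypothetical masks, assuming LSB-first allocation of the bit-fields. */
#define TXBD_L3_START		0x0000007Fu	/* bits 6:0 */
#define TXBD_IPCS		0x00000080u	/* bit 7 */
#define TXBD_L3_HDR_SIZE	0x00007F00u	/* bits 14:8 */
#define TXBD_L3T		0x00008000u	/* bit 15 */
#define TXBD_L4T		0x00E00000u	/* bits 23:21 */
#define TXBD_FLAGS		0xFF000000u	/* bits 31:24 */

/* Index of the lowest set bit; mask must be non-zero. */
static inline unsigned int mask_shift(uint32_t mask)
{
	unsigned int shift = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Place val into the field selected by mask. */
static inline uint32_t field_prep(uint32_t mask, uint32_t val)
{
	return (val << mask_shift(mask)) & mask;
}

/* Extract the field selected by mask from word. */
static inline uint32_t field_get(uint32_t mask, uint32_t word)
{
	return (word & mask) >> mask_shift(mask);
}

/* Store a 32-bit value as little-endian bytes on any host. */
static inline void put_le32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v & 0xFF);
	dst[1] = (uint8_t)((v >> 8) & 0xFF);
	dst[2] = (uint8_t)((v >> 16) & 0xFF);
	dst[3] = (uint8_t)((v >> 24) & 0xFF);
}
```

Because the word is built with shifts and masks and then stored byte by byte, the in-memory layout no longer depends on how the compiler orders bit-fields on a given host.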
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF Wei Fang
@ 2024-12-06 9:59 ` Simon Horman
2024-12-06 10:33 ` Simon Horman
0 siblings, 1 reply; 23+ messages in thread
From: Simon Horman @ 2024-12-06 9:59 UTC (permalink / raw)
To: Wei Fang
Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li, netdev, linux-kernel,
imx
On Wed, Dec 04, 2024 at 01:29:31PM +0800, Wei Fang wrote:
> ENETC rev 4.1 supports large send offload (LSO), segmenting large TCP
> and UDP transmit units into multiple Ethernet frames. To support LSO,
> software needs to fill some auxiliary information in Tx BD, such as LSO
> header length, frame length, LSO maximum segment size, etc.
>
> At 1Gbps link rate, TCP segmentation was tested using iperf3, and the
> CPU performance before and after applying the patch was compared through
> the top command. It can be seen that LSO saves a significant amount of
> CPU cycles compared to software TSO.
>
> Before applying the patch:
> %Cpu(s): 0.1 us, 4.1 sy, 0.0 ni, 85.7 id, 0.0 wa, 0.5 hi, 9.7 si
>
> After applying the patch:
> %Cpu(s): 0.1 us, 2.3 sy, 0.0 ni, 94.5 id, 0.0 wa, 0.4 hi, 2.6 si
>
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> ---
> v2: no changes
> v3: use enetc_skb_is_ipv6() helper function which is added in patch 2
> v4: fix a typo
> v5: no changes
> v6: remove error logs from the datapath
> ---
> drivers/net/ethernet/freescale/enetc/enetc.c | 259 +++++++++++++++++-
> drivers/net/ethernet/freescale/enetc/enetc.h | 15 +
> .../net/ethernet/freescale/enetc/enetc4_hw.h | 22 ++
> .../net/ethernet/freescale/enetc/enetc_hw.h | 15 +-
> .../freescale/enetc/enetc_pf_common.c | 3 +
> 5 files changed, 304 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> index dafe7aeac26b..82a7932725f9 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -523,6 +523,226 @@ static void enetc_tso_complete_csum(struct enetc_bdr *tx_ring, struct tso_t *tso
> }
> }
>
> +static inline int enetc_lso_count_descs(const struct sk_buff *skb)
> +{
> + /* 4 BDs: 1 BD for LSO header + 1 BD for extended BD + 1 BD
> + * for linear area data but not include LSO header, namely
> + * skb_headlen(skb) - lso_hdr_len. And 1 BD for gap.
> + */
> + return skb_shinfo(skb)->nr_frags + 4;
> +}
> +
> +static int enetc_lso_get_hdr_len(const struct sk_buff *skb)
> +{
> + int hdr_len, tlen;
> +
> + tlen = skb_is_gso_tcp(skb) ? tcp_hdrlen(skb) : sizeof(struct udphdr);
> + hdr_len = skb_transport_offset(skb) + tlen;
Hi Wei,
I am wondering if packets that are neither TCP nor UDP can be processed
by the LSO code added by this patch, and if so, what the implications are.
> +
> + return hdr_len;
> +}
...
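The arithmetic in the two quoted helpers, together with the ENETC_LSO_MAX_DATA_LEN check in the transmit path, can be modelled in userspace as follows. This is only a sketch: struct lso_pkt and its fields are hypothetical stand-ins for the skb accessors (skb->len, skb_transport_offset(), tcp_hdrlen(), skb_is_gso_tcp(), nr_frags), and the example assumes the packet is either TCP or UDP.

```c
#include <stdbool.h>
#include <stddef.h>

#define ENETC_1KB_SIZE		1024
#define ENETC_LSO_MAX_DATA_LEN	(256 * ENETC_1KB_SIZE)
#define UDP_HDR_LEN		8	/* sizeof(struct udphdr) */

/* Hypothetical stand-in for the skb fields the helpers read. */
struct lso_pkt {
	size_t len;			/* skb->len */
	size_t transport_offset;	/* skb_transport_offset(skb) */
	size_t tcp_hdr_len;		/* tcp_hdrlen(skb), TCP only */
	bool is_tcp;			/* skb_is_gso_tcp(skb) */
	unsigned int nr_frags;		/* skb_shinfo(skb)->nr_frags */
};

/* Mirrors enetc_lso_get_hdr_len(): L2 + L3 + L4 header bytes. */
static size_t lso_hdr_len(const struct lso_pkt *p)
{
	size_t tlen = p->is_tcp ? p->tcp_hdr_len : UDP_HDR_LEN;

	return p->transport_offset + tlen;
}

/* Mirrors enetc_lso_count_descs(): header BD, extended BD,
 * linear data BD and one BD of gap, plus one BD per fragment.
 */
static unsigned int lso_count_descs(const struct lso_pkt *p)
{
	return p->nr_frags + 4;
}

/* True if the payload is small enough for the hardware LSO path. */
static bool lso_fits_hw(const struct lso_pkt *p)
{
	return p->len - lso_hdr_len(p) <= ENETC_LSO_MAX_DATA_LEN;
}
```

For a TCP packet with a 34-byte transport offset, a 20-byte TCP header and three fragments, the header length is 54 bytes and seven BDs must be free before the frame is queued.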
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 3/5] net: enetc: update max chained Tx BD number for i.MX95 ENETC
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 3/5] net: enetc: update max chained Tx BD number " Wei Fang
@ 2024-12-06 10:11 ` Simon Horman
0 siblings, 0 replies; 23+ messages in thread
From: Simon Horman @ 2024-12-06 10:11 UTC (permalink / raw)
To: Wei Fang
Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li, netdev, linux-kernel,
imx
On Wed, Dec 04, 2024 at 01:29:30PM +0800, Wei Fang wrote:
> The maximum number of chained Tx BDs on the latest ENETC (i.MX95 ENETC,
> rev 4.1) has been increased to 63. Since MAX_SKB_FRAGS ranges from 17 to
> 45, it is better to set ENETC4_MAX_SKB_FRAGS to MAX_SKB_FRAGS for i.MX95
> ENETC and later revisions.
>
> In addition, add max_frags to struct enetc_drvdata to indicate the
> maximum number of chained BDs supported by the device, because this
> number differs between LS1028A and i.MX95 ENETC.
>
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF
2024-12-06 9:59 ` Simon Horman
@ 2024-12-06 10:33 ` Simon Horman
0 siblings, 0 replies; 23+ messages in thread
From: Simon Horman @ 2024-12-06 10:33 UTC (permalink / raw)
To: Wei Fang
Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li, netdev, linux-kernel,
imx
On Fri, Dec 06, 2024 at 09:59:38AM +0000, Simon Horman wrote:
> On Wed, Dec 04, 2024 at 01:29:31PM +0800, Wei Fang wrote:
> > ENETC rev 4.1 supports large send offload (LSO), segmenting large TCP
> > and UDP transmit units into multiple Ethernet frames. To support LSO,
> > software needs to fill some auxiliary information in Tx BD, such as LSO
> > header length, frame length, LSO maximum segment size, etc.
> >
> > At 1Gbps link rate, TCP segmentation was tested using iperf3, and the
> > CPU performance before and after applying the patch was compared through
> > the top command. It can be seen that LSO saves a significant amount of
> > CPU cycles compared to software TSO.
> >
> > Before applying the patch:
> > %Cpu(s): 0.1 us, 4.1 sy, 0.0 ni, 85.7 id, 0.0 wa, 0.5 hi, 9.7 si
> >
> > After applying the patch:
> > %Cpu(s): 0.1 us, 2.3 sy, 0.0 ni, 94.5 id, 0.0 wa, 0.4 hi, 2.6 si
> >
> > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > ---
> > v2: no changes
> > v3: use enetc_skb_is_ipv6() helper function which is added in patch 2
> > v4: fix a typo
> > v5: no changes
> > v6: remove error logs from the datapath
> > ---
> > drivers/net/ethernet/freescale/enetc/enetc.c | 259 +++++++++++++++++-
> > drivers/net/ethernet/freescale/enetc/enetc.h | 15 +
> > .../net/ethernet/freescale/enetc/enetc4_hw.h | 22 ++
> > .../net/ethernet/freescale/enetc/enetc_hw.h | 15 +-
> > .../freescale/enetc/enetc_pf_common.c | 3 +
> > 5 files changed, 304 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> > index dafe7aeac26b..82a7932725f9 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > @@ -523,6 +523,226 @@ static void enetc_tso_complete_csum(struct enetc_bdr *tx_ring, struct tso_t *tso
> > }
> > }
> >
> > +static inline int enetc_lso_count_descs(const struct sk_buff *skb)
> > +{
> > + /* 4 BDs: 1 BD for LSO header + 1 BD for extended BD + 1 BD
> > + * for linear area data but not include LSO header, namely
> > + * skb_headlen(skb) - lso_hdr_len. And 1 BD for gap.
> > + */
> > + return skb_shinfo(skb)->nr_frags + 4;
> > +}
> > +
> > +static int enetc_lso_get_hdr_len(const struct sk_buff *skb)
> > +{
> > + int hdr_len, tlen;
> > +
> > + tlen = skb_is_gso_tcp(skb) ? tcp_hdrlen(skb) : sizeof(struct udphdr);
> > + hdr_len = skb_transport_offset(skb) + tlen;
>
> Hi Wei,
>
> I am wondering if packets that are neither TCP nor UDP can be processed
> by the LSO code added by this patch, and if so, what the implications are.
Sorry, I now realise that the answer to that is rather obvious: no
due to feature flags. I should have paid more attention to patch 5/5
before sending the above.
>
> > +
> > + return hdr_len;
> > +}
>
> ...
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-06 9:23 ` Simon Horman
@ 2024-12-06 10:33 ` Wei Fang
2024-12-06 12:30 ` Simon Horman
0 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-06 10:33 UTC (permalink / raw)
To: Simon Horman
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: December 6, 2024 17:23
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> offload for i.MX95 ENETC
>
> On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > this capability is not defined in register, the rx_csum bit is added to
> > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > offload.
> >
> > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > ---
> > v2: no changes
> > v3: no changes
> > v4: no changes
> > v5: no changes
> > v6: no changes
> > ---
> > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > 4 files changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> b/drivers/net/ethernet/freescale/enetc/enetc.c
> > index 35634c516e26..3137b6ee62d3 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr
> *rx_ring,
> >
> > /* TODO: hashing */
> > if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > - u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > -
> > - skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > - skb->ip_summed = CHECKSUM_COMPLETE;
> > + if (priv->active_offloads & ENETC_F_RXCSUM &&
> > + le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK)
> {
> > + skb->ip_summed = CHECKSUM_UNNECESSARY;
> > + } else {
> > + u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > +
> > + skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > + skb->ip_summed = CHECKSUM_COMPLETE;
> > + }
> > }
>
> Hi Wei,
>
> I am wondering about the relationship between the above and
> hardware support for CHECKSUM_COMPLETE.
>
> Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
>
> If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
>
> Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> I'm unsure why this change is necessary or desirable. It's my understanding
> that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> CHECKSUM_UNNECESSARY.
>
> ...
Rx checksum offload is a new feature of ENETC v4. We would like to exploit this
capability of the hardware to save CPU cycles in calculating and verifying checksum.
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum offload for i.MX95 ENETC
2024-12-06 9:37 ` Simon Horman
@ 2024-12-06 10:46 ` Wei Fang
2024-12-06 12:32 ` Simon Horman
0 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-06 10:46 UTC (permalink / raw)
To: Simon Horman
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: December 6, 2024 17:37
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> Subject: Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum
> offload for i.MX95 ENETC
>
> On Wed, Dec 04, 2024 at 01:29:29PM +0800, Wei Fang wrote:
> > In addition to supporting Rx checksum offload, i.MX95 ENETC also supports
> > Tx checksum offload. The transmit checksum offload is implemented through
> > the Tx BD. To support Tx checksum offload, software needs to fill some
> > auxiliary information in Tx BD, such as IP version, IP header offset and
> > size, whether L4 is UDP or TCP, etc.
> >
> > Same as Rx checksum offload, Tx checksum offload capability isn't defined
> > in register, so tx_csum bit is added to struct enetc_drvdata to indicate
> > whether the device supports Tx checksum offload.
> >
> > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
>
> ...
>
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > index 4b8fd1879005..590b1412fadf 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > @@ -558,7 +558,12 @@ union enetc_tx_bd {
> > __le16 frm_len;
> > union {
> > struct {
> > - u8 reserved[3];
> > + u8 l3_start:7;
> > + u8 ipcs:1;
> > + u8 l3_hdr_size:7;
> > + u8 l3t:1;
> > + u8 resv:5;
> > + u8 l4t:3;
> > u8 flags;
> > }; /* default layout */
>
> Hi Wei,
>
> Given that little-endian types are used elsewhere in this structure
> I am guessing that the layout above works for little-endian hosts
> but will not work on big-endian hosts.
>
> If so, I would suggest an alternate approach of using a single 32-bit
> word and accessing it using a combination of FIELD_PREP() and FIELD_GET()
> using masks created using GENMASK() and BIT().
Good suggestion, I will refine it, thanks.
>
> Or, less desirably IMHO, by providing an alternate layout for
> the embedded struct for big endian systems.
>
> ...
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-06 10:33 ` Wei Fang
@ 2024-12-06 12:30 ` Simon Horman
2024-12-06 12:45 ` Wei Fang
0 siblings, 1 reply; 23+ messages in thread
From: Simon Horman @ 2024-12-06 12:30 UTC (permalink / raw)
To: Wei Fang
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
On Fri, Dec 06, 2024 at 10:33:15AM +0000, Wei Fang wrote:
> > -----Original Message-----
> > From: Simon Horman <horms@kernel.org>
> > Sent: December 6, 2024 17:23
> > To: Wei Fang <wei.fang@nxp.com>
> > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > offload for i.MX95 ENETC
> >
> > On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > > this capability is not defined in register, the rx_csum bit is added to
> > > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > > offload.
> > >
> > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > > ---
> > > v2: no changes
> > > v3: no changes
> > > v4: no changes
> > > v5: no changes
> > > v6: no changes
> > > ---
> > > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > > 4 files changed, 17 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> > b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > index 35634c516e26..3137b6ee62d3 100644
> > > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr
> > *rx_ring,
> > >
> > > /* TODO: hashing */
> > > if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > > - u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > -
> > > - skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > - skb->ip_summed = CHECKSUM_COMPLETE;
> > > + if (priv->active_offloads & ENETC_F_RXCSUM &&
> > > + le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK)
> > {
> > > + skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > + } else {
> > > + u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > +
> > > + skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > + skb->ip_summed = CHECKSUM_COMPLETE;
> > > + }
> > > }
> >
> > Hi Wei,
> >
> > I am wondering about the relationship between the above and
> > hardware support for CHECKSUM_COMPLETE.
> >
> > Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> > desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
> >
> > If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
> >
> > Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> > I'm unsure why this change is necessary or desirable. It's my understanding
> > that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> > CHECKSUM_UNNECESSARY.
> >
> > ...
>
> Rx checksum offload is a new feature of ENETC v4. We would like to exploit this
> capability of the hardware to save CPU cycles in calculating and verifying checksum.
>
Understood, but CHECKSUM_COMPLETE is usually the preferred option as it
is more flexible, e.g. allowing low-cost calculation of inner checksums
in the presence of encapsulation.
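To illustrate the flexibility point, here is a minimal userspace sketch of the ones' complement arithmetic involved. It is not kernel code, and csum_fold16(), csum_partial16() and csum_sub16() are hypothetical names. Given a sum over the whole packet, which is what CHECKSUM_COMPLETE reports, the sum over any trailing part can be derived by subtracting the sum of the bytes in front of it, without touching the payload again.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum over buf, taken as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. Intended for small buffers.
 */
static uint16_t csum_partial16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return csum_fold16(sum);
}

/* a - b in ones' complement arithmetic, computed as a + ~b. */
static uint16_t csum_sub16(uint16_t a, uint16_t b)
{
	return csum_fold16((uint32_t)a + (uint16_t)~b);
}
```

The subtraction is only valid when the removed prefix has an even length, so that 16-bit word boundaries line up. Note also that 0x0000 and 0xFFFF both represent zero in this arithmetic.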
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum offload for i.MX95 ENETC
2024-12-06 10:46 ` Wei Fang
@ 2024-12-06 12:32 ` Simon Horman
2024-12-06 12:38 ` Wei Fang
0 siblings, 1 reply; 23+ messages in thread
From: Simon Horman @ 2024-12-06 12:32 UTC (permalink / raw)
To: Wei Fang
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
On Fri, Dec 06, 2024 at 10:46:49AM +0000, Wei Fang wrote:
> > -----Original Message-----
> > From: Simon Horman <horms@kernel.org>
> > Sent: December 6, 2024 17:37
> > To: Wei Fang <wei.fang@nxp.com>
> > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > Subject: Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum
> > offload for i.MX95 ENETC
> >
> > On Wed, Dec 04, 2024 at 01:29:29PM +0800, Wei Fang wrote:
> > > In addition to supporting Rx checksum offload, i.MX95 ENETC also supports
> > > Tx checksum offload. The transmit checksum offload is implemented through
> > > the Tx BD. To support Tx checksum offload, software needs to fill some
> > > auxiliary information in Tx BD, such as IP version, IP header offset and
> > > size, whether L4 is UDP or TCP, etc.
> > >
> > > Same as Rx checksum offload, Tx checksum offload capability isn't defined
> > > in register, so tx_csum bit is added to struct enetc_drvdata to indicate
> > > whether the device supports Tx checksum offload.
> > >
> > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> >
> > ...
> >
> > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > index 4b8fd1879005..590b1412fadf 100644
> > > --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > @@ -558,7 +558,12 @@ union enetc_tx_bd {
> > > __le16 frm_len;
> > > union {
> > > struct {
> > > - u8 reserved[3];
> > > + u8 l3_start:7;
> > > + u8 ipcs:1;
> > > + u8 l3_hdr_size:7;
> > > + u8 l3t:1;
> > > + u8 resv:5;
> > > + u8 l4t:3;
> > > u8 flags;
> > > }; /* default layout */
> >
> > Hi Wei,
> >
> > Given that little-endian types are used elsewhere in this structure
> > I am guessing that the layout above works for little-endian hosts
> > but will not work on big-endian hosts.
> >
> > If so, I would suggest an alternate approach of using a single 32-bit
> > word and accessing it using a combination of FIELD_PREP() and FIELD_GET()
> > using masks created using GENMASK() and BIT().
>
> Good suggestion, I will refine it, thanks.
Thanks. I forgot to mention that you will likely also need to add
cpu_to_le32 and le32_to_cpu to the mix.
> > Or, less desirably IMHO, by providing an alternate layout for
> > the embedded struct for big endian systems.
> >
> > ...
>
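For readers following along, the single-word approach suggested above can be sketched in userspace C. This is only a sketch under stated assumptions: the mask names (TXBD_*) and the helpers field_prep(), field_get(), put_le32() and get_le32() are hypothetical stand-ins for the kernel's FIELD_PREP(), FIELD_GET(), cpu_to_le32() and le32_to_cpu(), and the bit positions assume the bitfields in the quoted hunk are allocated least significant bit first, as on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical masks: the bitfields from the quoted hunk, viewed as one
 * little-endian 32-bit word. These names are not taken from the driver.
 */
#define TXBD_L3_START    0x0000007fu /* bits  6:0  */
#define TXBD_IPCS        0x00000080u /* bit   7    */
#define TXBD_L3_HDR_SIZE 0x00007f00u /* bits 14:8  */
#define TXBD_L3T         0x00008000u /* bit  15    */
#define TXBD_L4T         0x00e00000u /* bits 23:21 */
#define TXBD_FLAGS       0xff000000u /* bits 31:24 */

/* Lowest set bit of a mask; multiplying or dividing by it shifts a field. */
static uint32_t mask_lsb(uint32_t mask)
{
	return mask & (~mask + 1u);
}

/* Stand-in for FIELD_PREP(): place val into the field selected by mask. */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	assert(val <= mask / mask_lsb(mask)); /* value must fit the field */
	return val * mask_lsb(mask);
}

/* Stand-in for FIELD_GET(): extract the field selected by mask. */
static uint32_t field_get(uint32_t mask, uint32_t word)
{
	return (word & mask) / mask_lsb(mask);
}

/* Stand-in for cpu_to_le32(): store v least significant byte first. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)v;
	out[1] = (uint8_t)(v >> 8);
	out[2] = (uint8_t)(v >> 16);
	out[3] = (uint8_t)(v >> 24);
}

/* Stand-in for le32_to_cpu(): rebuild the value from its byte layout. */
static uint32_t get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | (uint32_t)in[1] << 8 |
	       (uint32_t)in[2] << 16 | (uint32_t)in[3] << 24;
}

/* Build the four descriptor bytes; the result is the same on any host. */
void txbd_pack(uint8_t out[4], uint32_t l3_start, uint32_t ipcs,
	       uint32_t l3_hdr_size, uint32_t l3t, uint32_t l4t,
	       uint32_t flags)
{
	put_le32(out, field_prep(TXBD_L3_START, l3_start) |
		      field_prep(TXBD_IPCS, ipcs) |
		      field_prep(TXBD_L3_HDR_SIZE, l3_hdr_size) |
		      field_prep(TXBD_L3T, l3t) |
		      field_prep(TXBD_L4T, l4t) |
		      field_prep(TXBD_FLAGS, flags));
}

/* Read one field back from the descriptor bytes. */
uint32_t txbd_get(const uint8_t in[4], uint32_t mask)
{
	return field_get(mask, get_le32(in));
}
```

Because the byte order is fixed by put_le32() rather than by the compiler's bitfield allocation, the in-memory layout does not depend on host endianness, which is the point of the suggestion.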
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum offload for i.MX95 ENETC
2024-12-06 12:32 ` Simon Horman
@ 2024-12-06 12:38 ` Wei Fang
2024-12-06 13:31 ` Simon Horman
0 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-06 12:38 UTC (permalink / raw)
To: Simon Horman
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: December 6, 2024 20:32
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> Subject: Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum
> offload for i.MX95 ENETC
>
> On Fri, Dec 06, 2024 at 10:46:49AM +0000, Wei Fang wrote:
> > > -----Original Message-----
> > > From: Simon Horman <horms@kernel.org>
> > > Sent: December 6, 2024 17:37
> > > To: Wei Fang <wei.fang@nxp.com>
> > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > imx@lists.linux.dev
> > > Subject: Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx
> > > checksum offload for i.MX95 ENETC
> > >
> > > On Wed, Dec 04, 2024 at 01:29:29PM +0800, Wei Fang wrote:
> > > > In addition to supporting Rx checksum offload, i.MX95 ENETC also
> > > > supports Tx checksum offload. The transmit checksum offload is
> > > > implemented through the Tx BD. To support Tx checksum offload,
> > > > software needs to fill some auxiliary information in Tx BD, such
> > > > as IP version, IP header offset and size, whether L4 is UDP or TCP, etc.
> > > >
> > > > Same as Rx checksum offload, Tx checksum offload capability isn't
> > > > defined in register, so tx_csum bit is added to struct
> > > > enetc_drvdata to indicate whether the device supports Tx checksum
> > > > offload.
> > > >
> > > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > >
> > > ...
> > >
> > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > index 4b8fd1879005..590b1412fadf 100644
> > > > --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > @@ -558,7 +558,12 @@ union enetc_tx_bd {
> > > > __le16 frm_len;
> > > > union {
> > > > struct {
> > > > - u8 reserved[3];
> > > > + u8 l3_start:7;
> > > > + u8 ipcs:1;
> > > > + u8 l3_hdr_size:7;
> > > > + u8 l3t:1;
> > > > + u8 resv:5;
> > > > + u8 l4t:3;
> > > > u8 flags;
> > > > }; /* default layout */
> > >
> > > Hi Wei,
> > >
> > > Given that little-endian types are used elsewhere in this structure
> > > I am guessing that the layout above works for little-endian hosts
> > > but will not work on big-endian hosts.
> > >
> > > If so, I would suggest an alternate approach of using a single
> > > 32-bit word and accessing it using a combination of FIELD_PREP() and
> > > FIELD_GET() using masks created using GENMASK() and BIT().
> >
> > Good suggestion, I will refine it, thanks.
>
> Thanks. I forgot to mention that you will likely also need to add
> cpu_to_le32 and le32_to_cpu to the mix.
>
I think I will use u8 fields instead of a single 32-bit word, because I don't
want to affect the existing 'u8 flags' member. And u8 is good enough.
> > > Or, less desirably IMHO, by providing an alternate layout for the
> > > embedded struct for big endian systems.
> > >
> > > ...
> >
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-06 12:30 ` Simon Horman
@ 2024-12-06 12:45 ` Wei Fang
2024-12-08 15:47 ` Ido Schimmel
0 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-06 12:45 UTC (permalink / raw)
To: Simon Horman
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: December 6, 2024 20:31
> To: Wei Fang <wei.fang@nxp.com>
> Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> offload for i.MX95 ENETC
>
> On Fri, Dec 06, 2024 at 10:33:15AM +0000, Wei Fang wrote:
> > > -----Original Message-----
> > > From: Simon Horman <horms@kernel.org>
> > > Sent: December 6, 2024 17:23
> > > To: Wei Fang <wei.fang@nxp.com>
> > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > offload for i.MX95 ENETC
> > >
> > > On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > > > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > > > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > > > this capability is not defined in register, the rx_csum bit is added to
> > > > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > > > offload.
> > > >
> > > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > > > ---
> > > > v2: no changes
> > > > v3: no changes
> > > > v4: no changes
> > > > v5: no changes
> > > > v6: no changes
> > > > ---
> > > > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > > > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > > > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > > > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > > > 4 files changed, 17 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > index 35634c516e26..3137b6ee62d3 100644
> > > > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
> > > >
> > > > /* TODO: hashing */
> > > > if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > > > - u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > -
> > > > - skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > - skb->ip_summed = CHECKSUM_COMPLETE;
> > > > + if (priv->active_offloads & ENETC_F_RXCSUM &&
> > > > + le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK) {
> > > > + skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > > + } else {
> > > > + u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > +
> > > > + skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > + skb->ip_summed = CHECKSUM_COMPLETE;
> > > > + }
> > > > }
> > >
> > > Hi Wei,
> > >
> > > I am wondering about the relationship between the above and
> > > hardware support for CHECKSUM_COMPLETE.
> > >
> > > Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> > > desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
> > >
> > > If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
> > >
> > > Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> > > I'm unsure why this change is necessary or desirable. It's my understanding
> > > that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> > > CHECKSUM_UNNECESSARY.
> > >
> > > ...
> >
> > Rx checksum offload is a new feature of ENETC v4. We would like to exploit this
> > capability of the hardware to save CPU cycles in calculating and verifying checksum.
> >
>
> Understood, but CHECKSUM_UNNECESSARY is usually the preferred option as it
> is more flexible, e.g. allowing low-cost calculation of inner checksums
> in the presence of encapsulation.
I think you mean 'CHECKSUM_COMPLETE' is the preferred option. But there is no
strong reason against using CHECKSUM_UNNECESSARY. So I hope to keep this patch.
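For reference, the decision made by the hunk quoted above can be modeled in a few lines of userspace C. This is a sketch only: the enum, the function name rx_csum_classify() and the RXBD_FLAG_L4_CSUM_OK value are assumptions for illustration. The bit position is derived from the "bit 108 of the Rx BD" wording in the commit message, assuming the 16-bit flags word starts at bit 96 of the descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the skb->ip_summed states discussed here. */
enum rx_csum_state {
	RX_CSUM_NONE,        /* stack verifies the checksum itself */
	RX_CSUM_UNNECESSARY, /* hardware already verified the L4 checksum */
	RX_CSUM_COMPLETE,    /* hardware supplied a raw checksum of the data */
};

/* Assumption: flags word begins at bit 96, so Rx BD bit 108 is flags bit 12. */
#define RXBD_FLAGS_BIT_OFFSET 96
#define RXBD_FLAG_L4_CSUM_OK  (1u << 12)

static_assert(RXBD_FLAG_L4_CSUM_OK == 1u << (108 - RXBD_FLAGS_BIT_OFFSET),
	      "L4_CSUM_OK must correspond to Rx BD bit 108");

/*
 * Mirror of the quoted logic: prefer "unnecessary" when the device offers
 * L4 verification and reports success, otherwise fall back to "complete".
 */
enum rx_csum_state rx_csum_classify(bool rxcsum_feature, bool hw_l4_offload,
				    uint16_t rxbd_flags)
{
	if (!rxcsum_feature)
		return RX_CSUM_NONE;

	if (hw_l4_offload && (rxbd_flags & RXBD_FLAG_L4_CSUM_OK))
		return RX_CSUM_UNNECESSARY;

	return RX_CSUM_COMPLETE;
}
```

Note that in this model a frame whose L4 checksum was not reported as good still takes the CHECKSUM_COMPLETE path, so the stack verifies it from the raw sum instead of trusting the hardware verdict.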
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum offload for i.MX95 ENETC
2024-12-06 12:38 ` Wei Fang
@ 2024-12-06 13:31 ` Simon Horman
0 siblings, 0 replies; 23+ messages in thread
From: Simon Horman @ 2024-12-06 13:31 UTC (permalink / raw)
To: Wei Fang
Cc: Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
On Fri, Dec 06, 2024 at 12:38:49PM +0000, Wei Fang wrote:
> > -----Original Message-----
> > From: Simon Horman <horms@kernel.org>
> > Sent: December 6, 2024 20:32
> > To: Wei Fang <wei.fang@nxp.com>
> > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > Subject: Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx checksum
> > offload for i.MX95 ENETC
> >
> > On Fri, Dec 06, 2024 at 10:46:49AM +0000, Wei Fang wrote:
> > > > -----Original Message-----
> > > > From: Simon Horman <horms@kernel.org>
> > > > Sent: December 6, 2024 17:37
> > > > To: Wei Fang <wei.fang@nxp.com>
> > > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > imx@lists.linux.dev
> > > > Subject: Re: [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx
> > > > checksum offload for i.MX95 ENETC
> > > >
> > > > On Wed, Dec 04, 2024 at 01:29:29PM +0800, Wei Fang wrote:
> > > > > In addition to supporting Rx checksum offload, i.MX95 ENETC also
> > > > > supports Tx checksum offload. The transmit checksum offload is
> > > > > implemented through the Tx BD. To support Tx checksum offload,
> > > > > software needs to fill some auxiliary information in Tx BD, such
> > > > > as IP version, IP header offset and size, whether L4 is UDP or TCP, etc.
> > > > >
> > > > > Same as Rx checksum offload, Tx checksum offload capability isn't
> > > > > defined in register, so tx_csum bit is added to struct
> > > > > > enetc_drvdata to indicate whether the device supports Tx checksum
> > > > > > offload.
> > > > >
> > > > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > > >
> > > > ...
> > > >
> > > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > > index 4b8fd1879005..590b1412fadf 100644
> > > > > --- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
> > > > > @@ -558,7 +558,12 @@ union enetc_tx_bd {
> > > > > __le16 frm_len;
> > > > > union {
> > > > > struct {
> > > > > - u8 reserved[3];
> > > > > + u8 l3_start:7;
> > > > > + u8 ipcs:1;
> > > > > + u8 l3_hdr_size:7;
> > > > > + u8 l3t:1;
> > > > > + u8 resv:5;
> > > > > + u8 l4t:3;
> > > > > u8 flags;
> > > > > }; /* default layout */
> > > >
> > > > Hi Wei,
> > > >
> > > > Given that little-endian types are used elsewhere in this structure
> > > > I am guessing that the layout above works for little-endian hosts
> > > > but will not work on big-endian hosts.
> > > >
> > > > If so, I would suggest an alternate approach of using a single
> > > > 32-bit word and accessing it using a combination of FIELD_PREP() and
> > > > FIELD_GET() using masks created using GENMASK() and BIT().
> > >
> > > Good suggestion, I will refine it, thanks.
> >
> > Thanks. I forgot to mention that you will likely also need to add
> > cpu_to_le32 and le32_to_cpu to the mix.
> >
>
> I think I will use u8 instead of 32-bit, because I don't want to affect
> the existing 'u8 flag'. And u8 is good enough.
Sure, I agree that looks like it should work.
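A per-byte variant of the same idea might look like the sketch below. It is only an illustration under stated assumptions: struct txbd_aux, its member names and the AUX_* masks are hypothetical, and the bit positions again assume the quoted bitfields are allocated least significant bit first. Since every field fits inside a single byte, no byte-order conversion is needed and the existing flags byte is left alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the four bytes inside union enetc_tx_bd. */
struct txbd_aux {
	uint8_t l3_aux0; /* l3_start[6:0], ipcs[7] */
	uint8_t l3_aux1; /* l3_hdr_size[6:0], l3t[7] */
	uint8_t l4_aux;  /* reserved[4:0], l4t[7:5] */
	uint8_t flags;   /* existing member, not modified here */
};

/* Hypothetical per-byte masks, playing the role of GENMASK() and BIT(). */
#define AUX_L3_START    0x7fu
#define AUX_IPCS        0x80u
#define AUX_L3_HDR_SIZE 0x7fu
#define AUX_L3T         0x80u
#define AUX_L4T_SHIFT   5
#define AUX_L4T         (0x7u << AUX_L4T_SHIFT)

/* Fill the three auxiliary bytes without touching bd->flags. */
void txbd_aux_fill(struct txbd_aux *bd, unsigned int l3_start, int ipcs,
		   unsigned int l3_hdr_size, int l3t, unsigned int l4t)
{
	/* Each value must fit its field, otherwise it would spill over. */
	assert(l3_start <= AUX_L3_START);
	assert(l3_hdr_size <= AUX_L3_HDR_SIZE);
	assert(l4t <= (AUX_L4T >> AUX_L4T_SHIFT));

	bd->l3_aux0 = (uint8_t)(l3_start | (ipcs ? AUX_IPCS : 0u));
	bd->l3_aux1 = (uint8_t)(l3_hdr_size | (l3t ? AUX_L3T : 0u));
	bd->l4_aux = (uint8_t)(l4t << AUX_L4T_SHIFT);
}
```

The trade-off compared with one 32-bit word is three separate stores instead of one, in exchange for not needing cpu_to_le32() and not disturbing the neighbouring flags byte.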
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support Wei Fang
@ 2024-12-08 15:09 ` Ido Schimmel
2024-12-08 15:54 ` Eric Dumazet
0 siblings, 1 reply; 23+ messages in thread
From: Ido Schimmel @ 2024-12-08 15:09 UTC (permalink / raw)
To: Wei Fang
Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, frank.li, netdev, linux-kernel,
imx
On Wed, Dec 04, 2024 at 01:29:32PM +0800, Wei Fang wrote:
> Set NETIF_F_GSO_UDP_L4 bit of hw_features and features because i.MX95
> enetc and LS1028A driver implements UDP segmentation.
>
> - i.MX95 ENETC supports UDP segmentation via LSO.
> - LS1028A ENETC supports UDP segmentation since the commit 3d5b459ba0e3
> ("net: tso: add UDP segmentation support").
>
> Signed-off-by: Wei Fang <wei.fang@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
> ---
> v2: rephrase the commit message
> v3: no changes
> v4: fix typo in commit message
> v5: no changes
> v6: no changes
> ---
> drivers/net/ethernet/freescale/enetc/enetc_pf_common.c | 6 ++++--
> drivers/net/ethernet/freescale/enetc/enetc_vf.c | 6 ++++--
> 2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
> index 82a67356abe4..76fc3c6fdec1 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
> @@ -110,11 +110,13 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
> ndev->hw_features = NETIF_F_SG | NETIF_F_RXCSUM |
> NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
> NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_LOOPBACK |
> - NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
> + NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
> + NETIF_F_GSO_UDP_L4;
> ndev->features = NETIF_F_HIGHDMA | NETIF_F_SG | NETIF_F_RXCSUM |
> NETIF_F_HW_VLAN_CTAG_TX |
> NETIF_F_HW_VLAN_CTAG_RX |
> - NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
> + NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
> + NETIF_F_GSO_UDP_L4;
> ndev->vlan_features = NETIF_F_SG | NETIF_F_HW_CSUM |
> NETIF_F_TSO | NETIF_F_TSO6;
I didn't see any wording about it in the commit message / cover letter
so I will ask: Any reason not to enable UDP segmentation offload on
upper VLAN devices by setting the feature in 'ndev->vlan_features'?
>
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_vf.c b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
> index 63d78b2b8670..3768752b6008 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
> @@ -145,11 +145,13 @@ static void enetc_vf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
> ndev->hw_features = NETIF_F_SG | NETIF_F_RXCSUM |
> NETIF_F_HW_VLAN_CTAG_TX |
> NETIF_F_HW_VLAN_CTAG_RX |
> - NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
> + NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
> + NETIF_F_GSO_UDP_L4;
> ndev->features = NETIF_F_HIGHDMA | NETIF_F_SG | NETIF_F_RXCSUM |
> NETIF_F_HW_VLAN_CTAG_TX |
> NETIF_F_HW_VLAN_CTAG_RX |
> - NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
> + NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
> + NETIF_F_GSO_UDP_L4;
> ndev->vlan_features = NETIF_F_SG | NETIF_F_HW_CSUM |
> NETIF_F_TSO | NETIF_F_TSO6;
>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-06 12:45 ` Wei Fang
@ 2024-12-08 15:47 ` Ido Schimmel
2024-12-10 7:49 ` Wei Fang
0 siblings, 1 reply; 23+ messages in thread
From: Ido Schimmel @ 2024-12-08 15:47 UTC (permalink / raw)
To: Wei Fang, tom
Cc: Simon Horman, Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
On Fri, Dec 06, 2024 at 12:45:02PM +0000, Wei Fang wrote:
> > -----Original Message-----
> > From: Simon Horman <horms@kernel.org>
> > Sent: December 6, 2024 20:31
> > To: Wei Fang <wei.fang@nxp.com>
> > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > offload for i.MX95 ENETC
> >
> > On Fri, Dec 06, 2024 at 10:33:15AM +0000, Wei Fang wrote:
> > > > -----Original Message-----
> > > > From: Simon Horman <horms@kernel.org>
> > > > Sent: December 6, 2024 17:23
> > > > To: Wei Fang <wei.fang@nxp.com>
> > > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > > offload for i.MX95 ENETC
> > > >
> > > > On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > > > > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > > > > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > > > > this capability is not defined in register, the rx_csum bit is added to
> > > > > > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > > > > > offload.
> > > > >
> > > > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > > > > ---
> > > > > v2: no changes
> > > > > v3: no changes
> > > > > v4: no changes
> > > > > v5: no changes
> > > > > v6: no changes
> > > > > ---
> > > > > > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > > > > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > > > > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > > > > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > > > > 4 files changed, 17 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > index 35634c516e26..3137b6ee62d3 100644
> > > > > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct
> > enetc_bdr
> > > > *rx_ring,
> > > > >
> > > > > /* TODO: hashing */
> > > > > if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > > > > - u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > -
> > > > > - skb->csum = csum_unfold((__force
> > __sum16)~htons(inet_csum));
> > > > > - skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > + if (priv->active_offloads & ENETC_F_RXCSUM &&
> > > > > + le16_to_cpu(rxbd->r.flags) &
> > ENETC_RXBD_FLAG_L4_CSUM_OK)
> > > > {
> > > > > + skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > > > + } else {
> > > > > + u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > +
> > > > > + skb->csum = csum_unfold((__force
> > __sum16)~htons(inet_csum));
> > > > > + skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > + }
> > > > > }
> > > >
> > > > Hi Wei,
> > > >
> > > > I am wondering about the relationship between the above and
> > > > hardware support for CHECKSUM_COMPLETE.
> > > >
> > > > Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> > > > > desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
> > > > >
> > > > > If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
> > > >
> > > > Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> > > > I'm unsure why this change is necessary or desirable. It's my understanding
> > > > that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> > > > CHECKSUM_UNNECESSARY.
> > > >
> > > > ...
> > >
> > > > Rx checksum offload is a new feature of ENETC v4. We would like to exploit this
> > > > capability of the hardware to save CPU cycles in calculating and verifying checksum.
> > >
> >
> > Understood, but CHECKSUM_UNNECESSARY is usually the preferred option as it
> > is more flexible, e.g. allowing low-cost calculation of inner checksums
> > in the presence of encapsulation.
>
> I think you mean 'CHECKSUM_COMPLETE' is the preferred option. But there is no
> strong reason against using CHECKSUM_UNNECESSARY. So I hope to keep this patch.
I was also under the impression that CHECKSUM_COMPLETE is more desirable
than CHECKSUM_UNNECESSARY. Maybe Tom can help.
Tom:
If a device can report both CHECKSUM_UNNECESSARY and CHECKSUM_COMPLETE,
is there any advantage in reporting CHECKSUM_UNNECESSARY? The only
advantage I can think of is that when the kernel pulls headers (IPv6 for
example) it wouldn't need to compute their checksum in order to adjust
skb->csum, but I am not sure how critical that is.
I am asking because I am interested in knowing what is the
recommendation for future devices: Implement both or only
CHECKSUM_COMPLETE?
Original patch is here [1] and I did read your paper [2] and David's
presentation [3].
Thanks
[1] https://lore.kernel.org/netdev/20241204052932.112446-1-wei.fang@nxp.com/T/#mf89bb4c6c72e8dd4a697551cbc9485217366d013
[2] https://people.netfilter.org/pablo/netdev0.1/papers/UDP-Encapsulation-in-Linux.pdf
[3] https://www.netdevconf.info/1.1/proceedings/slides/miller-hardware-checksumming.pdf
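To make the header-pull cost mentioned above concrete, here is a small userspace model of the arithmetic. It is a sketch, not kernel code: ones_sum16(), ones_add16(), ones_sub16() and pull_header() are hypothetical names, and the sums are kept as plain (non-inverted) 16-bit one's-complement values for readability. With a CHECKSUM_COMPLETE style sum over the whole packet, removing a header means summing that header and subtracting it, which a CHECKSUM_UNNECESSARY verdict would not require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement addition of two 16-bit sums with end-around carry. */
uint16_t ones_add16(uint16_t a, uint16_t b)
{
	uint32_t s = (uint32_t)a + b;

	return (uint16_t)((s & 0xffffu) + (s >> 16));
}

/* One's-complement subtraction: add the bitwise complement of b. */
uint16_t ones_sub16(uint16_t a, uint16_t b)
{
	return ones_add16(a, (uint16_t)~b);
}

/* Sum a buffer as big-endian 16-bit words, folding carries as we go. */
uint16_t ones_sum16(const uint8_t *p, size_t len)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = ones_add16(sum, (uint16_t)((p[i] << 8) | p[i + 1]));
	if (len & 1) /* odd trailing byte is padded with zero */
		sum = ones_add16(sum, (uint16_t)(p[len - 1] << 8));

	return sum;
}

/*
 * Model of pulling a header from a packet whose full sum is known:
 * the remaining payload's sum is the full sum minus the header's sum.
 */
uint16_t pull_header(uint16_t full_sum, const uint8_t *hdr, size_t hdr_len)
{
	assert((hdr_len & 1) == 0); /* keep 16-bit word alignment in this model */
	return ones_sub16(full_sum, ones_sum16(hdr, hdr_len));
}
```

For example, for the bytes 45 00 00 1c 12 34 ab cd with a 4-byte header, the full sum is 0x031e, the header sum is 0x451c, and pull_header() yields 0xbe01, which equals the sum of the remaining four bytes.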
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support
2024-12-08 15:09 ` Ido Schimmel
@ 2024-12-08 15:54 ` Eric Dumazet
0 siblings, 0 replies; 23+ messages in thread
From: Eric Dumazet @ 2024-12-08 15:54 UTC (permalink / raw)
To: Ido Schimmel
Cc: Wei Fang, claudiu.manoil, vladimir.oltean, xiaoning.wang,
andrew+netdev, davem, kuba, pabeni, frank.li, netdev,
linux-kernel, imx
On Sun, Dec 8, 2024 at 4:09 PM Ido Schimmel <idosch@idosch.org> wrote:
>
> On Wed, Dec 04, 2024 at 01:29:32PM +0800, Wei Fang wrote:
> > Set NETIF_F_GSO_UDP_L4 bit of hw_features and features because i.MX95
> > enetc and LS1028A driver implements UDP segmentation.
> >
> > - i.MX95 ENETC supports UDP segmentation via LSO.
> > - LS1028A ENETC supports UDP segmentation since the commit 3d5b459ba0e3
> > ("net: tso: add UDP segmentation support").
> >
> > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > ---
> > v2: rephrase the commit message
> > v3: no changes
> > v4: fix typo in commit message
> > v5: no changes
> > v6: no changes
> > ---
> > drivers/net/ethernet/freescale/enetc/enetc_pf_common.c | 6 ++++--
> > drivers/net/ethernet/freescale/enetc/enetc_vf.c | 6 ++++--
> > 2 files changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
> > index 82a67356abe4..76fc3c6fdec1 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf_common.c
> > @@ -110,11 +110,13 @@ void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,
> > ndev->hw_features = NETIF_F_SG | NETIF_F_RXCSUM |
> > NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
> > NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_LOOPBACK |
> > - NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
> > + NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
> > + NETIF_F_GSO_UDP_L4;
> > ndev->features = NETIF_F_HIGHDMA | NETIF_F_SG | NETIF_F_RXCSUM |
> > NETIF_F_HW_VLAN_CTAG_TX |
> > NETIF_F_HW_VLAN_CTAG_RX |
> > - NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6;
> > + NETIF_F_HW_CSUM | NETIF_F_TSO | NETIF_F_TSO6 |
> > + NETIF_F_GSO_UDP_L4;
> > ndev->vlan_features = NETIF_F_SG | NETIF_F_HW_CSUM |
> > NETIF_F_TSO | NETIF_F_TSO6;
>
> I didn't see any wording about it in the commit message / cover letter
> so I will ask: Any reason not to enable UDP segmentation offload on
> upper VLAN devices by setting the feature in 'ndev->vlan_features'?
Going back to my commit, it stated that net/core/tso.c was only
dealing with basic stuff.
Adding vlan support would need some changes there, I guess this should
be done if there is enough interest and testing.
commit 3d5b459ba0e3788ab471e8cb98eee89964a9c5e8
("net: tso: add UDP segmentation support")
Note that like TCP, we do not support additional encapsulations,
and that checksums must be offloaded to the NIC.
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-08 15:47 ` Ido Schimmel
@ 2024-12-10 7:49 ` Wei Fang
2024-12-10 20:07 ` Simon Horman
0 siblings, 1 reply; 23+ messages in thread
From: Wei Fang @ 2024-12-10 7:49 UTC (permalink / raw)
To: Ido Schimmel, tom@herbertland.com
Cc: Simon Horman, Claudiu Manoil, Vladimir Oltean, Clark Wang,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Frank Li,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
imx@lists.linux.dev
> -----Original Message-----
> From: Ido Schimmel <idosch@idosch.org>
> Sent: December 8, 2024 23:47
> To: Wei Fang <wei.fang@nxp.com>; tom@herbertland.com
> Cc: Simon Horman <horms@kernel.org>; Claudiu Manoil
> <claudiu.manoil@nxp.com>; Vladimir Oltean <vladimir.oltean@nxp.com>; Clark
> Wang <xiaoning.wang@nxp.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; Frank Li <frank.li@nxp.com>; netdev@vger.kernel.org;
> linux-kernel@vger.kernel.org; imx@lists.linux.dev
> Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> offload for i.MX95 ENETC
>
> On Fri, Dec 06, 2024 at 12:45:02PM +0000, Wei Fang wrote:
> > > -----Original Message-----
> > > From: Simon Horman <horms@kernel.org>
> > > Sent: December 6, 2024 20:31
> > > To: Wei Fang <wei.fang@nxp.com>
> > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > offload for i.MX95 ENETC
> > >
> > > On Fri, Dec 06, 2024 at 10:33:15AM +0000, Wei Fang wrote:
> > > > > -----Original Message-----
> > > > > From: Simon Horman <horms@kernel.org>
> > > > > Sent: December 6, 2024 17:23
> > > > > To: Wei Fang <wei.fang@nxp.com>
> > > > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > > > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > > > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > > > > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > > > > offload for i.MX95 ENETC
> > > > >
> > > > > On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > > > > > > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > > > > > > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > > > > > > this capability is not defined in register, the rx_csum bit is added to
> > > > > > > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > > > > > > offload.
> > > > > >
> > > > > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > > > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > > > > > ---
> > > > > > v2: no changes
> > > > > > v3: no changes
> > > > > > v4: no changes
> > > > > > v5: no changes
> > > > > > v6: no changes
> > > > > > ---
> > > > > > > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > > > > > > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > > > > > > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > > > > > > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > > > > > > 4 files changed, 17 insertions(+), 4 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > > index 35634c516e26..3137b6ee62d3 100644
> > > > > > > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
> > > > > > >
> > > > > > >  	/* TODO: hashing */
> > > > > > >  	if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > > > > > > -		u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > > > -
> > > > > > > -		skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > > > > -		skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > > > +		if (priv->active_offloads & ENETC_F_RXCSUM &&
> > > > > > > +		    le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK) {
> > > > > > > +			skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > > > > > +		} else {
> > > > > > > +			u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > > > +
> > > > > > > +			skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > > > > +			skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > > > +		}
> > > > > > >  	}
> > > > >
> > > > > Hi Wei,
> > > > >
> > > > > I am wondering about the relationship between the above and
> > > > > hardware support for CHECKSUM_COMPLETE.
> > > > >
> > > > > > Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> > > > > > desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
> > > > > >
> > > > > > If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
> > > > > >
> > > > > > Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> > > > > > I'm unsure why this change is necessary or desirable. It's my understanding
> > > > > > that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> > > > > > CHECKSUM_UNNECESSARY.
> > > > >
> > > > > ...
> > > >
> > > > > Rx checksum offload is a new feature of ENETC v4. We would like to exploit this
> > > > > capability of the hardware to save CPU cycles in calculating and verifying checksum.
> > > > >
> > > >
> > > > Understood, but CHECKSUM_UNNECESSARY is usually the preferred option as it
> > > > is more flexible, e.g. allowing low-cost calculation of inner checksums
> > > > in the presence of encapsulation.
> > >
> > > I think you mean 'CHECKSUM_COMPLETE' is the preferred option. But there is no
> > > strong reason against using CHECKSUM_UNNECESSARY. So I hope to keep this patch.
>
> I was also under the impression that CHECKSUM_COMPLETE is more desirable
> than CHECKSUM_UNNECESSARY. Maybe Tom can help.

According to the kernel doc [1], the enetc driver has to keep using
CHECKSUM_COMPLETE: ENETC v4 only supports UDP/TCP checksum offload, but the
device is still able to produce skb->csum. So I will drop this patch from the
patch set. Thanks.

[1] https://docs.kernel.org/networking/skbuff.html#:~:text=Even%20if%20device%20supports%20only%20some%20protocols%2C%20but%20is%20able%20to%20produce%20skb%2D%3Ecsum%2C%20it%20MUST%20use%20CHECKSUM_COMPLETE%2C%20not%20CHECKSUM_UNNECESSARY.
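
For illustration only, the rule quoted in [1] can be sketched in a few lines of userspace C. This is not kernel code and not the enetc implementation: the helper names (ones_sum, fold16, stack_verify, demo_verify), the addresses and the toy UDP-like segment are all assumptions made for the example. The "device" reports just a one's-complement sum over the bytes it received, which is the idea behind CHECKSUM_COMPLETE, and software finishes the check by adding the pseudo-header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement accumulation over big-endian 16-bit words. */
static uint32_t ones_sum(const uint8_t *data, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8; /* odd tail, zero padded */
	return sum;
}

/* Fold a 32-bit accumulator to 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Software side: combine the device-provided packet sum with the
 * pseudo-header. A valid checksum folds to 0xffff.
 */
static int stack_verify(uint32_t device_sum, const uint8_t *pseudo, size_t plen)
{
	return fold16(ones_sum(pseudo, plen, device_sum)) == 0xffff;
}

/* Returns 1 if the intact segment verifies and a corrupted one does not. */
static int demo_verify(void)
{
	/* Hypothetical IPv4 pseudo-header: src, dst, zero, proto 17, length 12. */
	const uint8_t pseudo[12] = { 192, 0, 2, 1, 192, 0, 2, 2, 0, 17, 0, 12 };
	/* Toy UDP segment: ports, length 12, checksum (bytes 6-7), payload. */
	uint8_t seg[12] = { 0x30, 0x39, 0x00, 0x35, 0x00, 0x0c, 0x00, 0x00,
			    'p', 'i', 'n', 'g' };
	uint16_t csum;
	int ok, bad;

	assert(sizeof(seg) == 12);

	/* Sender: checksum over pseudo-header plus segment, field zeroed. */
	csum = (uint16_t)~fold16(ones_sum(seg, sizeof(seg),
					  ones_sum(pseudo, sizeof(pseudo), 0)));
	seg[6] = (uint8_t)(csum >> 8);
	seg[7] = (uint8_t)(csum & 0xff);

	/* "Device": sums the received bytes only, knows nothing about UDP. */
	ok = stack_verify(ones_sum(seg, sizeof(seg), 0), pseudo, sizeof(pseudo));

	seg[8] ^= 0x01; /* flip one payload bit */
	bad = stack_verify(ones_sum(seg, sizeof(seg), 0), pseudo, sizeof(pseudo));

	return ok && !bad;
}
```

Because the device-side sum does not depend on the L4 protocol, the same value stays usable for protocols the hardware cannot parse, whereas a per-packet "L4 checksum OK" flag only covers what the hardware recognizes.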
* Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC
2024-12-10 7:49 ` Wei Fang
@ 2024-12-10 20:07 ` Simon Horman
0 siblings, 0 replies; 23+ messages in thread
From: Simon Horman @ 2024-12-10 20:07 UTC (permalink / raw)
To: Wei Fang
Cc: Ido Schimmel, tom@herbertland.com, Claudiu Manoil,
Vladimir Oltean, Clark Wang, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, Frank Li, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, imx@lists.linux.dev
On Tue, Dec 10, 2024 at 07:49:18AM +0000, Wei Fang wrote:
> > -----Original Message-----
> > From: Ido Schimmel <idosch@idosch.org>
> > Sent: December 8, 2024 23:47
> > To: Wei Fang <wei.fang@nxp.com>; tom@herbertland.com
> > Cc: Simon Horman <horms@kernel.org>; Claudiu Manoil
> > <claudiu.manoil@nxp.com>; Vladimir Oltean <vladimir.oltean@nxp.com>; Clark
> > Wang <xiaoning.wang@nxp.com>; andrew+netdev@lunn.ch;
> > davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> > pabeni@redhat.com; Frank Li <frank.li@nxp.com>; netdev@vger.kernel.org;
> > linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > offload for i.MX95 ENETC
> >
> > On Fri, Dec 06, 2024 at 12:45:02PM +0000, Wei Fang wrote:
> > > > -----Original Message-----
> > > > From: Simon Horman <horms@kernel.org>
> > > > > Sent: December 6, 2024 20:31
> > > > To: Wei Fang <wei.fang@nxp.com>
> > > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > > offload for i.MX95 ENETC
> > > >
> > > > On Fri, Dec 06, 2024 at 10:33:15AM +0000, Wei Fang wrote:
> > > > > > -----Original Message-----
> > > > > > From: Simon Horman <horms@kernel.org>
> > > > > > > Sent: December 6, 2024 17:23
> > > > > > > To: Wei Fang <wei.fang@nxp.com>
> > > > > > > Cc: Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean
> > > > > > > <vladimir.oltean@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> > > > > > > andrew+netdev@lunn.ch; davem@davemloft.net; edumazet@google.com;
> > > > > > > kuba@kernel.org; pabeni@redhat.com; Frank Li <frank.li@nxp.com>;
> > > > > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev
> > > > > > > Subject: Re: [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum
> > > > > > > offload for i.MX95 ENETC
> > > > > >
> > > > > > On Wed, Dec 04, 2024 at 01:29:28PM +0800, Wei Fang wrote:
> > > > > > > > ENETC rev 4.1 supports TCP and UDP checksum offload for receive, the bit
> > > > > > > > 108 of the Rx BD will be set if the TCP/UDP checksum is correct. Since
> > > > > > > > this capability is not defined in register, the rx_csum bit is added to
> > > > > > > > struct enetc_drvdata to indicate whether the device supports Rx checksum
> > > > > > > > offload.
> > > > > > >
> > > > > > > Signed-off-by: Wei Fang <wei.fang@nxp.com>
> > > > > > > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > > > > > > Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
> > > > > > > ---
> > > > > > > v2: no changes
> > > > > > > v3: no changes
> > > > > > > v4: no changes
> > > > > > > v5: no changes
> > > > > > > v6: no changes
> > > > > > > ---
> > > > > > > > drivers/net/ethernet/freescale/enetc/enetc.c | 14 ++++++++++----
> > > > > > > > drivers/net/ethernet/freescale/enetc/enetc.h | 2 ++
> > > > > > > > drivers/net/ethernet/freescale/enetc/enetc_hw.h | 2 ++
> > > > > > > > .../net/ethernet/freescale/enetc/enetc_pf_common.c | 3 +++
> > > > > > > > 4 files changed, 17 insertions(+), 4 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > > > index 35634c516e26..3137b6ee62d3 100644
> > > > > > > > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > > > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > > > > > > > @@ -1011,10 +1011,15 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
> > > > > > > >
> > > > > > > >  	/* TODO: hashing */
> > > > > > > >  	if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
> > > > > > > > -		u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > > > > -
> > > > > > > > -		skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > > > > > -		skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > > > > +		if (priv->active_offloads & ENETC_F_RXCSUM &&
> > > > > > > > +		    le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_L4_CSUM_OK) {
> > > > > > > > +			skb->ip_summed = CHECKSUM_UNNECESSARY;
> > > > > > > > +		} else {
> > > > > > > > +			u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> > > > > > > > +
> > > > > > > > +			skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> > > > > > > > +			skb->ip_summed = CHECKSUM_COMPLETE;
> > > > > > > > +		}
> > > > > > > >  	}
> > > > > >
> > > > > > Hi Wei,
> > > > > >
> > > > > > I am wondering about the relationship between the above and
> > > > > > hardware support for CHECKSUM_COMPLETE.
> > > > > >
> > > > > > > Prior to this patch CHECKSUM_COMPLETE was always used, which seems
> > > > > > > desirable. But with this patch, CHECKSUM_UNNECESSARY is conditionally used.
> > > > > > >
> > > > > > > If those cases don't work with CHECKSUM_COMPLETE then is this a bug-fix?
> > > > > > >
> > > > > > > Or, alternatively, if those cases do work with CHECKSUM_COMPLETE, then
> > > > > > > I'm unsure why this change is necessary or desirable. It's my understanding
> > > > > > > that from the Kernel's perspective CHECKSUM_COMPLETE is preferable to
> > > > > > > CHECKSUM_UNNECESSARY.
> > > > > >
> > > > > > ...
> > > > >
> > > > > > Rx checksum offload is a new feature of ENETC v4. We would like to exploit this
> > > > > > capability of the hardware to save CPU cycles in calculating and verifying checksum.
> > > > > >
> > > > >
> > > > > Understood, but CHECKSUM_UNNECESSARY is usually the preferred option as it
> > > > > is more flexible, e.g. allowing low-cost calculation of inner checksums
> > > > > in the presence of encapsulation.
> > > >
> > > > I think you mean 'CHECKSUM_COMPLETE' is the preferred option. But there is no
> > > > strong reason against using CHECKSUM_UNNECESSARY. So I hope to keep this patch.
> >
> > I was also under the impression that CHECKSUM_COMPLETE is more desirable
> > than CHECKSUM_UNNECESSARY. Maybe Tom can help.
>
> From the kernel doc [1] it should be necessary to use CHECKSUM_COMPLETE in
> enetc driver, because ENETCv4 only supports UDP/TCP checksum offload. So I will
> drop this patch from the patch set. thanks.
Thanks.
>
> [1] https://docs.kernel.org/networking/skbuff.html#:~:text=Even%20if%20device%20supports%20only%20some%20protocols%2C%20but%20is%20able%20to%20produce%20skb%2D%3Ecsum%2C%20it%20MUST%20use%20CHECKSUM_COMPLETE%2C%20not%20CHECKSUM_UNNECESSARY.
Thread overview: 23+ messages
2024-12-04 5:29 [PATCH v6 RESEND net-next 0/5] Add more feautues for ENETC v4 - round 1 Wei Fang
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 1/5] net: enetc: add Rx checksum offload for i.MX95 ENETC Wei Fang
2024-12-06 9:23 ` Simon Horman
2024-12-06 10:33 ` Wei Fang
2024-12-06 12:30 ` Simon Horman
2024-12-06 12:45 ` Wei Fang
2024-12-08 15:47 ` Ido Schimmel
2024-12-10 7:49 ` Wei Fang
2024-12-10 20:07 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 2/5] net: enetc: add Tx " Wei Fang
2024-12-06 9:37 ` Simon Horman
2024-12-06 10:46 ` Wei Fang
2024-12-06 12:32 ` Simon Horman
2024-12-06 12:38 ` Wei Fang
2024-12-06 13:31 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 3/5] net: enetc: update max chained Tx BD number " Wei Fang
2024-12-06 10:11 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 4/5] net: enetc: add LSO support for i.MX95 ENETC PF Wei Fang
2024-12-06 9:59 ` Simon Horman
2024-12-06 10:33 ` Simon Horman
2024-12-04 5:29 ` [PATCH v6 RESEND net-next 5/5] net: enetc: add UDP segmentation offload support Wei Fang
2024-12-08 15:09 ` Ido Schimmel
2024-12-08 15:54 ` Eric Dumazet