From mboxrd@z Thu Jan 1 00:00:00 1970
From: Scott Mitchell <scott.k.mitch1@gmail.com>
To: dev@dpdk.org
Cc: stephen@networkplumber.org, Scott Mitchell
Subject: [PATCH v4 4/4] net/af_packet: add software checksum offload support
Date: Mon, 2 Feb 2026 00:14:56 -0800
Message-Id: <20260202081456.4322-5-scott.k.mitch1@gmail.com>
In-Reply-To: <20260202081456.4322-1-scott.k.mitch1@gmail.com>
References: <20260128191032.78916-1-scott.k.mitch1@gmail.com>
 <20260202081456.4322-1-scott.k.mitch1@gmail.com>
List-Id: DPDK patches and discussions

Add software checksum offload support and
configurable TX poll behavior to improve flexibility and performance.

Add rte_net_ip_udptcp_cksum_mbuf() to rte_net.h, shared between
rte_eth_tap and rte_eth_af_packet, to compute IPv4/UDP/TCP checksums
in software, since these devices support neither hardware checksum
offload nor offload-context propagation.

Signed-off-by: Scott Mitchell <scott.k.mitch1@gmail.com>
---
 doc/guides/nics/features/afpacket.ini     |  2 +
 doc/guides/rel_notes/release_26_03.rst    |  2 +
 drivers/net/af_packet/rte_eth_af_packet.c | 15 +++-
 drivers/net/tap/rte_eth_tap.c             | 61 +--------------
 lib/net/rte_net.h                         | 92 +++++++++++++++++++++++
 5 files changed, 112 insertions(+), 60 deletions(-)

diff --git a/doc/guides/nics/features/afpacket.ini b/doc/guides/nics/features/afpacket.ini
index 391f79b173..4bb81c84ff 100644
--- a/doc/guides/nics/features/afpacket.ini
+++ b/doc/guides/nics/features/afpacket.ini
@@ -7,5 +7,7 @@
 Link status          = Y
 Promiscuous mode     = Y
 MTU update           = Y
+L3 checksum offload  = Y
+L4 checksum offload  = Y
 Basic stats          = Y
 Stats per queue      = Y
diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst
index 3b6be19645..2946acce99 100644
--- a/doc/guides/rel_notes/release_26_03.rst
+++ b/doc/guides/rel_notes/release_26_03.rst
@@ -60,6 +60,8 @@ New Features
   * Fixed kernel memory barrier protocol for memory availability
   * Fixed shared memory frame overhead offset calculation
   * Added ``txpollnotrdy`` devarg to avoid ``poll()`` blocking calls
+  * Added checksum offload support for ``IPV4_CKSUM``, ``UDP_CKSUM``,
+    and ``TCP_CKSUM``

 Removed Items
 -------------
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 9df1b1fd4c..128f93bec6 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -10,6 +10,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -101,6 +103,7 @@ struct pmd_internals {
 	struct pkt_tx_queue *tx_queue;
 	uint8_t vlan_strip;
 	uint8_t timestamp_offloading;
+	bool tx_sw_cksum;
 };

 static const char *valid_arguments[] = {
@@ -311,6 +314,9 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 			ppd->tp_len = mbuf->pkt_len;
 			ppd->tp_snaplen = mbuf->pkt_len;

+			if (pkt_q->sw_cksum && !rte_net_ip_udptcp_cksum_mbuf(mbuf, false))
+				continue;
+
 			struct rte_mbuf *tmp_mbuf = mbuf;
 			do {
 				uint16_t data_len = rte_pktmbuf_data_len(tmp_mbuf);
@@ -396,10 +402,13 @@ eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
 	struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
 	const struct rte_eth_rxmode *rxmode = &dev_conf->rxmode;
+	const struct rte_eth_txmode *txmode = &dev_conf->txmode;
 	struct pmd_internals *internals = dev->data->dev_private;

 	internals->vlan_strip = !!(rxmode->offloads & RTE_ETH_RX_OFFLOAD_VLAN_STRIP);
 	internals->timestamp_offloading = !!(rxmode->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP);
+	internals->tx_sw_cksum = !!(txmode->offloads & (RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+			RTE_ETH_TX_OFFLOAD_UDP_CKSUM | RTE_ETH_TX_OFFLOAD_TCP_CKSUM));

 	return 0;
 }
@@ -417,7 +426,10 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 	dev_info->max_tx_queues = (uint16_t)internals->nb_queues;
 	dev_info->min_rx_bufsize = ETH_AF_PACKET_ETH_OVERHEAD;
 	dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_MULTI_SEGS |
-				    RTE_ETH_TX_OFFLOAD_VLAN_INSERT;
+				    RTE_ETH_TX_OFFLOAD_VLAN_INSERT |
+				    RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+				    RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
+				    RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
 	dev_info->rx_offload_capa = RTE_ETH_RX_OFFLOAD_VLAN_STRIP |
 				    RTE_ETH_RX_OFFLOAD_TIMESTAMP;

@@ -618,6 +630,7 @@ eth_tx_queue_setup(struct rte_eth_dev *dev,
 {
 	struct pmd_internals *internals = dev->data->dev_private;

+	internals->tx_queue[tx_queue_id].sw_cksum = internals->tx_sw_cksum;
 	dev->data->tx_queues[tx_queue_id] = &internals->tx_queue[tx_queue_id];

 	return 0;
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 730f1859bd..55f496babe 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -560,70 +560,13 @@ tap_write_mbufs(struct tx_queue *txq, uint16_t num_mbufs,
 		if (txq->csum &&
 		    (mbuf->ol_flags & RTE_MBUF_F_TX_IP_CKSUM ||
 		     l4_ol_flags == RTE_MBUF_F_TX_UDP_CKSUM ||
 		     l4_ol_flags == RTE_MBUF_F_TX_TCP_CKSUM)) {
-			unsigned int hdrlens = mbuf->l2_len + mbuf->l3_len;
-			uint16_t *l4_cksum;
-			void *l3_hdr;
-
-			if (l4_ol_flags == RTE_MBUF_F_TX_UDP_CKSUM)
-				hdrlens += sizeof(struct rte_udp_hdr);
-			else if (l4_ol_flags == RTE_MBUF_F_TX_TCP_CKSUM)
-				hdrlens += sizeof(struct rte_tcp_hdr);
-			else if (l4_ol_flags != RTE_MBUF_F_TX_L4_NO_CKSUM)
-				return -1;
-
-			/* Support only packets with at least layer 4
-			 * header included in the first segment
-			 */
-			if (rte_pktmbuf_data_len(mbuf) < hdrlens)
-				return -1;
-
-			/* To change checksums (considering that a mbuf can be
-			 * indirect, for example), copy l2, l3 and l4 headers
-			 * in a new segment and chain it to existing data
-			 */
-			seg = rte_pktmbuf_copy(mbuf, mbuf->pool, 0, hdrlens);
+			/* Compute checksums in software, copying headers if needed */
+			seg = rte_net_ip_udptcp_cksum_mbuf(mbuf, true);
 			if (seg == NULL)
 				return -1;
-			rte_pktmbuf_adj(mbuf, hdrlens);
-			rte_pktmbuf_chain(seg, mbuf);
 			pmbufs[i] = mbuf = seg;
-
-			l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, mbuf->l2_len);
-			if (mbuf->ol_flags & RTE_MBUF_F_TX_IP_CKSUM) {
-				struct rte_ipv4_hdr *iph = l3_hdr;
-
-				iph->hdr_checksum = 0;
-				iph->hdr_checksum = rte_ipv4_cksum(iph);
-			}
-
-			if (l4_ol_flags == RTE_MBUF_F_TX_L4_NO_CKSUM)
-				goto skip_l4_cksum;
-
-			if (l4_ol_flags == RTE_MBUF_F_TX_UDP_CKSUM) {
-				struct rte_udp_hdr *udp_hdr;
-
-				udp_hdr = rte_pktmbuf_mtod_offset(mbuf, struct rte_udp_hdr *,
-						mbuf->l2_len + mbuf->l3_len);
-				l4_cksum = &udp_hdr->dgram_cksum;
-			} else {
-				struct rte_tcp_hdr *tcp_hdr;
-
-				tcp_hdr = rte_pktmbuf_mtod_offset(mbuf, struct rte_tcp_hdr *,
-						mbuf->l2_len + mbuf->l3_len);
-				l4_cksum = &tcp_hdr->cksum;
-			}
-
-			*l4_cksum = 0;
-			if (mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) {
-				*l4_cksum = rte_ipv4_udptcp_cksum_mbuf(mbuf, l3_hdr,
-						mbuf->l2_len + mbuf->l3_len);
-			} else {
-				*l4_cksum = rte_ipv6_udptcp_cksum_mbuf(mbuf, l3_hdr,
-						mbuf->l2_len + mbuf->l3_len);
-			}
 		}
-skip_l4_cksum:
 		for (j = 0; j < mbuf->nb_segs; j++) {
 			iovecs[k].iov_len = rte_pktmbuf_data_len(seg);
 			iovecs[k].iov_base = rte_pktmbuf_mtod(seg, void *);
diff --git a/lib/net/rte_net.h b/lib/net/rte_net.h
index 65d724b84b..44f42010c8 100644
--- a/lib/net/rte_net.h
+++ b/lib/net/rte_net.h
@@ -246,6 +246,98 @@ rte_net_intel_cksum_prepare(struct rte_mbuf *m)
 	return rte_net_intel_cksum_flags_prepare(m, m->ol_flags);
 }

+/**
+ * Compute IPv4 header and UDP/TCP checksums in software.
+ *
+ * Computes checksums based on mbuf offload flags:
+ * - RTE_MBUF_F_TX_IP_CKSUM: Compute IPv4 header checksum
+ * - RTE_MBUF_F_TX_UDP_CKSUM: Compute UDP checksum (IPv4 or IPv6)
+ * - RTE_MBUF_F_TX_TCP_CKSUM: Compute TCP checksum (IPv4 or IPv6)
+ *
+ * @param mbuf
+ *   The packet mbuf. Must have l2_len and l3_len set correctly.
+ * @param copy
+ *   If true, copy L2/L3/L4 headers to a new segment before computing
+ *   checksums. This is safe for indirect mbufs but has overhead.
+ *   If false, compute checksums in place. This is only safe if the
+ *   mbuf will be copied afterward (e.g., to a device ring buffer).
+ * @return
+ *   - On success: Returns mbuf (new segment if copy=true, original if copy=false)
+ *   - On error: Returns NULL (allocation failed or malformed packet)
+ */
+__rte_experimental
+static inline struct rte_mbuf *
+rte_net_ip_udptcp_cksum_mbuf(struct rte_mbuf *mbuf, bool copy)
+{
+	const uint64_t l4_ol_flags = mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK;
+	const uint64_t l4_offset = mbuf->l2_len + mbuf->l3_len;
+	uint32_t hdrlens = l4_offset;
+	void *l3_hdr = NULL;
+
+	/* Determine total header length needed */
+	if (l4_ol_flags == RTE_MBUF_F_TX_UDP_CKSUM)
+		hdrlens += sizeof(struct rte_udp_hdr);
+	else if (l4_ol_flags == RTE_MBUF_F_TX_TCP_CKSUM)
+		hdrlens += sizeof(struct rte_tcp_hdr);
+	else if (l4_ol_flags != RTE_MBUF_F_TX_L4_NO_CKSUM)
+		return NULL; /* Unsupported L4 checksum type */
+	else if (!(mbuf->ol_flags & RTE_MBUF_F_TX_IP_CKSUM))
+		return mbuf; /* Nothing to do */
+
+	/* Validate we at least have L2+L3 headers before doing any work */
+	if (unlikely(rte_pktmbuf_data_len(mbuf) < l4_offset))
+		return NULL;
+
+	if (copy) {
+		/*
+		 * Copy headers to new segment to handle indirect mbufs.
+		 * This ensures we can safely modify checksums without
+		 * corrupting shared/read-only data.
+		 */
+		struct rte_mbuf *seg = rte_pktmbuf_copy(mbuf, mbuf->pool, 0, hdrlens);
+
+		if (seg == NULL)
+			return NULL;
+
+		rte_pktmbuf_adj(mbuf, hdrlens);
+		rte_pktmbuf_chain(seg, mbuf);
+		mbuf = seg;
+	} else if (unlikely(!RTE_MBUF_DIRECT(mbuf) || rte_mbuf_refcnt_read(mbuf) > 1))
+		return NULL;
+
+	l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, mbuf->l2_len);
+
+	/* IPv4 header checksum */
+	if (mbuf->ol_flags & RTE_MBUF_F_TX_IP_CKSUM) {
+		struct rte_ipv4_hdr *iph = (struct rte_ipv4_hdr *)l3_hdr;
+
+		iph->hdr_checksum = 0;
+		iph->hdr_checksum = rte_ipv4_cksum(iph);
+	}
+
+	/* L4 checksum (UDP or TCP) - skip if headers not in first segment */
+	if (l4_ol_flags == RTE_MBUF_F_TX_UDP_CKSUM && rte_pktmbuf_data_len(mbuf) >= hdrlens) {
+		struct rte_udp_hdr *udp_hdr = rte_pktmbuf_mtod_offset(mbuf,
+				struct rte_udp_hdr *, l4_offset);
+
+		udp_hdr->dgram_cksum = 0;
+		udp_hdr->dgram_cksum = (mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) ?
+			rte_ipv4_udptcp_cksum_mbuf(mbuf,
+				(const struct rte_ipv4_hdr *)l3_hdr, l4_offset) :
+			rte_ipv6_udptcp_cksum_mbuf(mbuf,
+				(const struct rte_ipv6_hdr *)l3_hdr, l4_offset);
+	} else if (l4_ol_flags == RTE_MBUF_F_TX_TCP_CKSUM &&
+			rte_pktmbuf_data_len(mbuf) >= hdrlens) {
+		struct rte_tcp_hdr *tcp_hdr = rte_pktmbuf_mtod_offset(mbuf,
+				struct rte_tcp_hdr *, l4_offset);
+
+		tcp_hdr->cksum = 0;
+		tcp_hdr->cksum = (mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) ?
+			rte_ipv4_udptcp_cksum_mbuf(mbuf,
+				(const struct rte_ipv4_hdr *)l3_hdr, l4_offset) :
+			rte_ipv6_udptcp_cksum_mbuf(mbuf,
+				(const struct rte_ipv6_hdr *)l3_hdr, l4_offset);
+	}
+
+	return mbuf;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.39.5 (Apple Git-154)