* Re: [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype
2026-04-22 13:36 [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype Robin Jarry
@ 2026-04-22 16:32 ` Stephen Hemminger
2026-04-27 10:41 ` [PATCH dpdk v2] " Robin Jarry
` (4 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2026-04-22 16:32 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev
On Wed, 22 Apr 2026 15:36:16 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
AI review feedback spotted some things I didn't
On Wed, 22 Apr 2026 15:36:16 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
The two bounds checks this patch drops are not redundant with the new
l4_off check and need to stay in some form:
- if (l2_len + rte_be_to_cpu_16(iph->total_length) >
- rte_pktmbuf_data_len(mbuf))
- return;
...
- if (l2_len + l3_len + rte_be_to_cpu_16(iph->payload_len) >
- rte_pktmbuf_data_len(mbuf))
- return;
rte_ipv4_udptcp_cksum_verify() -> __rte_ipv4_udptcp_cksum() does:
l3_len = rte_be_to_cpu_16(ipv4_hdr->total_length);
...
l4_len = l3_len - ip_hdr_len;
cksum = rte_raw_cksum(l4_hdr, l4_len);
and the IPv6 variant reads payload_len the same way. Both are untrusted
values from the wire. Without the checks above, a packet whose
total_length / payload_len exceeds what was actually received causes
rte_raw_cksum() to read past the valid data - uninitialized tail room at
best, outside buf_len at worst. The IP/L4 CKSUM flags set on the mbuf
will then be computed over stale memory.
These checks were added deliberately in:
1168a4fd193 ("net/tap: add buffer overflow checks before checksum")
Please restore equivalent validation after rte_net_get_ptype(), e.g.
check iph->total_length (IPv4) and hdr_lens->l3_len + iph->payload_len
(IPv6) against rte_pktmbuf_data_len(mbuf) - hdr_lens->l2_len before
calling the checksum verifier.
One other thing: rte_net_get_ptype() returns RTE_PTYPE_L3_IPV6_EXT when
IPv6 extensions are present. The new "else if (l3 != RTE_PTYPE_L3_IPV6)"
clause falls through to return without setting any CKSUM flag for that
case. That matches pre-patch behavior, but the comment the patch removed
was the only thing documenting it. Either accept IPv6_EXT here (l3_len
now spans the extensions, so the existing helpers work as long as the
next header is TCP/UDP) or keep the comment.
Nit: the memset(&hdr_lens, 0, ...) runs on every Rx packet even when
RX_OFFLOAD_CHECKSUM is disabled. Moving it plus the &hdr_lens argument
inside the offload branch (passing NULL otherwise) avoids that.
I prefer initialization instead of memset, less error prone and easier.
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH dpdk v2] net/tap: use offsets provided by rte_net_get_ptype
2026-04-22 13:36 [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype Robin Jarry
2026-04-22 16:32 ` Stephen Hemminger
@ 2026-04-27 10:41 ` Robin Jarry
2026-04-28 13:36 ` Stephen Hemminger
2026-04-30 23:28 ` [PATCH dpdk v3] " Robin Jarry
` (3 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Robin Jarry @ 2026-04-27 10:41 UTC (permalink / raw)
To: dev, Stephen Hemminger
Instead of guessing what are the proper header lengths, pass
a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
proper header lengths/offsets in tap_verify_csum.
This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
Notes:
v2:
* Restored packet length checks in IPv4/IPv6 before checking L4
checksums.
* Moved struct rte_net_hdr_lens variable next to rte_net_get_ptype call
and initialize it to 0 instead of calling memset().
drivers/net/tap/rte_eth_tap.c | 44 +++++++++++++----------------------
1 file changed, 16 insertions(+), 28 deletions(-)
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index a5d460a0b3cb..546bedea34df 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -327,79 +327,66 @@ tun_alloc(struct pmd_internals *pmd, int is_keepalive, int persistent)
}
static void
-tap_verify_csum(struct rte_mbuf *mbuf)
+tap_verify_csum(struct rte_mbuf *mbuf, const struct rte_net_hdr_lens *hlen)
{
- uint32_t l2 = mbuf->packet_type & RTE_PTYPE_L2_MASK;
uint32_t l3 = mbuf->packet_type & RTE_PTYPE_L3_MASK;
uint32_t l4 = mbuf->packet_type & RTE_PTYPE_L4_MASK;
- unsigned int l2_len = sizeof(struct rte_ether_hdr);
- unsigned int l3_len;
+ uint32_t l4_off = hlen->l2_len + hlen->l3_len;
uint16_t cksum = 0;
void *l3_hdr;
void *l4_hdr;
- struct rte_udp_hdr *udp_hdr;
- if (l2 == RTE_PTYPE_L2_ETHER_VLAN)
- l2_len += 4;
- else if (l2 == RTE_PTYPE_L2_ETHER_QINQ)
- l2_len += 8;
/* Don't verify checksum for packets with discontinuous L2 header */
- if (unlikely(l2_len + sizeof(struct rte_ipv4_hdr) >
- rte_pktmbuf_data_len(mbuf)))
+ if (unlikely(l4_off > rte_pktmbuf_data_len(mbuf)))
return;
- l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len);
+
+ l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, hlen->l2_len);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
struct rte_ipv4_hdr *iph = l3_hdr;
- l3_len = rte_ipv4_hdr_len(iph);
- if (unlikely(l2_len + l3_len > rte_pktmbuf_data_len(mbuf)))
- return;
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + rte_be_to_cpu_16(iph->total_length) >
+ if (hlen->l2_len + rte_be_to_cpu_16(iph->total_length) >
rte_pktmbuf_data_len(mbuf))
return;
- cksum = ~rte_raw_cksum(iph, l3_len);
+ cksum = ~rte_raw_cksum(iph, hlen->l3_len);
mbuf->ol_flags |= cksum ?
RTE_MBUF_F_RX_IP_CKSUM_BAD :
RTE_MBUF_F_RX_IP_CKSUM_GOOD;
- } else if (l3 == RTE_PTYPE_L3_IPV6) {
+ } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) {
struct rte_ipv6_hdr *iph = l3_hdr;
- l3_len = sizeof(struct rte_ipv6_hdr);
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + l3_len + rte_be_to_cpu_16(iph->payload_len) >
+ if (hlen->l2_len + hlen->l3_len + rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
} else {
/* - RTE_PTYPE_L3_IPV4_EXT_UNKNOWN cannot happen because
* mbuf->packet_type is filled by rte_net_get_ptype() which
* never returns this value.
- * - IPv6 extensions are not supported.
*/
return;
}
if (l4 == RTE_PTYPE_L4_UDP || l4 == RTE_PTYPE_L4_TCP) {
int cksum_ok;
- const unsigned int l4_min_len = (l4 == RTE_PTYPE_L4_UDP)
- ? sizeof(struct rte_udp_hdr) : sizeof(struct rte_tcp_hdr);
/* Don't verify checksum if L4 header is truncated */
- if (l2_len + l3_len + l4_min_len > rte_pktmbuf_data_len(mbuf))
+ if (l4_off + hlen->l4_len > rte_pktmbuf_data_len(mbuf))
return;
- l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len + l3_len);
/* Don't verify checksum for multi-segment packets. */
if (mbuf->nb_segs > 1)
return;
+
+ l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l4_off);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
if (l4 == RTE_PTYPE_L4_UDP) {
- udp_hdr = (struct rte_udp_hdr *)l4_hdr;
+ struct rte_udp_hdr *udp_hdr = l4_hdr;
if (udp_hdr->dgram_cksum == 0) {
/*
* For IPv4, a zero UDP checksum
@@ -561,10 +548,11 @@ pmd_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
continue;
}
- mbuf->packet_type = rte_net_get_ptype(mbuf, NULL,
+ struct rte_net_hdr_lens hlen = {0};
+ mbuf->packet_type = rte_net_get_ptype(mbuf, &hlen,
RTE_PTYPE_ALL_MASK);
if (rxq->rxmode->offloads & RTE_ETH_RX_OFFLOAD_CHECKSUM)
- tap_verify_csum(mbuf);
+ tap_verify_csum(mbuf, &hlen);
/* account for the receive frame */
bufs[num_rx++] = mbuf;
--
2.53.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH dpdk v2] net/tap: use offsets provided by rte_net_get_ptype
2026-04-27 10:41 ` [PATCH dpdk v2] " Robin Jarry
@ 2026-04-28 13:36 ` Stephen Hemminger
0 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2026-04-28 13:36 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev
On Mon, 27 Apr 2026 12:41:53 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
>
> This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
AI review spotted some things.
But it looks like more of a bug in rte_net_get_ptype
Now let me check the upstream source for the IPv6 header context:Found the bug. Let me verify the IPv6 payload_len semantics:That confirms the bug. `payload_len` is documented as "IP payload size, including ext. headers" — exactly what I needed.
---
**Review of [PATCH dpdk v2] net/tap: use offsets provided by rte_net_get_ptype**
Error:
The IPv6 packet length check double-counts extension header bytes.
`rte_ipv6_hdr.payload_len` is defined as "IP payload size, including
ext. headers" (rte_ip6.h:487), meaning it already includes extension
header bytes counted from just after the 40-byte fixed IPv6 header.
But `hlen->l3_len` for `RTE_PTYPE_L3_IPV6_EXT` is set by
`rte_net_get_ptype` to `40 + extension_headers_size` (rte_net.c:439).
Adding them together counts the extension bytes twice:
if (hlen->l2_len + hlen->l3_len + rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
For any IPv6 packet with extension headers this check will fire
spuriously, causing `tap_verify_csum` to return early without setting
checksum offload flags — the very case this patch is trying to fix.
The pre-existing code was correct: it used
`sizeof(struct rte_ipv6_hdr)` (40) as the L3 term, leaving
`payload_len` to account for extensions. The fix is to keep that
same constant here:
if (hlen->l2_len + sizeof(struct rte_ipv6_hdr) +
rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH dpdk v3] net/tap: use offsets provided by rte_net_get_ptype
2026-04-22 13:36 [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype Robin Jarry
2026-04-22 16:32 ` Stephen Hemminger
2026-04-27 10:41 ` [PATCH dpdk v2] " Robin Jarry
@ 2026-04-30 23:28 ` Robin Jarry
2026-05-03 3:29 ` Stephen Hemminger
2026-05-12 15:16 ` [PATCH dpdk v4] " Robin Jarry
` (2 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Robin Jarry @ 2026-04-30 23:28 UTC (permalink / raw)
To: dev, Stephen Hemminger
Instead of guessing what are the proper header lengths, pass
a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
proper header lengths/offsets in tap_verify_csum.
This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
Notes:
v3:
* Fixed double count for ipv6 extension headers.
v2:
* Restored packet length checks in IPv4/IPv6 before checking L4
checksums.
* Moved struct rte_net_hdr_lens variable next to rte_net_get_ptype call
and initialize it to 0 instead of calling memset().
drivers/net/tap/rte_eth_tap.c | 44 +++++++++++++----------------------
1 file changed, 16 insertions(+), 28 deletions(-)
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index a5d460a0b3cb..f807697aa6a8 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -327,79 +327,66 @@ tun_alloc(struct pmd_internals *pmd, int is_keepalive, int persistent)
}
static void
-tap_verify_csum(struct rte_mbuf *mbuf)
+tap_verify_csum(struct rte_mbuf *mbuf, const struct rte_net_hdr_lens *hlen)
{
- uint32_t l2 = mbuf->packet_type & RTE_PTYPE_L2_MASK;
uint32_t l3 = mbuf->packet_type & RTE_PTYPE_L3_MASK;
uint32_t l4 = mbuf->packet_type & RTE_PTYPE_L4_MASK;
- unsigned int l2_len = sizeof(struct rte_ether_hdr);
- unsigned int l3_len;
+ uint32_t l4_off = hlen->l2_len + hlen->l3_len;
uint16_t cksum = 0;
void *l3_hdr;
void *l4_hdr;
- struct rte_udp_hdr *udp_hdr;
- if (l2 == RTE_PTYPE_L2_ETHER_VLAN)
- l2_len += 4;
- else if (l2 == RTE_PTYPE_L2_ETHER_QINQ)
- l2_len += 8;
/* Don't verify checksum for packets with discontinuous L2 header */
- if (unlikely(l2_len + sizeof(struct rte_ipv4_hdr) >
- rte_pktmbuf_data_len(mbuf)))
+ if (unlikely(l4_off > rte_pktmbuf_data_len(mbuf)))
return;
- l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len);
+
+ l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, hlen->l2_len);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
struct rte_ipv4_hdr *iph = l3_hdr;
- l3_len = rte_ipv4_hdr_len(iph);
- if (unlikely(l2_len + l3_len > rte_pktmbuf_data_len(mbuf)))
- return;
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + rte_be_to_cpu_16(iph->total_length) >
+ if (hlen->l2_len + rte_be_to_cpu_16(iph->total_length) >
rte_pktmbuf_data_len(mbuf))
return;
- cksum = ~rte_raw_cksum(iph, l3_len);
+ cksum = ~rte_raw_cksum(iph, hlen->l3_len);
mbuf->ol_flags |= cksum ?
RTE_MBUF_F_RX_IP_CKSUM_BAD :
RTE_MBUF_F_RX_IP_CKSUM_GOOD;
- } else if (l3 == RTE_PTYPE_L3_IPV6) {
+ } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) {
struct rte_ipv6_hdr *iph = l3_hdr;
- l3_len = sizeof(struct rte_ipv6_hdr);
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + l3_len + rte_be_to_cpu_16(iph->payload_len) >
+ if (hlen->l2_len + sizeof(*iph) + rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
} else {
/* - RTE_PTYPE_L3_IPV4_EXT_UNKNOWN cannot happen because
* mbuf->packet_type is filled by rte_net_get_ptype() which
* never returns this value.
- * - IPv6 extensions are not supported.
*/
return;
}
if (l4 == RTE_PTYPE_L4_UDP || l4 == RTE_PTYPE_L4_TCP) {
int cksum_ok;
- const unsigned int l4_min_len = (l4 == RTE_PTYPE_L4_UDP)
- ? sizeof(struct rte_udp_hdr) : sizeof(struct rte_tcp_hdr);
/* Don't verify checksum if L4 header is truncated */
- if (l2_len + l3_len + l4_min_len > rte_pktmbuf_data_len(mbuf))
+ if (l4_off + hlen->l4_len > rte_pktmbuf_data_len(mbuf))
return;
- l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len + l3_len);
/* Don't verify checksum for multi-segment packets. */
if (mbuf->nb_segs > 1)
return;
+
+ l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l4_off);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
if (l4 == RTE_PTYPE_L4_UDP) {
- udp_hdr = (struct rte_udp_hdr *)l4_hdr;
+ struct rte_udp_hdr *udp_hdr = l4_hdr;
if (udp_hdr->dgram_cksum == 0) {
/*
* For IPv4, a zero UDP checksum
@@ -561,10 +548,11 @@ pmd_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
continue;
}
- mbuf->packet_type = rte_net_get_ptype(mbuf, NULL,
+ struct rte_net_hdr_lens hlen = {0};
+ mbuf->packet_type = rte_net_get_ptype(mbuf, &hlen,
RTE_PTYPE_ALL_MASK);
if (rxq->rxmode->offloads & RTE_ETH_RX_OFFLOAD_CHECKSUM)
- tap_verify_csum(mbuf);
+ tap_verify_csum(mbuf, &hlen);
/* account for the receive frame */
bufs[num_rx++] = mbuf;
--
2.54.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH dpdk v3] net/tap: use offsets provided by rte_net_get_ptype
2026-04-30 23:28 ` [PATCH dpdk v3] " Robin Jarry
@ 2026-05-03 3:29 ` Stephen Hemminger
0 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2026-05-03 3:29 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev
On Fri, 1 May 2026 01:28:07 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
>
> This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
LGTM, AI found a couple of small things.
Info: stale comment
In the same else branch, the comment is now inaccurate since l3 can
also be RTE_PTYPE_L3_IPV6_EXT:
} else { /* l3 == RTE_PTYPE_L3_IPV6, checked above */
Info: cover letter
The "supporting [...] IPv6 extensions" claim should be qualified or
removed depending on which fix is taken. The header-length and offset
math is now extension-aware, but L4 checksum validation is not.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH dpdk v4] net/tap: use offsets provided by rte_net_get_ptype
2026-04-22 13:36 [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype Robin Jarry
` (2 preceding siblings ...)
2026-04-30 23:28 ` [PATCH dpdk v3] " Robin Jarry
@ 2026-05-12 15:16 ` Robin Jarry
2026-05-18 3:14 ` Stephen Hemminger
2026-05-18 8:27 ` [PATCH dpdk v5] " Robin Jarry
2026-05-18 20:54 ` [PATCH dpdk v6] " Robin Jarry
5 siblings, 1 reply; 12+ messages in thread
From: Robin Jarry @ 2026-05-12 15:16 UTC (permalink / raw)
To: dev, Stephen Hemminger
Instead of guessing what are the proper header lengths, pass
a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
proper header lengths/offsets in tap_verify_csum.
This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
Notes:
v4:
* Fixed stale comments.
v3:
* Fixed double count for ipv6 extension headers.
v2:
* Restored packet length checks in IPv4/IPv6 before checking L4
checksums.
* Moved struct rte_net_hdr_lens variable next to rte_net_get_ptype call
and initialize it to 0 instead of calling memset().
drivers/net/tap/rte_eth_tap.c | 46 +++++++++++++----------------------
1 file changed, 17 insertions(+), 29 deletions(-)
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index a5d460a0b3cb..09cd252918e4 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -327,79 +327,66 @@ tun_alloc(struct pmd_internals *pmd, int is_keepalive, int persistent)
}
static void
-tap_verify_csum(struct rte_mbuf *mbuf)
+tap_verify_csum(struct rte_mbuf *mbuf, const struct rte_net_hdr_lens *hlen)
{
- uint32_t l2 = mbuf->packet_type & RTE_PTYPE_L2_MASK;
uint32_t l3 = mbuf->packet_type & RTE_PTYPE_L3_MASK;
uint32_t l4 = mbuf->packet_type & RTE_PTYPE_L4_MASK;
- unsigned int l2_len = sizeof(struct rte_ether_hdr);
- unsigned int l3_len;
+ uint32_t l4_off = hlen->l2_len + hlen->l3_len;
uint16_t cksum = 0;
void *l3_hdr;
void *l4_hdr;
- struct rte_udp_hdr *udp_hdr;
- if (l2 == RTE_PTYPE_L2_ETHER_VLAN)
- l2_len += 4;
- else if (l2 == RTE_PTYPE_L2_ETHER_QINQ)
- l2_len += 8;
/* Don't verify checksum for packets with discontinuous L2 header */
- if (unlikely(l2_len + sizeof(struct rte_ipv4_hdr) >
- rte_pktmbuf_data_len(mbuf)))
+ if (unlikely(l4_off > rte_pktmbuf_data_len(mbuf)))
return;
- l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len);
+
+ l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, hlen->l2_len);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
struct rte_ipv4_hdr *iph = l3_hdr;
- l3_len = rte_ipv4_hdr_len(iph);
- if (unlikely(l2_len + l3_len > rte_pktmbuf_data_len(mbuf)))
- return;
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + rte_be_to_cpu_16(iph->total_length) >
+ if (hlen->l2_len + rte_be_to_cpu_16(iph->total_length) >
rte_pktmbuf_data_len(mbuf))
return;
- cksum = ~rte_raw_cksum(iph, l3_len);
+ cksum = ~rte_raw_cksum(iph, hlen->l3_len);
mbuf->ol_flags |= cksum ?
RTE_MBUF_F_RX_IP_CKSUM_BAD :
RTE_MBUF_F_RX_IP_CKSUM_GOOD;
- } else if (l3 == RTE_PTYPE_L3_IPV6) {
+ } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) {
struct rte_ipv6_hdr *iph = l3_hdr;
- l3_len = sizeof(struct rte_ipv6_hdr);
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + l3_len + rte_be_to_cpu_16(iph->payload_len) >
+ if (hlen->l2_len + sizeof(*iph) + rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
} else {
/* - RTE_PTYPE_L3_IPV4_EXT_UNKNOWN cannot happen because
* mbuf->packet_type is filled by rte_net_get_ptype() which
* never returns this value.
- * - IPv6 extensions are not supported.
*/
return;
}
if (l4 == RTE_PTYPE_L4_UDP || l4 == RTE_PTYPE_L4_TCP) {
int cksum_ok;
- const unsigned int l4_min_len = (l4 == RTE_PTYPE_L4_UDP)
- ? sizeof(struct rte_udp_hdr) : sizeof(struct rte_tcp_hdr);
/* Don't verify checksum if L4 header is truncated */
- if (l2_len + l3_len + l4_min_len > rte_pktmbuf_data_len(mbuf))
+ if (l4_off + hlen->l4_len > rte_pktmbuf_data_len(mbuf))
return;
- l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len + l3_len);
/* Don't verify checksum for multi-segment packets. */
if (mbuf->nb_segs > 1)
return;
+
+ l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l4_off);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
if (l4 == RTE_PTYPE_L4_UDP) {
- udp_hdr = (struct rte_udp_hdr *)l4_hdr;
+ struct rte_udp_hdr *udp_hdr = l4_hdr;
if (udp_hdr->dgram_cksum == 0) {
/*
* For IPv4, a zero UDP checksum
@@ -412,7 +399,7 @@ tap_verify_csum(struct rte_mbuf *mbuf)
}
cksum_ok = !rte_ipv4_udptcp_cksum_verify(l3_hdr,
l4_hdr);
- } else { /* l3 == RTE_PTYPE_L3_IPV6, checked above */
+ } else { /* l3 == RTE_PTYPE_L3_IPV6{,_EXT}, checked above */
cksum_ok = !rte_ipv6_udptcp_cksum_verify(l3_hdr,
l4_hdr);
}
@@ -561,10 +548,11 @@ pmd_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
continue;
}
- mbuf->packet_type = rte_net_get_ptype(mbuf, NULL,
+ struct rte_net_hdr_lens hlen = {0};
+ mbuf->packet_type = rte_net_get_ptype(mbuf, &hlen,
RTE_PTYPE_ALL_MASK);
if (rxq->rxmode->offloads & RTE_ETH_RX_OFFLOAD_CHECKSUM)
- tap_verify_csum(mbuf);
+ tap_verify_csum(mbuf, &hlen);
/* account for the receive frame */
bufs[num_rx++] = mbuf;
--
2.54.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH dpdk v4] net/tap: use offsets provided by rte_net_get_ptype
2026-05-12 15:16 ` [PATCH dpdk v4] " Robin Jarry
@ 2026-05-18 3:14 ` Stephen Hemminger
0 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2026-05-18 3:14 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev
On Tue, 12 May 2026 17:16:12 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
>
> This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
AI patch review
Short summary suitable for replying to the patch:
The v4 patch claims to add IPv6 extension header support, but the L4
checksum path is broken for that case. The patch matches both IPV6
and IPV6_EXT in the L3 block and then falls through to
rte_ipv6_udptcp_cksum_verify(), which is documented as not supporting
extension headers. Concretely, that helper uses ipv6_hdr->proto for
the pseudo-header (which is the first extension-header type, not the
L4 protocol) and uses ipv6_hdr->payload_len as the raw-cksum length
(which over-reads past the L4 data by the size of the extensions).
Result: valid TCP/UDP over IPv6+ext packets get tagged
RX_L4_CKSUM_BAD, contradicting the commit message.
Suggested fix: keep IPV6_EXT in the L3 sanity-check block, but skip
L4 verification when l3 == RTE_PTYPE_L3_IPV6_EXT until a helper that
handles extension headers exists.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH dpdk v5] net/tap: use offsets provided by rte_net_get_ptype
2026-04-22 13:36 [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype Robin Jarry
` (3 preceding siblings ...)
2026-05-12 15:16 ` [PATCH dpdk v4] " Robin Jarry
@ 2026-05-18 8:27 ` Robin Jarry
2026-05-18 17:52 ` Stephen Hemminger
2026-05-18 20:54 ` [PATCH dpdk v6] " Robin Jarry
5 siblings, 1 reply; 12+ messages in thread
From: Robin Jarry @ 2026-05-18 8:27 UTC (permalink / raw)
To: dev, Stephen Hemminger
Instead of guessing what are the proper header lengths, pass
a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
proper header lengths/offsets in tap_verify_csum.
This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
Notes:
v5:
* Fixed invalid L4 checksum verification when there are IPv6 extension headers.
v4:
* Fixed stale comments.
v3:
* Fixed double count for ipv6 extension headers.
v2:
* Restored packet length checks in IPv4/IPv6 before checking L4
checksums.
* Moved struct rte_net_hdr_lens variable next to rte_net_get_ptype call
and initialize it to 0 instead of calling memset().
drivers/net/tap/rte_eth_tap.c | 41 +++++++++++++----------------------
1 file changed, 15 insertions(+), 26 deletions(-)
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index a5d460a0b3cb..3b8e19afb7af 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -327,52 +327,41 @@ tun_alloc(struct pmd_internals *pmd, int is_keepalive, int persistent)
}
static void
-tap_verify_csum(struct rte_mbuf *mbuf)
+tap_verify_csum(struct rte_mbuf *mbuf, const struct rte_net_hdr_lens *hlen)
{
- uint32_t l2 = mbuf->packet_type & RTE_PTYPE_L2_MASK;
uint32_t l3 = mbuf->packet_type & RTE_PTYPE_L3_MASK;
uint32_t l4 = mbuf->packet_type & RTE_PTYPE_L4_MASK;
- unsigned int l2_len = sizeof(struct rte_ether_hdr);
- unsigned int l3_len;
+ uint32_t l4_off = hlen->l2_len + hlen->l3_len;
uint16_t cksum = 0;
void *l3_hdr;
void *l4_hdr;
- struct rte_udp_hdr *udp_hdr;
- if (l2 == RTE_PTYPE_L2_ETHER_VLAN)
- l2_len += 4;
- else if (l2 == RTE_PTYPE_L2_ETHER_QINQ)
- l2_len += 8;
/* Don't verify checksum for packets with discontinuous L2 header */
- if (unlikely(l2_len + sizeof(struct rte_ipv4_hdr) >
- rte_pktmbuf_data_len(mbuf)))
+ if (unlikely(l4_off > rte_pktmbuf_data_len(mbuf)))
return;
- l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len);
+
+ l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, hlen->l2_len);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
struct rte_ipv4_hdr *iph = l3_hdr;
- l3_len = rte_ipv4_hdr_len(iph);
- if (unlikely(l2_len + l3_len > rte_pktmbuf_data_len(mbuf)))
- return;
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + rte_be_to_cpu_16(iph->total_length) >
+ if (hlen->l2_len + rte_be_to_cpu_16(iph->total_length) >
rte_pktmbuf_data_len(mbuf))
return;
- cksum = ~rte_raw_cksum(iph, l3_len);
+ cksum = ~rte_raw_cksum(iph, hlen->l3_len);
mbuf->ol_flags |= cksum ?
RTE_MBUF_F_RX_IP_CKSUM_BAD :
RTE_MBUF_F_RX_IP_CKSUM_GOOD;
} else if (l3 == RTE_PTYPE_L3_IPV6) {
struct rte_ipv6_hdr *iph = l3_hdr;
- l3_len = sizeof(struct rte_ipv6_hdr);
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + l3_len + rte_be_to_cpu_16(iph->payload_len) >
+ if (hlen->l2_len + sizeof(*iph) + rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
} else {
@@ -386,20 +375,19 @@ tap_verify_csum(struct rte_mbuf *mbuf)
if (l4 == RTE_PTYPE_L4_UDP || l4 == RTE_PTYPE_L4_TCP) {
int cksum_ok;
- const unsigned int l4_min_len = (l4 == RTE_PTYPE_L4_UDP)
- ? sizeof(struct rte_udp_hdr) : sizeof(struct rte_tcp_hdr);
/* Don't verify checksum if L4 header is truncated */
- if (l2_len + l3_len + l4_min_len > rte_pktmbuf_data_len(mbuf))
+ if (l4_off + hlen->l4_len > rte_pktmbuf_data_len(mbuf))
return;
- l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len + l3_len);
/* Don't verify checksum for multi-segment packets. */
if (mbuf->nb_segs > 1)
return;
+
+ l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l4_off);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
if (l4 == RTE_PTYPE_L4_UDP) {
- udp_hdr = (struct rte_udp_hdr *)l4_hdr;
+ struct rte_udp_hdr *udp_hdr = l4_hdr;
if (udp_hdr->dgram_cksum == 0) {
/*
* For IPv4, a zero UDP checksum
@@ -561,10 +549,11 @@ pmd_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
continue;
}
- mbuf->packet_type = rte_net_get_ptype(mbuf, NULL,
+ struct rte_net_hdr_lens hlen = {0};
+ mbuf->packet_type = rte_net_get_ptype(mbuf, &hlen,
RTE_PTYPE_ALL_MASK);
if (rxq->rxmode->offloads & RTE_ETH_RX_OFFLOAD_CHECKSUM)
- tap_verify_csum(mbuf);
+ tap_verify_csum(mbuf, &hlen);
/* account for the receive frame */
bufs[num_rx++] = mbuf;
--
2.54.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH dpdk v5] net/tap: use offsets provided by rte_net_get_ptype
2026-05-18 8:27 ` [PATCH dpdk v5] " Robin Jarry
@ 2026-05-18 17:52 ` Stephen Hemminger
0 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2026-05-18 17:52 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev
On Mon, 18 May 2026 10:27:00 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
>
> This allows supporting stacked VLAN/QinQ tags and IPv6 extensions.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
Still needs some polishing around Ipv6 ext headers
per AI review...
[PATCH dpdk v5] net/tap: use offsets provided by rte_net_get_ptype
The refactor is mechanically correct. rte_net_get_ptype() with a
non-NULL rte_net_hdr_lens fills l2_len walking the full VLAN/QinQ
stack (up to RTE_NET_VLAN_MAX_DEPTH), and for IPv4 sets
l3_len = rte_ipv4_hdr_len(ip4h). Replacing the local l2_len/l3_len
arithmetic with hlen->* preserves the existing bounds checks and
genuinely fixes the stacked-VLAN case (the old code only handled
single VLAN +4 or QinQ +8). The unnecessary cast of l4_hdr to
struct rte_udp_hdr * was correctly dropped. The {0} initializer on
the rte_net_hdr_lens stack variable is appropriate since
rte_net_get_ptype only writes the fields it parses.
Info:
- The commit message says "This allows supporting stacked
VLAN/QinQ tags and IPv6 extensions." The first half is true.
The IPv6-extensions half is not: the else branch that catches
RTE_PTYPE_L3_IPV6_EXT and returns is unchanged, and the
accompanying comment "IPv6 extensions are not supported." in
that branch is also unchanged. For an IPv6 packet with
extension headers, rte_net_get_ptype() will set
RTE_PTYPE_L3_IPV6_EXT (not RTE_PTYPE_L3_IPV6), so the
else-if branch is skipped and tap_verify_csum() returns
without verifying L3 or L4 checksums. The hlen now carries
the correct l3_len including extensions, so this is a
prerequisite for IPv6-ext support but not the feature itself.
Suggest rewording to "This is a prerequisite for IPv6
extension header support" or dropping that half of the
sentence.
- drivers/net/tap/rte_eth_tap.c:549: minor scoping observation
only (no action needed) -- the hlen struct is filled
unconditionally on every received packet whereas previously
rte_net_get_ptype() was called with NULL when offload check
was disabled. The cost is a handful of stack writes per
packet; consistent with how other PMDs handle this.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH dpdk v6] net/tap: use offsets provided by rte_net_get_ptype
2026-04-22 13:36 [PATCH dpdk] net/tap: use offsets provided by rte_net_get_ptype Robin Jarry
` (4 preceding siblings ...)
2026-05-18 8:27 ` [PATCH dpdk v5] " Robin Jarry
@ 2026-05-18 20:54 ` Robin Jarry
2026-05-18 22:26 ` Stephen Hemminger
5 siblings, 1 reply; 12+ messages in thread
From: Robin Jarry @ 2026-05-18 20:54 UTC (permalink / raw)
To: dev, Stephen Hemminger
Instead of guessing what are the proper header lengths, pass
a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
proper header lengths/offsets in tap_verify_csum.
This allows supporting stacked VLAN/QinQ tags.
Signed-off-by: Robin Jarry <rjarry@redhat.com>
---
Notes:
v6: Removed invalid claim about IPv6 ext support in commit message.
v5: Fixed invalid L4 checksum verification when there are IPv6 extension headers.
v4: Fixed stale comments.
v3: Fixed double count for ipv6 extension headers.
v2:
* Restored packet length checks in IPv4/IPv6 before checking L4
checksums.
* Moved struct rte_net_hdr_lens variable next to rte_net_get_ptype call
and initialize it to 0 instead of calling memset().
drivers/net/tap/rte_eth_tap.c | 41 +++++++++++++----------------------
1 file changed, 15 insertions(+), 26 deletions(-)
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index a5d460a0b3cb..3b8e19afb7af 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -327,52 +327,41 @@ tun_alloc(struct pmd_internals *pmd, int is_keepalive, int persistent)
}
static void
-tap_verify_csum(struct rte_mbuf *mbuf)
+tap_verify_csum(struct rte_mbuf *mbuf, const struct rte_net_hdr_lens *hlen)
{
- uint32_t l2 = mbuf->packet_type & RTE_PTYPE_L2_MASK;
uint32_t l3 = mbuf->packet_type & RTE_PTYPE_L3_MASK;
uint32_t l4 = mbuf->packet_type & RTE_PTYPE_L4_MASK;
- unsigned int l2_len = sizeof(struct rte_ether_hdr);
- unsigned int l3_len;
+ uint32_t l4_off = hlen->l2_len + hlen->l3_len;
uint16_t cksum = 0;
void *l3_hdr;
void *l4_hdr;
- struct rte_udp_hdr *udp_hdr;
- if (l2 == RTE_PTYPE_L2_ETHER_VLAN)
- l2_len += 4;
- else if (l2 == RTE_PTYPE_L2_ETHER_QINQ)
- l2_len += 8;
/* Don't verify checksum for packets with discontinuous L2 header */
- if (unlikely(l2_len + sizeof(struct rte_ipv4_hdr) >
- rte_pktmbuf_data_len(mbuf)))
+ if (unlikely(l4_off > rte_pktmbuf_data_len(mbuf)))
return;
- l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len);
+
+ l3_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, hlen->l2_len);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
struct rte_ipv4_hdr *iph = l3_hdr;
- l3_len = rte_ipv4_hdr_len(iph);
- if (unlikely(l2_len + l3_len > rte_pktmbuf_data_len(mbuf)))
- return;
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + rte_be_to_cpu_16(iph->total_length) >
+ if (hlen->l2_len + rte_be_to_cpu_16(iph->total_length) >
rte_pktmbuf_data_len(mbuf))
return;
- cksum = ~rte_raw_cksum(iph, l3_len);
+ cksum = ~rte_raw_cksum(iph, hlen->l3_len);
mbuf->ol_flags |= cksum ?
RTE_MBUF_F_RX_IP_CKSUM_BAD :
RTE_MBUF_F_RX_IP_CKSUM_GOOD;
} else if (l3 == RTE_PTYPE_L3_IPV6) {
struct rte_ipv6_hdr *iph = l3_hdr;
- l3_len = sizeof(struct rte_ipv6_hdr);
/* check that the total length reported by header is not
* greater than the total received size
*/
- if (l2_len + l3_len + rte_be_to_cpu_16(iph->payload_len) >
+ if (hlen->l2_len + sizeof(*iph) + rte_be_to_cpu_16(iph->payload_len) >
rte_pktmbuf_data_len(mbuf))
return;
} else {
@@ -386,20 +375,19 @@ tap_verify_csum(struct rte_mbuf *mbuf)
if (l4 == RTE_PTYPE_L4_UDP || l4 == RTE_PTYPE_L4_TCP) {
int cksum_ok;
- const unsigned int l4_min_len = (l4 == RTE_PTYPE_L4_UDP)
- ? sizeof(struct rte_udp_hdr) : sizeof(struct rte_tcp_hdr);
/* Don't verify checksum if L4 header is truncated */
- if (l2_len + l3_len + l4_min_len > rte_pktmbuf_data_len(mbuf))
+ if (l4_off + hlen->l4_len > rte_pktmbuf_data_len(mbuf))
return;
- l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l2_len + l3_len);
/* Don't verify checksum for multi-segment packets. */
if (mbuf->nb_segs > 1)
return;
+
+ l4_hdr = rte_pktmbuf_mtod_offset(mbuf, void *, l4_off);
if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) {
if (l4 == RTE_PTYPE_L4_UDP) {
- udp_hdr = (struct rte_udp_hdr *)l4_hdr;
+ struct rte_udp_hdr *udp_hdr = l4_hdr;
if (udp_hdr->dgram_cksum == 0) {
/*
* For IPv4, a zero UDP checksum
@@ -561,10 +549,11 @@ pmd_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
continue;
}
- mbuf->packet_type = rte_net_get_ptype(mbuf, NULL,
+ struct rte_net_hdr_lens hlen = {0};
+ mbuf->packet_type = rte_net_get_ptype(mbuf, &hlen,
RTE_PTYPE_ALL_MASK);
if (rxq->rxmode->offloads & RTE_ETH_RX_OFFLOAD_CHECKSUM)
- tap_verify_csum(mbuf);
+ tap_verify_csum(mbuf, &hlen);
/* account for the receive frame */
bufs[num_rx++] = mbuf;
--
2.54.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH dpdk v6] net/tap: use offsets provided by rte_net_get_ptype
2026-05-18 20:54 ` [PATCH dpdk v6] " Robin Jarry
@ 2026-05-18 22:26 ` Stephen Hemminger
0 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2026-05-18 22:26 UTC (permalink / raw)
To: Robin Jarry; +Cc: dev
On Mon, 18 May 2026 22:54:49 +0200
Robin Jarry <rjarry@redhat.com> wrote:
> Instead of guessing what are the proper header lengths, pass
> a rte_net_hdr_lens struct to rte_net_get_ptype and use it to get the
> proper header lengths/offsets in tap_verify_csum.
>
> This allows supporting stacked VLAN/QinQ tags.
>
> Signed-off-by: Robin Jarry <rjarry@redhat.com>
> ---
Applied to next-net
^ permalink raw reply [flat|nested] 12+ messages in thread