* [PATCH] net/cksum: compute raw cksum for several segments
@ 2025-07-31 9:55 苏赛
2025-07-31 10:51 ` Marat Khalili
` (4 more replies)
0 siblings, 5 replies; 24+ messages in thread
From: 苏赛 @ 2025-07-31 9:55 UTC (permalink / raw)
To: jasvinder.singh@intel.com; +Cc: dev@dpdk.org
[-- Attachment #1: Type: text/plain, Size: 2824 bytes --]
The rte_raw_cksum_mbuf function is used to compute
the raw checksum of a packet.
If the packet payload stored in multi mbuf, the function
will goto the hard case. In hard case,
the variable 'tmp' is a type of uint32_t,
so rte_bswap16 will drop high 16 bit.
Meanwhile, the variable 'sum' is a type of uint32_t,
so 'sum += tmp' will drop the carry when overflow.
Both drop will make cksum incorrect.
This commit fixes the above bug.
Signed-off-by: Su Sai <susai.ss@bytedance.com>
---
.mailmap | 1 +
lib/net/rte_cksum.h | 26 +++++++++++++++++++++++---
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/.mailmap b/.mailmap
index 34a99f93a1..838b544a97 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> <
zoltan.kiss@linaro.org>
Zorik Machulsky <zorik@amazon.com>
Zyta Szpak <zyta@marvell.com> <zr@semihalf.com>
Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com>
+Su Sai <susai.ss@bytedance.com>
diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h
index a8e8927952..aa584d5f8d 100644
--- a/lib/net/rte_cksum.h
+++ b/lib/net/rte_cksum.h
@@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum)
return (uint16_t)sum;
}
+/**
+ * @internal Reduce a sum to the non-complemented checksum.
+ * Helper routine for the rte_raw_cksum_mbuf().
+ *
+ * @param sum
+ * Value of the sum.
+ * @return
+ * The non-complemented checksum.
+ */
+static inline uint16_t
+__rte_raw_cksum_reduce_u64(uint64_t sum)
+{
+ uint32_t tmp;
+
+ tmp = __rte_raw_cksum_reduce((uint32_t)sum);
+ tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32));
+ return __rte_raw_cksum_reduce(tmp);
+}
+
/**
* Process the non-complemented checksum of a buffer.
*
@@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m,
uint32_t off, uint32_t len,
{
const struct rte_mbuf *seg;
const char *buf;
- uint32_t sum, tmp;
+ uint32_t tmp;
uint32_t seglen, done;
+ uint64_t sum;
/* easy case: all data in the first segment */
if (off + len <= rte_pktmbuf_data_len(m)) {
@@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m,
uint32_t off, uint32_t len,
for (;;) {
tmp = __rte_raw_cksum(buf, seglen, 0);
if (done & 1)
- tmp = rte_bswap16((uint16_t)tmp);
+ tmp = rte_bswap32(tmp);
sum += tmp;
done += seglen;
if (done == len)
@@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m,
uint32_t off, uint32_t len,
seglen = len - done;
}
- *cksum = __rte_raw_cksum_reduce(sum);
+ *cksum = __rte_raw_cksum_reduce_u64(sum);
return 0;
}
--
2.39.2 (Apple Git-143)
[-- Attachment #2: Type: text/html, Size: 14519 bytes --]
^ permalink raw reply related [flat|nested] 24+ messages in thread* RE: [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 9:55 [PATCH] net/cksum: compute raw cksum for several segments 苏赛 @ 2025-07-31 10:51 ` Marat Khalili 2025-07-31 11:03 ` Marat Khalili 2025-07-31 11:46 ` Marat Khalili ` (3 subsequent siblings) 4 siblings, 1 reply; 24+ messages in thread From: Marat Khalili @ 2025-07-31 10:51 UTC (permalink / raw) To: 苏赛, jasvinder.singh@intel.com; +Cc: dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 3313 bytes --] Hi! Thanks for submitting this. Some inline comments follow. > -----Original Message----- > From: 苏赛 <susai.ss@bytedance.com> > Sent: Thursday 31 July 2025 10:55 > To: jasvinder.singh@intel.com > Cc: dev@dpdk.org > Subject: [PATCH] net/cksum: compute raw cksum for several segments > > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. > > Signed-off-by: Su Sai <susai.ss@bytedance.com> > diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h > index a8e8927952..aa584d5f8d 100644 > --- a/lib/net/rte_cksum.h > +++ b/lib/net/rte_cksum.h > @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) > return (uint16_t)sum; > } > > +/** > + * @internal Reduce a sum to the non-complemented checksum. > + * Helper routine for the rte_raw_cksum_mbuf(). > + * > + * @param sum > + * Value of the sum. > + * @return > + * The non-complemented checksum. > + */ > +static inline uint16_t > +__rte_raw_cksum_reduce_u64(uint64_t sum) > +{ > + uint32_t tmp; > + > + tmp = __rte_raw_cksum_reduce((uint32_t)sum); > + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); What if this addition overflows? To my taste I would not try to call `__rte_raw_cksum_reduce ` and instead reduce uint64_t directly to uint16_t, but it’s up to you. > + return __rte_raw_cksum_reduce(tmp); > +} > + > /** > * Process the non-complemented checksum of a buffer. > * > @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > { > const struct rte_mbuf *seg; > const char *buf; > - uint32_t sum, tmp; > + uint32_t tmp; > uint32_t seglen, done; > + uint64_t sum; > > /* easy case: all data in the first segment */ > if (off + len <= rte_pktmbuf_data_len(m)) { > @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > for (;;) { > tmp = __rte_raw_cksum(buf, seglen, 0); > if (done & 1) > - tmp = rte_bswap16((uint16_t)tmp); > + tmp = rte_bswap32(tmp); This part probably deserves a comment, since we only need to swap odd and even bytes, but we instead reverse all of them abusing the fact that order of 2-byte pairs does not matter for the algorithm. > sum += tmp; > done += seglen; > if (done == len) > @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > seglen = len - done; > } > > - *cksum = __rte_raw_cksum_reduce(sum); > + *cksum = __rte_raw_cksum_reduce_u64(sum); > return 0; > } Changes to this function look correct to my eye, but given how many pitfalls we have already found I think we need tests. [-- Attachment #2: Type: text/html, Size: 10523 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 10:51 ` Marat Khalili @ 2025-07-31 11:03 ` Marat Khalili 2025-07-31 11:31 ` [External] " Su Sai 0 siblings, 1 reply; 24+ messages in thread From: Marat Khalili @ 2025-07-31 11:03 UTC (permalink / raw) To: 苏赛, jasvinder.singh@intel.com; +Cc: dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 773 bytes --] Sorry, sent the previous email too quickly. > -----Original Message----- > From: Marat Khalili > Sent: Thursday 31 July 2025 11:52 > To: '苏赛' <susai.ss@bytedance.com>; jasvinder.singh@intel.com > Cc: dev@dpdk.org > Subject: RE: [PATCH] net/cksum: compute raw cksum for several segments > > +static inline uint16_t > > +__rte_raw_cksum_reduce_u64(uint64_t sum) > > +{ > > + uint32_t tmp; > > + > > + tmp = __rte_raw_cksum_reduce((uint32_t)sum); > > + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); > > What if this addition overflows? Realized it cannot actually overflow, my bad (maybe still needs a comment). Now this function looks good to me as well. > > + return __rte_raw_cksum_reduce(tmp); > > +} [-- Attachment #2: Type: text/html, Size: 3824 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [External] RE: [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 11:03 ` Marat Khalili @ 2025-07-31 11:31 ` Su Sai 2025-07-31 11:43 ` Marat Khalili 0 siblings, 1 reply; 24+ messages in thread From: Su Sai @ 2025-07-31 11:31 UTC (permalink / raw) To: Marat Khalili, jasvinder.singh@intel.com; +Cc: dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 3063 bytes --] Hi Marat Khalili, Thanks for your comments. We found small TCP checksum errors pkts online, then were traced to issues within the rte_raw_cksum_mbuf function. This error can be reproduced as follows: 1. In the client ECS with an MTU of 1500, initiate traffic using the command "iperf3 -c {dst ip} -b 1m -M 125 -t 8000". 2. In the destination ECS, the InCsumErrors statistic can be viewed using the command "netstat -st | grep -i csum". The erroneous packets can also be confirmed via the tcpdump command. The following is a detailed description of a captured erroneous packet. The hex stream of the packet is as follows: 00163e0b6bd2eeffffffffff0800450000a50d7a40004006b94bc0a8f91dc0a8f91ed5d2145146f9d990e10d6a2d8010020040a200000101080a95ac86ba091145d3ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff This is a packet in the format of eth + ipv4 + tcp + payload. The process leading to the error is as follows: 1. Due to the small MSS, TSO fragmentation was triggered, generating 3 mbufs. 2. The data_len of the first mbuf is 66 bytes, containing the Ethernet header, IPv4 header, and TCP header. 3. The data_len of the second mbuf is 61 bytes. 4. The data_len of the third mbuf is 52 bytes. 5. When calculating the checksum of the TCP header for such an mbuf chain using the rte_raw_cksum_mbuf function, the tmp value obtained during the calculation of the third mbuf is 0x19FFE6. 6. After applying rte_bswap16, tmp becomes 0xE6FF, with 0x19 discarded. This results in an incorrect final checksum. Meanwhile, we have also found that when enable jumbo frame, the payload becomes longer. In the process of rte_raw_cksum_mbuf handling an mbuf chain,due to the relatively large value of tmp, overflow will occur in the accumulated sum. From: "Marat Khalili"<marat.khalili@huawei.com> Date: Thu, Jul 31, 2025, 19:04 Subject: [External] RE: [PATCH] net/cksum: compute raw cksum for several segments To: "苏赛"<susai.ss@bytedance.com>, "jasvinder.singh@intel.com"< jasvinder.singh@intel.com> Cc: "dev@dpdk.org"<dev@dpdk.org> Sorry, sent the previous email too quickly. > -----Original Message----- > From: Marat Khalili > Sent: Thursday 31 July 2025 11:52 > To: '苏赛' <susai.ss@bytedance.com>; jasvinder.singh@intel.com > Cc: dev@dpdk.org > Subject: RE: [PATCH] net/cksum: compute raw cksum for several segments > > +static inline uint16_t > > +__rte_raw_cksum_reduce_u64(uint64_t sum) > > +{ > > + uint32_t tmp; > > + > > + tmp = __rte_raw_cksum_reduce((uint32_t)sum); > > + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); > > What if this addition overflows? Realized it cannot actually overflow, my bad (maybe still needs a comment). Now this function looks good to me as well. > > + return __rte_raw_cksum_reduce(tmp); > > +} [-- Attachment #2: Type: text/html, Size: 10044 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [External] RE: [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 11:31 ` [External] " Su Sai @ 2025-07-31 11:43 ` Marat Khalili 0 siblings, 0 replies; 24+ messages in thread From: Marat Khalili @ 2025-07-31 11:43 UTC (permalink / raw) To: Su Sai, jasvinder.singh@intel.com; +Cc: dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 2622 bytes --] > -----Original Message----- > From: Su Sai <susai.ss@bytedance.com> > Sent: Thursday 31 July 2025 12:32 > To: Marat Khalili <marat.khalili@huawei.com>; jasvinder.singh@intel.com > Cc: dev@dpdk.org > Subject: Re: [External] RE: [PATCH] net/cksum: compute raw cksum for several segments > > Hi Marat Khalili, > > Thanks for your comments. > > We found small TCP checksum errors pkts online, then were traced to issues within the rte_raw_cksum_mbuf function. > > This error can be reproduced as follows: > 1. In the client ECS with an MTU of 1500, initiate traffic using the command "iperf3 -c {dst ip} -b 1m -M 125 -t 8000". > 2. In the destination ECS, the InCsumErrors statistic can be viewed using the command "netstat -st | grep -i csum". The erroneous packets can also be confirmed via the tcpdump command. > > The following is a detailed description of a captured erroneous packet. The hex stream of the packet is as follows: > 00163e0b6bd2eeffffffffff0800450000a50d7a40004006b94bc0a8f91dc0a8f91ed5d2145146f9d990e10d6a2d8010020040a200000101080a95ac86ba091145d3ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff > > This is a packet in the format of eth + ipv4 + tcp + payload. > > The process leading to the error is as follows: > 1. Due to the small MSS, TSO fragmentation was triggered, generating 3 mbufs. > 2. The data_len of the first mbuf is 66 bytes, containing the Ethernet header, IPv4 header, and TCP header. > 3. The data_len of the second mbuf is 61 bytes. > 4. The data_len of the third mbuf is 52 bytes. > 5. When calculating the checksum of the TCP header for such an mbuf chain using the rte_raw_cksum_mbuf function, the tmp value obtained during the calculation of the third mbuf is 0x19FFE6. > 6. After applying rte_bswap16, tmp becomes 0xE6FF, with 0x19 discarded. This results in an incorrect final checksum. > > Meanwhile, we have also found that when enable jumbo frame, the payload becomes longer. In the process of rte_raw_cksum_mbuf handling an mbuf chain,due to the relatively large value of tmp, overflow will occur in the accumulated sum. Thanks for the detailed description. I think both the bug and the fix are valid. Adding automated tests reproducing scenarios you described to app/test would help a lot, but I don’t know if you have cycles for this (I heard that AI is moderately good in generating test cases these days). I will ack your patch and let’s see what others say. [-- Attachment #2: Type: text/html, Size: 5594 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 9:55 [PATCH] net/cksum: compute raw cksum for several segments 苏赛 2025-07-31 10:51 ` Marat Khalili @ 2025-07-31 11:46 ` Marat Khalili 2025-07-31 12:22 ` zhoumin ` (2 subsequent siblings) 4 siblings, 0 replies; 24+ messages in thread From: Marat Khalili @ 2025-07-31 11:46 UTC (permalink / raw) To: 苏赛, jasvinder.singh@intel.com; +Cc: dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 3586 bytes --] > -----Original Message----- > From: 苏赛 <susai.ss@bytedance.com> > Sent: Thursday 31 July 2025 10:55 > To: jasvinder.singh@intel.com > Cc: dev@dpdk.org > Subject: [PATCH] net/cksum: compute raw cksum for several segments > > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. > > Signed-off-by: Su Sai <susai.ss@bytedance.com> > --- > .mailmap | 1 + > lib/net/rte_cksum.h | 26 +++++++++++++++++++++++--- > 2 files changed, 24 insertions(+), 3 deletions(-) > > diff --git a/.mailmap b/.mailmap > index 34a99f93a1..838b544a97 100644 > --- a/.mailmap > +++ b/.mailmap > @@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> <zoltan.kiss@linaro.org> > Zorik Machulsky <zorik@amazon.com> > Zyta Szpak <zyta@marvell.com> <zr@semihalf.com> > Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com> > +Su Sai <susai.ss@bytedance.com> > diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h > index a8e8927952..aa584d5f8d 100644 > --- a/lib/net/rte_cksum.h > +++ b/lib/net/rte_cksum.h > @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) > return (uint16_t)sum; > } > > +/** > + * @internal Reduce a sum to the non-complemented checksum. > + * Helper routine for the rte_raw_cksum_mbuf(). > + * > + * @param sum > + * Value of the sum. > + * @return > + * The non-complemented checksum. > + */ > +static inline uint16_t > +__rte_raw_cksum_reduce_u64(uint64_t sum) > +{ > + uint32_t tmp; > + > + tmp = __rte_raw_cksum_reduce((uint32_t)sum); > + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); > + return __rte_raw_cksum_reduce(tmp); > +} > + > /** > * Process the non-complemented checksum of a buffer. > * > @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > { > const struct rte_mbuf *seg; > const char *buf; > - uint32_t sum, tmp; > + uint32_t tmp; > uint32_t seglen, done; > + uint64_t sum; > > /* easy case: all data in the first segment */ > if (off + len <= rte_pktmbuf_data_len(m)) { > @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > for (;;) { > tmp = __rte_raw_cksum(buf, seglen, 0); > if (done & 1) > - tmp = rte_bswap16((uint16_t)tmp); > + tmp = rte_bswap32(tmp); > sum += tmp; > done += seglen; > if (done == len) > @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > seglen = len - done; > } > > - *cksum = __rte_raw_cksum_reduce(sum); > + *cksum = __rte_raw_cksum_reduce_u64(sum); > return 0; > } > > -- > 2.39.2 (Apple Git-143) Acked-by: Marat Khalili <marat.khalili@huawei.com<mailto:marat.khalili@huawei.com>> [-- Attachment #2: Type: text/html, Size: 11376 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 9:55 [PATCH] net/cksum: compute raw cksum for several segments 苏赛 2025-07-31 10:51 ` Marat Khalili 2025-07-31 11:46 ` Marat Khalili @ 2025-07-31 12:22 ` zhoumin 2025-08-01 7:26 ` Su Sai 2025-08-01 15:28 ` [v2] " Su Sai 4 siblings, 0 replies; 24+ messages in thread From: zhoumin @ 2025-07-31 12:22 UTC (permalink / raw) To: dev Recheck-request: loongarch-unit-testing ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] net/cksum: compute raw cksum for several segments 2025-07-31 9:55 [PATCH] net/cksum: compute raw cksum for several segments 苏赛 ` (2 preceding siblings ...) 2025-07-31 12:22 ` zhoumin @ 2025-08-01 7:26 ` Su Sai 2025-08-01 15:28 ` [v2] " Su Sai 4 siblings, 0 replies; 24+ messages in thread From: Su Sai @ 2025-08-01 7:26 UTC (permalink / raw) To: dev; +Cc: Su Sai The rte_raw_cksum_mbuf function is used to compute the raw checksum of a packet. If the packet payload stored in multi mbuf, the function will goto the hard case. In hard case, the variable 'tmp' is a type of uint32_t, so rte_bswap16 will drop high 16 bit. Meanwhile, the variable 'sum' is a type of uint32_t, so 'sum += tmp' will drop the carry when overflow. Both drop will make cksum incorrect. This commit fixes the above bug. Signed-off-by: Su Sai <susai.ss@bytedance.com> --- .mailmap | 1 + lib/net/rte_cksum.h | 26 +++++++++++++++++++++++--- 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/.mailmap b/.mailmap index 34a99f93a1..838b544a97 100644 --- a/.mailmap +++ b/.mailmap @@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> <zoltan.kiss@linaro.org> Zorik Machulsky <zorik@amazon.com> Zyta Szpak <zyta@marvell.com> <zr@semihalf.com> Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com> +Su Sai <susai.ss@bytedance.com> diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h index a8e8927952..aa584d5f8d 100644 --- a/lib/net/rte_cksum.h +++ b/lib/net/rte_cksum.h @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) return (uint16_t)sum; } +/** + * @internal Reduce a sum to the non-complemented checksum. + * Helper routine for the rte_raw_cksum_mbuf(). + * + * @param sum + * Value of the sum. + * @return + * The non-complemented checksum. + */ +static inline uint16_t +__rte_raw_cksum_reduce_u64(uint64_t sum) +{ + uint32_t tmp; + + tmp = __rte_raw_cksum_reduce((uint32_t)sum); + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); + return __rte_raw_cksum_reduce(tmp); +} + /** * Process the non-complemented checksum of a buffer. * @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, { const struct rte_mbuf *seg; const char *buf; - uint32_t sum, tmp; + uint32_t tmp; uint32_t seglen, done; + uint64_t sum; /* easy case: all data in the first segment */ if (off + len <= rte_pktmbuf_data_len(m)) { @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, for (;;) { tmp = __rte_raw_cksum(buf, seglen, 0); if (done & 1) - tmp = rte_bswap16((uint16_t)tmp); + tmp = rte_bswap32(tmp); sum += tmp; done += seglen; if (done == len) @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, seglen = len - done; } - *cksum = __rte_raw_cksum_reduce(sum); + *cksum = __rte_raw_cksum_reduce_u64(sum); return 0; } -- 2.39.2 (Apple Git-143) ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [v2] net/cksum: compute raw cksum for several segments 2025-07-31 9:55 [PATCH] net/cksum: compute raw cksum for several segments 苏赛 ` (3 preceding siblings ...) 2025-08-01 7:26 ` Su Sai @ 2025-08-01 15:28 ` Su Sai 2025-08-01 16:39 ` Marat Khalili 2025-08-02 11:08 ` [v3] " Su Sai 4 siblings, 2 replies; 24+ messages in thread From: Su Sai @ 2025-08-01 15:28 UTC (permalink / raw) To: dev; +Cc: jasvinder.singh, thomas, marat.khalili, Su Sai The rte_raw_cksum_mbuf function is used to compute the raw checksum of a packet. If the packet payload stored in multi mbuf, the function will goto the hard case. In hard case, the variable 'tmp' is a type of uint32_t, so rte_bswap16 will drop high 16 bit. Meanwhile, the variable 'sum' is a type of uint32_t, so 'sum += tmp' will drop the carry when overflow. Both drop will make cksum incorrect. This commit fixes the above bug. Signed-off-by: Su Sai <susai.ss@bytedance.com> --- .mailmap | 1 + app/test/test_cksum.c | 104 ++++++++++++++++++++++++++++++++++++++++++ lib/net/rte_cksum.h | 26 +++++++++-- 3 files changed, 128 insertions(+), 3 deletions(-) diff --git a/.mailmap b/.mailmap index 34a99f93a1..838b544a97 100644 --- a/.mailmap +++ b/.mailmap @@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> <zoltan.kiss@linaro.org> Zorik Machulsky <zorik@amazon.com> Zyta Szpak <zyta@marvell.com> <zr@semihalf.com> Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com> +Su Sai <susai.ss@bytedance.com> diff --git a/app/test/test_cksum.c b/app/test/test_cksum.c index f2ab5af5a7..d2435a9962 100644 --- a/app/test/test_cksum.c +++ b/app/test/test_cksum.c @@ -85,6 +85,39 @@ static const char test_cksum_ipv4_opts_udp[] = { 0x00, 0x35, 0x00, 0x09, 0x89, 0x6f, 0x78, }; +/* generated in scapy with Ether()/IP()/TCP(options=[NOP,NOP,Timestamps])/os.urandom(113)) */ +static const char test_cksum_ipv4_tcp_multi_segs[] = { + 0x00, 0x16, 0x3e, 0x0b, 0x6b, 0xd2, 0xee, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x45, 0x00, + 0x00, 0xa5, 0x46, 0x10, 0x40, 0x00, 0x40, 0x06, + 0x80, 0xb5, 0xc0, 0xa8, 0xf9, 0x1d, 0xc0, 0xa8, + 0xf9, 0x1e, 0xdc, 0xa2, 0x14, 0x51, 0xbb, 0x8f, + 0xa0, 0x00, 0xe4, 0x7c, 0xe4, 0xb8, 0x80, 0x10, + 0x02, 0x00, 0x4b, 0xc1, 0x00, 0x00, 0x01, 0x01, + 0x08, 0x0a, 0x90, 0x60, 0xf4, 0xff, 0x03, 0xc5, + 0xb4, 0x19, 0x77, 0x34, 0xd4, 0xdc, 0x84, 0x86, + 0xff, 0x44, 0x09, 0x63, 0x36, 0x2e, 0x26, 0x9b, + 0x90, 0x70, 0xf2, 0xed, 0xc8, 0x5b, 0x87, 0xaa, + 0xb4, 0x67, 0x6b, 0x32, 0x3d, 0xc4, 0xbf, 0x15, + 0xa9, 0x16, 0x6c, 0x2a, 0x9d, 0xb2, 0xb7, 0x6b, + 0x58, 0x44, 0x58, 0x12, 0x4b, 0x8f, 0xe5, 0x12, + 0x11, 0x90, 0x94, 0x68, 0x37, 0xad, 0x0a, 0x9b, + 0xd6, 0x79, 0xf2, 0xb7, 0x31, 0xcf, 0x44, 0x22, + 0xc8, 0x99, 0x3f, 0xe5, 0xe7, 0xac, 0xc7, 0x0b, + 0x86, 0xdf, 0xda, 0xed, 0x0a, 0x0f, 0x86, 0xd7, + 0x48, 0xe2, 0xf1, 0xc2, 0x43, 0xed, 0x47, 0x3a, + 0xea, 0x25, 0x2d, 0xd6, 0x60, 0x38, 0x30, 0x07, + 0x28, 0xdd, 0x1f, 0x0c, 0xdd, 0x7b, 0x7c, 0xd9, + 0x35, 0x9d, 0x14, 0xaa, 0xc6, 0x35, 0xd1, 0x03, + 0x38, 0xb1, 0xf5, +}; + +static const uint8_t test_cksum_ipv4_tcp_multi_segs_len[] = { + 66, /* the first seg contains all headers, including L2 to L4 */ + 61, /* the second seg lenght is odd, test byte order independent */ + 52, /* three segs are sufficient to test the most complex scenarios */ +}; + /* test l3/l4 checksum api */ static int test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) @@ -223,6 +256,71 @@ test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) return -1; } +/* test l4 checksum api for a packet with multiple mbufs */ +static int +test_l4_cksum_multi_mbufs(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len, + const uint8_t *segs, size_t segs_len) +{ + struct rte_mbuf *m[NB_MBUF] = {0}; + struct rte_mbuf *m_hdr = NULL; + struct rte_net_hdr_lens hdr_lens; + size_t i, off = 0; + uint32_t packet_type; + void *l3_hdr; + uint32_t l3; + char *data; + + for (i = 0; i < segs_len; i++) { + m[i] = rte_pktmbuf_alloc(pktmbuf_pool); + if (m[i] == NULL) + GOTO_FAIL("Cannot allocate mbuf"); + + data = rte_pktmbuf_append(m[i], segs[i]); + if (data == NULL) + GOTO_FAIL("Cannot append data"); + + rte_memcpy(data, pktdata + off, segs[i]); + off += segs[i]; + + if (m_hdr) { + if (rte_pktmbuf_chain(m_hdr, m[i])) + GOTO_FAIL("Cannot chain mbuf"); + } else { + m_hdr = m[i]; + } + } + + if (off != len) + GOTO_FAIL("Invalid segs"); + + packet_type = rte_net_get_ptype(m_hdr, &hdr_lens, RTE_PTYPE_ALL_MASK); + l3 = packet_type & RTE_PTYPE_L3_MASK; + + l3_hdr = rte_pktmbuf_mtod_offset(m_hdr, void *, hdr_lens.l2_len); + off = hdr_lens.l2_len + hdr_lens.l3_len; + + if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) { + if (rte_ipv4_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) { + if (rte_ipv6_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } + + for (i = 0; i < segs_len; i++) + rte_pktmbuf_free(m[i]); + + return 0; + +fail: + for (i = 0; i < segs_len; i++) { + if (m[i]) + rte_pktmbuf_free(m[i]); + } + + return -1; +} + static int test_cksum(void) { @@ -256,6 +354,12 @@ test_cksum(void) sizeof(test_cksum_ipv4_opts_udp)) < 0) GOTO_FAIL("checksum error on ipv4_opts_udp"); + if (test_l4_cksum_multi_mbufs(pktmbuf_pool, test_cksum_ipv4_tcp_multi_segs, + sizeof(test_cksum_ipv4_tcp_multi_segs), + test_cksum_ipv4_tcp_multi_segs_len, + sizeof(test_cksum_ipv4_tcp_multi_segs_len)) < 0) + GOTO_FAIL("checksum error on multi mbufs check"); + rte_mempool_free(pktmbuf_pool); return 0; diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h index a8e8927952..aa584d5f8d 100644 --- a/lib/net/rte_cksum.h +++ b/lib/net/rte_cksum.h @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) return (uint16_t)sum; } +/** + * @internal Reduce a sum to the non-complemented checksum. + * Helper routine for the rte_raw_cksum_mbuf(). + * + * @param sum + * Value of the sum. + * @return + * The non-complemented checksum. + */ +static inline uint16_t +__rte_raw_cksum_reduce_u64(uint64_t sum) +{ + uint32_t tmp; + + tmp = __rte_raw_cksum_reduce((uint32_t)sum); + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); + return __rte_raw_cksum_reduce(tmp); +} + /** * Process the non-complemented checksum of a buffer. * @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, { const struct rte_mbuf *seg; const char *buf; - uint32_t sum, tmp; + uint32_t tmp; uint32_t seglen, done; + uint64_t sum; /* easy case: all data in the first segment */ if (off + len <= rte_pktmbuf_data_len(m)) { @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, for (;;) { tmp = __rte_raw_cksum(buf, seglen, 0); if (done & 1) - tmp = rte_bswap16((uint16_t)tmp); + tmp = rte_bswap32(tmp); sum += tmp; done += seglen; if (done == len) @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, seglen = len - done; } - *cksum = __rte_raw_cksum_reduce(sum); + *cksum = __rte_raw_cksum_reduce_u64(sum); return 0; } -- 2.39.2 (Apple Git-143) ^ permalink raw reply related [flat|nested] 24+ messages in thread
* RE: [v2] net/cksum: compute raw cksum for several segments 2025-08-01 15:28 ` [v2] " Su Sai @ 2025-08-01 16:39 ` Marat Khalili 2025-08-02 11:08 ` [v3] " Su Sai 1 sibling, 0 replies; 24+ messages in thread From: Marat Khalili @ 2025-08-01 16:39 UTC (permalink / raw) To: Su Sai, dev@dpdk.org; +Cc: jasvinder.singh@intel.com, thomas@monjalon.net [-- Attachment #1: Type: text/plain, Size: 8159 bytes --] > -----Original Message----- > From: Su Sai <susai.ss@bytedance.com> > Sent: Friday 1 August 2025 16:29 > To: dev@dpdk.org > Cc: jasvinder.singh@intel.com; thomas@monjalon.net; Marat Khalili > <marat.khalili@huawei.com>; Su Sai <susai.ss@bytedance.com> > Subject: [v2] net/cksum: compute raw cksum for several segments > > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. > > Signed-off-by: Su Sai <susai.ss@bytedance.com> > --- > .mailmap | 1 + > app/test/test_cksum.c | 104 > ++++++++++++++++++++++++++++++++++++++++++ > lib/net/rte_cksum.h | 26 +++++++++-- > 3 files changed, 128 insertions(+), 3 deletions(-) > > diff --git a/.mailmap b/.mailmap > index 34a99f93a1..838b544a97 100644 > --- a/.mailmap > +++ b/.mailmap > @@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> > <zoltan.kiss@linaro.org> > Zorik Machulsky <zorik@amazon.com> > Zyta Szpak <zyta@marvell.com> <zr@semihalf.com> > Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com> > +Su Sai <susai.ss@bytedance.com> > diff --git a/app/test/test_cksum.c b/app/test/test_cksum.c > index f2ab5af5a7..d2435a9962 100644 > --- a/app/test/test_cksum.c > +++ b/app/test/test_cksum.c > @@ -85,6 +85,39 @@ static const char test_cksum_ipv4_opts_udp[] = { > 0x00, 0x35, 0x00, 0x09, 0x89, 0x6f, 0x78, > }; > > +/* generated in scapy with > Ether()/IP()/TCP(options=[NOP,NOP,Timestamps])/os.urandom(113)) */ > +static const char test_cksum_ipv4_tcp_multi_segs[] = { > + 0x00, 0x16, 0x3e, 0x0b, 0x6b, 0xd2, 0xee, 0xff, > + 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x45, 0x00, > + 0x00, 0xa5, 0x46, 0x10, 0x40, 0x00, 0x40, 0x06, > + 0x80, 0xb5, 0xc0, 0xa8, 0xf9, 0x1d, 0xc0, 0xa8, > + 0xf9, 0x1e, 0xdc, 0xa2, 0x14, 0x51, 0xbb, 0x8f, > + 0xa0, 0x00, 0xe4, 0x7c, 0xe4, 0xb8, 0x80, 0x10, > + 0x02, 0x00, 0x4b, 0xc1, 0x00, 0x00, 0x01, 0x01, > + 0x08, 0x0a, 0x90, 0x60, 0xf4, 0xff, 0x03, 0xc5, > + 0xb4, 0x19, 0x77, 0x34, 0xd4, 0xdc, 0x84, 0x86, > + 0xff, 0x44, 0x09, 0x63, 0x36, 0x2e, 0x26, 0x9b, > + 0x90, 0x70, 0xf2, 0xed, 0xc8, 0x5b, 0x87, 0xaa, > + 0xb4, 0x67, 0x6b, 0x32, 0x3d, 0xc4, 0xbf, 0x15, > + 0xa9, 0x16, 0x6c, 0x2a, 0x9d, 0xb2, 0xb7, 0x6b, > + 0x58, 0x44, 0x58, 0x12, 0x4b, 0x8f, 0xe5, 0x12, > + 0x11, 0x90, 0x94, 0x68, 0x37, 0xad, 0x0a, 0x9b, > + 0xd6, 0x79, 0xf2, 0xb7, 0x31, 0xcf, 0x44, 0x22, > + 0xc8, 0x99, 0x3f, 0xe5, 0xe7, 0xac, 0xc7, 0x0b, > + 0x86, 0xdf, 0xda, 0xed, 0x0a, 0x0f, 0x86, 0xd7, > + 0x48, 0xe2, 0xf1, 0xc2, 0x43, 0xed, 0x47, 0x3a, > + 0xea, 0x25, 0x2d, 0xd6, 0x60, 0x38, 0x30, 0x07, > + 0x28, 0xdd, 0x1f, 0x0c, 0xdd, 0x7b, 0x7c, 0xd9, > + 0x35, 0x9d, 0x14, 0xaa, 0xc6, 0x35, 0xd1, 0x03, > + 0x38, 0xb1, 0xf5, > +}; > + > +static const uint8_t test_cksum_ipv4_tcp_multi_segs_len[] = { > + 66, /* the first seg contains all headers, including L2 to L4 */ > + 61, /* the second seg lenght is odd, test byte order independent */ > + 52, /* three segs are sufficient to test the most complex scenarios */ > +}; > + > /* test l3/l4 checksum api */ > static int > test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, > size_t len) > @@ -223,6 +256,71 @@ test_l4_cksum(struct rte_mempool *pktmbuf_pool, > const char *pktdata, size_t len) > return -1; > } > > +/* test l4 checksum api for a packet with multiple mbufs */ > +static int > +test_l4_cksum_multi_mbufs(struct rte_mempool *pktmbuf_pool, const > char *pktdata, size_t len, > + const uint8_t *segs, size_t segs_len) > +{ > + struct rte_mbuf *m[NB_MBUF] = {0}; > + struct rte_mbuf *m_hdr = NULL; > + struct rte_net_hdr_lens hdr_lens; > + size_t i, off = 0; > + uint32_t packet_type; > + void *l3_hdr; > + uint32_t l3; > + char *data; > + > + for (i = 0; i < segs_len; i++) { > + m[i] = rte_pktmbuf_alloc(pktmbuf_pool); > + if (m[i] == NULL) > + GOTO_FAIL("Cannot allocate mbuf"); > + > + data = rte_pktmbuf_append(m[i], segs[i]); > + if (data == NULL) > + GOTO_FAIL("Cannot append data"); > + > + rte_memcpy(data, pktdata + off, segs[i]); > + off += segs[i]; > + > + if (m_hdr) { > + if (rte_pktmbuf_chain(m_hdr, m[i])) > + GOTO_FAIL("Cannot chain mbuf"); > + } else { > + m_hdr = m[i]; > + } > + } > + > + if (off != len) > + GOTO_FAIL("Invalid segs"); > + > + packet_type = rte_net_get_ptype(m_hdr, &hdr_lens, > RTE_PTYPE_ALL_MASK); > + l3 = packet_type & RTE_PTYPE_L3_MASK; > + > + l3_hdr = rte_pktmbuf_mtod_offset(m_hdr, void *, hdr_lens.l2_len); > + off = hdr_lens.l2_len + hdr_lens.l3_len; > + > + if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) { > + if (rte_ipv4_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, > off) != 0) > + GOTO_FAIL("Invalid L4 checksum verification for > multiple mbufs"); > + } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) > { > + if (rte_ipv6_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, > off) != 0) > + GOTO_FAIL("Invalid L4 checksum verification for > multiple mbufs"); > + } > + > + for (i = 0; i < segs_len; i++) > + rte_pktmbuf_free(m[i]); > + > + return 0; > + > +fail: > + for (i = 0; i < segs_len; i++) { > + if (m[i]) > + rte_pktmbuf_free(m[i]); > + } > + > + return -1; > +} > + > static int > test_cksum(void) > { > @@ -256,6 +354,12 @@ test_cksum(void) > sizeof(test_cksum_ipv4_opts_udp)) < 0) > GOTO_FAIL("checksum error on ipv4_opts_udp"); > > + if (test_l4_cksum_multi_mbufs(pktmbuf_pool, > test_cksum_ipv4_tcp_multi_segs, > + sizeof(test_cksum_ipv4_tcp_multi_segs), > + test_cksum_ipv4_tcp_multi_segs_len, > + sizeof(test_cksum_ipv4_tcp_multi_segs_len)) < 0) > + GOTO_FAIL("checksum error on multi mbufs check"); > + > rte_mempool_free(pktmbuf_pool); > > return 0; > diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h > index a8e8927952..aa584d5f8d 100644 > --- a/lib/net/rte_cksum.h > +++ b/lib/net/rte_cksum.h > @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) > return (uint16_t)sum; > } > > +/** > + * @internal Reduce a sum to the non-complemented checksum. > + * Helper routine for the rte_raw_cksum_mbuf(). > + * > + * @param sum > + * Value of the sum. > + * @return > + * The non-complemented checksum. > + */ > +static inline uint16_t > +__rte_raw_cksum_reduce_u64(uint64_t sum) > +{ > + uint32_t tmp; > + > + tmp = __rte_raw_cksum_reduce((uint32_t)sum); > + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); > + return __rte_raw_cksum_reduce(tmp); > +} > + > /** > * Process the non-complemented checksum of a buffer. > * > @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, > uint32_t off, uint32_t len, > { > const struct rte_mbuf *seg; > const char *buf; > - uint32_t sum, tmp; > + uint32_t tmp; > uint32_t seglen, done; > + uint64_t sum; > > /* easy case: all data in the first segment */ > if (off + len <= rte_pktmbuf_data_len(m)) { > @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, > uint32_t off, uint32_t len, > for (;;) { > tmp = __rte_raw_cksum(buf, seglen, 0); > if (done & 1) > - tmp = rte_bswap16((uint16_t)tmp); > + tmp = rte_bswap32(tmp); > sum += tmp; > done += seglen; > if (done == len) > @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, > uint32_t off, uint32_t len, > seglen = len - done; > } > > - *cksum = __rte_raw_cksum_reduce(sum); > + *cksum = __rte_raw_cksum_reduce_u64(sum); > return 0; > } > > -- > 2.39.2 (Apple Git-143) Hi, thanks a lot for providing a test. Note that something is garbling your patches: wrapping long lines etc. For the attached ungarbled version: Reviewed-by: Marat Khalili <marat.khalili@huawei.com> [-- Attachment #2: cksum.eml --] [-- Type: application/octet-stream, Size: 7380 bytes --] From: Su Sai <susai.ss@bytedance.com> Sent: Friday 1 August 2025 16:29 To: dev@dpdk.org Cc: jasvinder.singh@intel.com; thomas@monjalon.net; Marat Khalili; Su Sai Subject: [v2] net/cksum: compute raw cksum for several segments The rte_raw_cksum_mbuf function is used to compute the raw checksum of a packet. If the packet payload stored in multi mbuf, the function will goto the hard case. In hard case, the variable 'tmp' is a type of uint32_t, so rte_bswap16 will drop high 16 bit. Meanwhile, the variable 'sum' is a type of uint32_t, so 'sum += tmp' will drop the carry when overflow. Both drop will make cksum incorrect. This commit fixes the above bug. Signed-off-by: Su Sai <susai.ss@bytedance.com> --- .mailmap | 1 + app/test/test_cksum.c | 104 ++++++++++++++++++++++++++++++++++++++++++ lib/net/rte_cksum.h | 26 +++++++++-- 3 files changed, 128 insertions(+), 3 deletions(-) diff --git a/.mailmap b/.mailmap index 34a99f93a1..838b544a97 100644 --- a/.mailmap +++ b/.mailmap @@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> <zoltan.kiss@linaro.org> Zorik Machulsky <zorik@amazon.com> Zyta Szpak <zyta@marvell.com> <zr@semihalf.com> Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com> +Su Sai <susai.ss@bytedance.com> diff --git a/app/test/test_cksum.c b/app/test/test_cksum.c index f2ab5af5a7..d2435a9962 100644 --- a/app/test/test_cksum.c +++ b/app/test/test_cksum.c @@ -85,6 +85,39 @@ static const char test_cksum_ipv4_opts_udp[] = { 0x00, 0x35, 0x00, 0x09, 0x89, 0x6f, 0x78, }; +/* generated in scapy with Ether()/IP()/TCP(options=[NOP,NOP,Timestamps])/os.urandom(113)) */ +static const char test_cksum_ipv4_tcp_multi_segs[] = { + 0x00, 0x16, 0x3e, 0x0b, 0x6b, 0xd2, 0xee, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x45, 0x00, + 0x00, 0xa5, 0x46, 0x10, 0x40, 0x00, 0x40, 0x06, + 0x80, 0xb5, 0xc0, 0xa8, 0xf9, 0x1d, 0xc0, 0xa8, + 0xf9, 0x1e, 0xdc, 0xa2, 0x14, 0x51, 0xbb, 0x8f, + 0xa0, 0x00, 0xe4, 0x7c, 0xe4, 0xb8, 0x80, 0x10, + 0x02, 0x00, 0x4b, 0xc1, 0x00, 0x00, 0x01, 0x01, + 0x08, 0x0a, 0x90, 0x60, 0xf4, 0xff, 0x03, 0xc5, + 0xb4, 0x19, 0x77, 0x34, 0xd4, 0xdc, 0x84, 0x86, + 0xff, 0x44, 0x09, 0x63, 0x36, 0x2e, 0x26, 0x9b, + 0x90, 0x70, 0xf2, 0xed, 0xc8, 0x5b, 0x87, 0xaa, + 0xb4, 0x67, 0x6b, 0x32, 0x3d, 0xc4, 0xbf, 0x15, + 0xa9, 0x16, 0x6c, 0x2a, 0x9d, 0xb2, 0xb7, 0x6b, + 0x58, 0x44, 0x58, 0x12, 0x4b, 0x8f, 0xe5, 0x12, + 0x11, 0x90, 0x94, 0x68, 0x37, 0xad, 0x0a, 0x9b, + 0xd6, 0x79, 0xf2, 0xb7, 0x31, 0xcf, 0x44, 0x22, + 0xc8, 0x99, 0x3f, 0xe5, 0xe7, 0xac, 0xc7, 0x0b, + 0x86, 0xdf, 0xda, 0xed, 0x0a, 0x0f, 0x86, 0xd7, + 0x48, 0xe2, 0xf1, 0xc2, 0x43, 0xed, 0x47, 0x3a, + 0xea, 0x25, 0x2d, 0xd6, 0x60, 0x38, 0x30, 0x07, + 0x28, 0xdd, 0x1f, 0x0c, 0xdd, 0x7b, 0x7c, 0xd9, + 0x35, 0x9d, 0x14, 0xaa, 0xc6, 0x35, 0xd1, 0x03, + 0x38, 0xb1, 0xf5, +}; + +static const uint8_t test_cksum_ipv4_tcp_multi_segs_len[] = { + 66, /* the first seg contains all headers, including L2 to L4 */ + 61, /* the second seg lenght is odd, test byte order independent */ + 52, /* three segs are sufficient to test the most complex scenarios */ +}; + /* test l3/l4 checksum api */ static int test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) @@ -223,6 +256,71 @@ test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) return -1; } +/* test l4 checksum api for a packet with multiple mbufs */ +static int +test_l4_cksum_multi_mbufs(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len, + const uint8_t *segs, size_t segs_len) +{ + struct rte_mbuf *m[NB_MBUF] = {0}; + struct rte_mbuf *m_hdr = NULL; + struct rte_net_hdr_lens hdr_lens; + size_t i, off = 0; + uint32_t packet_type; + void *l3_hdr; + uint32_t l3; + char *data; + + for (i = 0; i < segs_len; i++) { + m[i] = rte_pktmbuf_alloc(pktmbuf_pool); + if (m[i] == NULL) + GOTO_FAIL("Cannot allocate mbuf"); + + data = rte_pktmbuf_append(m[i], segs[i]); + if (data == NULL) + GOTO_FAIL("Cannot append data"); + + rte_memcpy(data, pktdata + off, segs[i]); + off += segs[i]; + + if (m_hdr) { + if (rte_pktmbuf_chain(m_hdr, m[i])) + GOTO_FAIL("Cannot chain mbuf"); + } else { + m_hdr = m[i]; + } + } + + if (off != len) + GOTO_FAIL("Invalid segs"); + + packet_type = rte_net_get_ptype(m_hdr, &hdr_lens, RTE_PTYPE_ALL_MASK); + l3 = packet_type & RTE_PTYPE_L3_MASK; + + l3_hdr = rte_pktmbuf_mtod_offset(m_hdr, void *, hdr_lens.l2_len); + off = hdr_lens.l2_len + hdr_lens.l3_len; + + if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) { + if (rte_ipv4_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) { + if (rte_ipv6_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } + + for (i = 0; i < segs_len; i++) + rte_pktmbuf_free(m[i]); + + return 0; + +fail: + for (i = 0; i < segs_len; i++) { + if (m[i]) + rte_pktmbuf_free(m[i]); + } + + return -1; +} + static int test_cksum(void) { @@ -256,6 +354,12 @@ test_cksum(void) sizeof(test_cksum_ipv4_opts_udp)) < 0) GOTO_FAIL("checksum error on ipv4_opts_udp"); + if (test_l4_cksum_multi_mbufs(pktmbuf_pool, test_cksum_ipv4_tcp_multi_segs, + sizeof(test_cksum_ipv4_tcp_multi_segs), + test_cksum_ipv4_tcp_multi_segs_len, + sizeof(test_cksum_ipv4_tcp_multi_segs_len)) < 0) + GOTO_FAIL("checksum error on multi mbufs check"); + rte_mempool_free(pktmbuf_pool); return 0; diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h index a8e8927952..aa584d5f8d 100644 --- a/lib/net/rte_cksum.h +++ b/lib/net/rte_cksum.h @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) return (uint16_t)sum; } +/** + * @internal Reduce a sum to the non-complemented checksum. + * Helper routine for the rte_raw_cksum_mbuf(). + * + * @param sum + * Value of the sum. + * @return + * The non-complemented checksum. + */ +static inline uint16_t +__rte_raw_cksum_reduce_u64(uint64_t sum) +{ + uint32_t tmp; + + tmp = __rte_raw_cksum_reduce((uint32_t)sum); + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); + return __rte_raw_cksum_reduce(tmp); +} + /** * Process the non-complemented checksum of a buffer. * @@ -119,8 +138,9 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, { const struct rte_mbuf *seg; const char *buf; - uint32_t sum, tmp; + uint32_t tmp; uint32_t seglen, done; + uint64_t sum; /* easy case: all data in the first segment */ if (off + len <= rte_pktmbuf_data_len(m)) { @@ -157,7 +177,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, for (;;) { tmp = __rte_raw_cksum(buf, seglen, 0); if (done & 1) - tmp = rte_bswap16((uint16_t)tmp); + tmp = rte_bswap32(tmp); sum += tmp; done += seglen; if (done == len) @@ -169,7 +189,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, seglen = len - done; } - *cksum = __rte_raw_cksum_reduce(sum); + *cksum = __rte_raw_cksum_reduce_u64(sum); return 0; } -- 2.39.2 (Apple Git-143) ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [v3] net/cksum: compute raw cksum for several segments 2025-08-01 15:28 ` [v2] " Su Sai 2025-08-01 16:39 ` Marat Khalili @ 2025-08-02 11:08 ` Su Sai 2025-08-03 16:08 ` Stephen Hemminger 1 sibling, 1 reply; 24+ messages in thread From: Su Sai @ 2025-08-02 11:08 UTC (permalink / raw) To: dev; +Cc: Su Sai The rte_raw_cksum_mbuf function is used to compute the raw checksum of a packet. If the packet payload stored in multi mbuf, the function will goto the hard case. In hard case, the variable 'tmp' is a type of uint32_t, so rte_bswap16 will drop high 16 bit. Meanwhile, the variable 'sum' is a type of uint32_t, so 'sum += tmp' will drop the carry when overflow. Both drop will make cksum incorrect. This commit fixes the above bug. Signed-off-by: Su Sai <susai.ss@bytedance.com> --- .mailmap | 1 + app/test/test_cksum.c | 107 ++++++++++++++++++++++++++++++++++++++++++ lib/net/rte_cksum.h | 26 ++++++++-- 3 files changed, 130 insertions(+), 4 deletions(-) diff --git a/.mailmap b/.mailmap index 34a99f93a1..838b544a97 100644 --- a/.mailmap +++ b/.mailmap @@ -1891,3 +1891,4 @@ Zoltan Kiss <zoltan.kiss@schaman.hu> <zoltan.kiss@linaro.org> Zorik Machulsky <zorik@amazon.com> Zyta Szpak <zyta@marvell.com> <zr@semihalf.com> Zyta Szpak <zyta@marvell.com> <zyta.szpak@semihalf.com> +Su Sai <susai.ss@bytedance.com> diff --git a/app/test/test_cksum.c b/app/test/test_cksum.c index f2ab5af5a7..6765f48572 100644 --- a/app/test/test_cksum.c +++ b/app/test/test_cksum.c @@ -85,6 +85,42 @@ static const char test_cksum_ipv4_opts_udp[] = { 0x00, 0x35, 0x00, 0x09, 0x89, 0x6f, 0x78, }; +/* + * generated in scapy with + * Ether()/IP()/TCP(options=[NOP,NOP,Timestamps])/os.urandom(113)) + */ +static const char test_cksum_ipv4_tcp_multi_segs[] = { + 0x00, 0x16, 0x3e, 0x0b, 0x6b, 0xd2, 0xee, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x45, 0x00, + 0x00, 0xa5, 0x46, 0x10, 0x40, 0x00, 0x40, 0x06, + 0x80, 0xb5, 0xc0, 0xa8, 0xf9, 0x1d, 0xc0, 0xa8, + 0xf9, 0x1e, 0xdc, 0xa2, 0x14, 0x51, 0xbb, 0x8f, + 0xa0, 0x00, 0xe4, 0x7c, 0xe4, 0xb8, 0x80, 0x10, + 0x02, 0x00, 0x4b, 0xc1, 0x00, 0x00, 0x01, 0x01, + 0x08, 0x0a, 0x90, 0x60, 0xf4, 0xff, 0x03, 0xc5, + 0xb4, 0x19, 0x77, 0x34, 0xd4, 0xdc, 0x84, 0x86, + 0xff, 0x44, 0x09, 0x63, 0x36, 0x2e, 0x26, 0x9b, + 0x90, 0x70, 0xf2, 0xed, 0xc8, 0x5b, 0x87, 0xaa, + 0xb4, 0x67, 0x6b, 0x32, 0x3d, 0xc4, 0xbf, 0x15, + 0xa9, 0x16, 0x6c, 0x2a, 0x9d, 0xb2, 0xb7, 0x6b, + 0x58, 0x44, 0x58, 0x12, 0x4b, 0x8f, 0xe5, 0x12, + 0x11, 0x90, 0x94, 0x68, 0x37, 0xad, 0x0a, 0x9b, + 0xd6, 0x79, 0xf2, 0xb7, 0x31, 0xcf, 0x44, 0x22, + 0xc8, 0x99, 0x3f, 0xe5, 0xe7, 0xac, 0xc7, 0x0b, + 0x86, 0xdf, 0xda, 0xed, 0x0a, 0x0f, 0x86, 0xd7, + 0x48, 0xe2, 0xf1, 0xc2, 0x43, 0xed, 0x47, 0x3a, + 0xea, 0x25, 0x2d, 0xd6, 0x60, 0x38, 0x30, 0x07, + 0x28, 0xdd, 0x1f, 0x0c, 0xdd, 0x7b, 0x7c, 0xd9, + 0x35, 0x9d, 0x14, 0xaa, 0xc6, 0x35, 0xd1, 0x03, + 0x38, 0xb1, 0xf5, +}; + +static const uint8_t test_cksum_ipv4_tcp_multi_segs_len[] = { + 66, /* the first seg contains all headers, including L2 to L4 */ + 61, /* the second seg length is odd, test byte order independent */ + 52, /* three segs are sufficient to test the most complex scenarios */ +}; + /* test l3/l4 checksum api */ static int test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) @@ -223,6 +259,71 @@ test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) return -1; } +/* test l4 checksum api for a packet with multiple mbufs */ +static int +test_l4_cksum_multi_mbufs(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len, + const uint8_t *segs, size_t segs_len) +{ + struct rte_mbuf *m[NB_MBUF] = {0}; + struct rte_mbuf *m_hdr = NULL; + struct rte_net_hdr_lens hdr_lens; + size_t i, off = 0; + uint32_t packet_type; + void *l3_hdr; + uint32_t l3; + char *data; + + for (i = 0; i < segs_len; i++) { + m[i] = rte_pktmbuf_alloc(pktmbuf_pool); + if (m[i] == NULL) + GOTO_FAIL("Cannot allocate mbuf"); + + data = rte_pktmbuf_append(m[i], segs[i]); + if (data == NULL) + GOTO_FAIL("Cannot append data"); + + rte_memcpy(data, pktdata + off, segs[i]); + off += segs[i]; + + if (m_hdr) { + if (rte_pktmbuf_chain(m_hdr, m[i])) + GOTO_FAIL("Cannot chain mbuf"); + } else { + m_hdr = m[i]; + } + } + + if (off != len) + GOTO_FAIL("Invalid segs"); + + packet_type = rte_net_get_ptype(m_hdr, &hdr_lens, RTE_PTYPE_ALL_MASK); + l3 = packet_type & RTE_PTYPE_L3_MASK; + + l3_hdr = rte_pktmbuf_mtod_offset(m_hdr, void *, hdr_lens.l2_len); + off = hdr_lens.l2_len + hdr_lens.l3_len; + + if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) { + if (rte_ipv4_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) { + if (rte_ipv6_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } + + for (i = 0; i < segs_len; i++) + rte_pktmbuf_free(m[i]); + + return 0; + +fail: + for (i = 0; i < segs_len; i++) { + if (m[i]) + rte_pktmbuf_free(m[i]); + } + + return -1; +} + static int test_cksum(void) { @@ -256,6 +357,12 @@ test_cksum(void) sizeof(test_cksum_ipv4_opts_udp)) < 0) GOTO_FAIL("checksum error on ipv4_opts_udp"); + if (test_l4_cksum_multi_mbufs(pktmbuf_pool, test_cksum_ipv4_tcp_multi_segs, + sizeof(test_cksum_ipv4_tcp_multi_segs), + test_cksum_ipv4_tcp_multi_segs_len, + sizeof(test_cksum_ipv4_tcp_multi_segs_len)) < 0) + GOTO_FAIL("checksum error on multi mbufs check"); + rte_mempool_free(pktmbuf_pool); return 0; diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h index a8e8927952..0f0b63a916 100644 --- a/lib/net/rte_cksum.h +++ b/lib/net/rte_cksum.h @@ -80,6 +80,24 @@ __rte_raw_cksum_reduce(uint32_t sum) return (uint16_t)sum; } +/** + * @internal Reduce a sum to the non-complemented checksum. + * Helper routine for the rte_raw_cksum_mbuf(). + * + * @param sum + * Value of the sum. + * @return + * The non-complemented checksum. + */ +static inline uint16_t +__rte_raw_cksum_reduce_u64(uint64_t sum) +{ + uint32_t tmp; + tmp = __rte_raw_cksum_reduce((uint32_t)sum); + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); + return __rte_raw_cksum_reduce(tmp); +} + /** * Process the non-complemented checksum of a buffer. * @@ -119,8 +137,8 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, { const struct rte_mbuf *seg; const char *buf; - uint32_t sum, tmp; - uint32_t seglen, done; + uint32_t tmp, seglen, done; + uint64_t sum; /* easy case: all data in the first segment */ if (off + len <= rte_pktmbuf_data_len(m)) { @@ -157,7 +175,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, for (;;) { tmp = __rte_raw_cksum(buf, seglen, 0); if (done & 1) - tmp = rte_bswap16((uint16_t)tmp); + tmp = rte_bswap32(tmp); sum += tmp; done += seglen; if (done == len) @@ -169,7 +187,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, seglen = len - done; } - *cksum = __rte_raw_cksum_reduce(sum); + *cksum = __rte_raw_cksum_reduce_u64(sum); return 0; } -- 2.20.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2025-08-02 11:08 ` [v3] " Su Sai @ 2025-08-03 16:08 ` Stephen Hemminger 2025-08-04 3:54 ` Su Sai 0 siblings, 1 reply; 24+ messages in thread From: Stephen Hemminger @ 2025-08-03 16:08 UTC (permalink / raw) To: Su Sai; +Cc: dev On Sat, 2 Aug 2025 04:08:16 -0700 Su Sai <susai.ss@bytedance.com> wrote: > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. > > Signed-off-by: Su Sai <susai.ss@bytedance.com> > --- The patch is getting corrupted by your mail system and does not apply cleanly. Fix and resubmit. One option is to use an attachment ^ permalink raw reply [flat|nested] 24+ messages in thread
* [v3] net/cksum: compute raw cksum for several segments 2025-08-03 16:08 ` Stephen Hemminger @ 2025-08-04 3:54 ` Su Sai 2025-08-05 8:55 ` Marat Khalili ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Su Sai @ 2025-08-04 3:54 UTC (permalink / raw) To: stephen; +Cc: dev, Su Sai The rte_raw_cksum_mbuf function is used to compute the raw checksum of a packet. If the packet payload stored in multi mbuf, the function will goto the hard case. In hard case, the variable 'tmp' is a type of uint32_t, so rte_bswap16 will drop high 16 bit. Meanwhile, the variable 'sum' is a type of uint32_t, so 'sum += tmp' will drop the carry when overflow. Both drop will make cksum incorrect. This commit fixes the above bug. Signed-off-by: Su Sai <spiderdetective.ss@gmail.com> --- .mailmap | 1 + app/test/test_cksum.c | 106 ++++++++++++++++++++++++++++++++++++++++++ lib/net/rte_cksum.h | 27 +++++++++-- 3 files changed, 130 insertions(+), 4 deletions(-) diff --git a/.mailmap b/.mailmap index 34a99f93a1..1da1d9f8e1 100644 --- a/.mailmap +++ b/.mailmap @@ -1552,6 +1552,7 @@ Sunil Kumar Kori <skori@marvell.com> <sunil.kori@nxp.com> Sunil Pai G <sunil.pai.g@intel.com> Sunil Uttarwar <sunilprakashrao.uttarwar@amd.com> Sun Jiajia <sunx.jiajia@intel.com> +Su Sai <spiderdetective.ss@gmail.com> <susai.ss@bytedance.com> Sunyang Wu <sunyang.wu@jaguarmicro.com> Surabhi Boob <surabhi.boob@intel.com> Suyang Ju <sju@paloaltonetworks.com> diff --git a/app/test/test_cksum.c b/app/test/test_cksum.c index f2ab5af5a7..fb2e3cf9e6 100644 --- a/app/test/test_cksum.c +++ b/app/test/test_cksum.c @@ -85,6 +85,42 @@ static const char test_cksum_ipv4_opts_udp[] = { 0x00, 0x35, 0x00, 0x09, 0x89, 0x6f, 0x78, }; +/* + * generated in scapy with + * Ether()/IP()/TCP(options=[NOP,NOP,Timestamps])/os.urandom(113)) + */ +static const char test_cksum_ipv4_tcp_multi_segs[] = { + 0x00, 0x16, 0x3e, 0x0b, 0x6b, 0xd2, 0xee, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x45, 0x00, + 0x00, 0xa5, 0x46, 0x10, 0x40, 0x00, 0x40, 0x06, + 0x80, 0xb5, 0xc0, 0xa8, 0xf9, 0x1d, 0xc0, 0xa8, + 0xf9, 0x1e, 0xdc, 0xa2, 0x14, 0x51, 0xbb, 0x8f, + 0xa0, 0x00, 0xe4, 0x7c, 0xe4, 0xb8, 0x80, 0x10, + 0x02, 0x00, 0x4b, 0xc1, 0x00, 0x00, 0x01, 0x01, + 0x08, 0x0a, 0x90, 0x60, 0xf4, 0xff, 0x03, 0xc5, + 0xb4, 0x19, 0x77, 0x34, 0xd4, 0xdc, 0x84, 0x86, + 0xff, 0x44, 0x09, 0x63, 0x36, 0x2e, 0x26, 0x9b, + 0x90, 0x70, 0xf2, 0xed, 0xc8, 0x5b, 0x87, 0xaa, + 0xb4, 0x67, 0x6b, 0x32, 0x3d, 0xc4, 0xbf, 0x15, + 0xa9, 0x16, 0x6c, 0x2a, 0x9d, 0xb2, 0xb7, 0x6b, + 0x58, 0x44, 0x58, 0x12, 0x4b, 0x8f, 0xe5, 0x12, + 0x11, 0x90, 0x94, 0x68, 0x37, 0xad, 0x0a, 0x9b, + 0xd6, 0x79, 0xf2, 0xb7, 0x31, 0xcf, 0x44, 0x22, + 0xc8, 0x99, 0x3f, 0xe5, 0xe7, 0xac, 0xc7, 0x0b, + 0x86, 0xdf, 0xda, 0xed, 0x0a, 0x0f, 0x86, 0xd7, + 0x48, 0xe2, 0xf1, 0xc2, 0x43, 0xed, 0x47, 0x3a, + 0xea, 0x25, 0x2d, 0xd6, 0x60, 0x38, 0x30, 0x07, + 0x28, 0xdd, 0x1f, 0x0c, 0xdd, 0x7b, 0x7c, 0xd9, + 0x35, 0x9d, 0x14, 0xaa, 0xc6, 0x35, 0xd1, 0x03, + 0x38, 0xb1, 0xf5, +}; + +static const uint8_t test_cksum_ipv4_tcp_multi_segs_len[] = { + 66, /* the first seg contains all headers, including L2 to L4 */ + 61, /* the second seg length is odd, test byte order independent */ + 52, /* three segs are sufficient to test the most complex scenarios */ +}; + /* test l3/l4 checksum api */ static int test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) @@ -223,6 +259,70 @@ test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) return -1; } +/* test l4 checksum api for a packet with multiple mbufs */ +static int +test_l4_cksum_multi_mbufs(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len, + const uint8_t *segs, size_t segs_len) +{ + struct rte_mbuf *m[NB_MBUF] = {0}; + struct rte_mbuf *m_hdr = NULL; + struct rte_net_hdr_lens hdr_lens; + size_t i, off = 0; + uint32_t packet_type, l3; + void *l3_hdr; + char *data; + + for (i = 0; i < segs_len; i++) { + m[i] = rte_pktmbuf_alloc(pktmbuf_pool); + if (m[i] == NULL) + GOTO_FAIL("Cannot allocate mbuf"); + + data = rte_pktmbuf_append(m[i], segs[i]); + if (data == NULL) + GOTO_FAIL("Cannot append data"); + + rte_memcpy(data, pktdata + off, segs[i]); + off += segs[i]; + + if (m_hdr) { + if (rte_pktmbuf_chain(m_hdr, m[i])) + GOTO_FAIL("Cannot chain mbuf"); + } else { + m_hdr = m[i]; + } + } + + if (off != len) + GOTO_FAIL("Invalid segs"); + + packet_type = rte_net_get_ptype(m_hdr, &hdr_lens, RTE_PTYPE_ALL_MASK); + l3 = packet_type & RTE_PTYPE_L3_MASK; + + l3_hdr = rte_pktmbuf_mtod_offset(m_hdr, void *, hdr_lens.l2_len); + off = hdr_lens.l2_len + hdr_lens.l3_len; + + if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) { + if (rte_ipv4_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) { + if (rte_ipv6_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); + } + + for (i = 0; i < segs_len; i++) + rte_pktmbuf_free(m[i]); + + return 0; + +fail: + for (i = 0; i < segs_len; i++) { + if (m[i]) + rte_pktmbuf_free(m[i]); + } + + return -1; +} + static int test_cksum(void) { @@ -256,6 +356,12 @@ test_cksum(void) sizeof(test_cksum_ipv4_opts_udp)) < 0) GOTO_FAIL("checksum error on ipv4_opts_udp"); + if (test_l4_cksum_multi_mbufs(pktmbuf_pool, test_cksum_ipv4_tcp_multi_segs, + sizeof(test_cksum_ipv4_tcp_multi_segs), + test_cksum_ipv4_tcp_multi_segs_len, + sizeof(test_cksum_ipv4_tcp_multi_segs_len)) < 0) + GOTO_FAIL("checksum error on multi mbufs check"); + rte_mempool_free(pktmbuf_pool); return 0; diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h index a8e8927952..679ba82eb6 100644 --- a/lib/net/rte_cksum.h +++ b/lib/net/rte_cksum.h @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) return (uint16_t)sum; } +/** + * @internal Reduce a sum to the non-complemented checksum. + * Helper routine for the rte_raw_cksum_mbuf(). + * + * @param sum + * Value of the sum. + * @return + * The non-complemented checksum. + */ +static inline uint16_t +__rte_raw_cksum_reduce_u64(uint64_t sum) +{ + uint32_t tmp; + + tmp = __rte_raw_cksum_reduce((uint32_t)sum); + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); + return __rte_raw_cksum_reduce(tmp); +} + /** * Process the non-complemented checksum of a buffer. * @@ -119,8 +138,8 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, { const struct rte_mbuf *seg; const char *buf; - uint32_t sum, tmp; - uint32_t seglen, done; + uint32_t seglen, done, tmp; + uint64_t sum; /* easy case: all data in the first segment */ if (off + len <= rte_pktmbuf_data_len(m)) { @@ -157,7 +176,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, for (;;) { tmp = __rte_raw_cksum(buf, seglen, 0); if (done & 1) - tmp = rte_bswap16((uint16_t)tmp); + tmp = rte_bswap32(tmp); sum += tmp; done += seglen; if (done == len) @@ -169,7 +188,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, seglen = len - done; } - *cksum = __rte_raw_cksum_reduce(sum); + *cksum = __rte_raw_cksum_reduce_u64(sum); return 0; } -- 2.20.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* RE: [v3] net/cksum: compute raw cksum for several segments 2025-08-04 3:54 ` Su Sai @ 2025-08-05 8:55 ` Marat Khalili 2026-02-20 15:49 ` Marat Khalili 2025-08-11 14:42 ` Thomas Monjalon 2026-02-20 17:49 ` Stephen Hemminger 2 siblings, 1 reply; 24+ messages in thread From: Marat Khalili @ 2025-08-05 8:55 UTC (permalink / raw) To: Su Sai, stephen@networkplumber.org; +Cc: dev@dpdk.org > -----Original Message----- > From: Su Sai <spiderdetective.ss@gmail.com> > Sent: Monday 4 August 2025 04:55 > To: stephen@networkplumber.org > Cc: dev@dpdk.org; Su Sai <spiderdetective.ss@gmail.com> > Subject: [v3] net/cksum: compute raw cksum for several segments > > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. > > Signed-off-by: Su Sai <spiderdetective.ss@gmail.com> > --- > .mailmap | 1 + > app/test/test_cksum.c | 106 ++++++++++++++++++++++++++++++++++++++++++ > lib/net/rte_cksum.h | 27 +++++++++-- > 3 files changed, 130 insertions(+), 4 deletions(-) > > diff --git a/.mailmap b/.mailmap > index 34a99f93a1..1da1d9f8e1 100644 > --- a/.mailmap > +++ b/.mailmap > @@ -1552,6 +1552,7 @@ Sunil Kumar Kori <skori@marvell.com> <sunil.kori@nxp.com> > Sunil Pai G <sunil.pai.g@intel.com> > Sunil Uttarwar <sunilprakashrao.uttarwar@amd.com> > Sun Jiajia <sunx.jiajia@intel.com> > +Su Sai <spiderdetective.ss@gmail.com> <susai.ss@bytedance.com> > Sunyang Wu <sunyang.wu@jaguarmicro.com> > Surabhi Boob <surabhi.boob@intel.com> > Suyang Ju <sju@paloaltonetworks.com> > diff --git a/app/test/test_cksum.c b/app/test/test_cksum.c > index f2ab5af5a7..fb2e3cf9e6 100644 > --- a/app/test/test_cksum.c > +++ b/app/test/test_cksum.c > @@ -85,6 +85,42 @@ static const char test_cksum_ipv4_opts_udp[] = { > 0x00, 0x35, 0x00, 0x09, 0x89, 0x6f, 0x78, > }; > > +/* > + * generated in scapy with > + * Ether()/IP()/TCP(options=[NOP,NOP,Timestamps])/os.urandom(113)) > + */ > +static const char test_cksum_ipv4_tcp_multi_segs[] = { > + 0x00, 0x16, 0x3e, 0x0b, 0x6b, 0xd2, 0xee, 0xff, > + 0xff, 0xff, 0xff, 0xff, 0x08, 0x00, 0x45, 0x00, > + 0x00, 0xa5, 0x46, 0x10, 0x40, 0x00, 0x40, 0x06, > + 0x80, 0xb5, 0xc0, 0xa8, 0xf9, 0x1d, 0xc0, 0xa8, > + 0xf9, 0x1e, 0xdc, 0xa2, 0x14, 0x51, 0xbb, 0x8f, > + 0xa0, 0x00, 0xe4, 0x7c, 0xe4, 0xb8, 0x80, 0x10, > + 0x02, 0x00, 0x4b, 0xc1, 0x00, 0x00, 0x01, 0x01, > + 0x08, 0x0a, 0x90, 0x60, 0xf4, 0xff, 0x03, 0xc5, > + 0xb4, 0x19, 0x77, 0x34, 0xd4, 0xdc, 0x84, 0x86, > + 0xff, 0x44, 0x09, 0x63, 0x36, 0x2e, 0x26, 0x9b, > + 0x90, 0x70, 0xf2, 0xed, 0xc8, 0x5b, 0x87, 0xaa, > + 0xb4, 0x67, 0x6b, 0x32, 0x3d, 0xc4, 0xbf, 0x15, > + 0xa9, 0x16, 0x6c, 0x2a, 0x9d, 0xb2, 0xb7, 0x6b, > + 0x58, 0x44, 0x58, 0x12, 0x4b, 0x8f, 0xe5, 0x12, > + 0x11, 0x90, 0x94, 0x68, 0x37, 0xad, 0x0a, 0x9b, > + 0xd6, 0x79, 0xf2, 0xb7, 0x31, 0xcf, 0x44, 0x22, > + 0xc8, 0x99, 0x3f, 0xe5, 0xe7, 0xac, 0xc7, 0x0b, > + 0x86, 0xdf, 0xda, 0xed, 0x0a, 0x0f, 0x86, 0xd7, > + 0x48, 0xe2, 0xf1, 0xc2, 0x43, 0xed, 0x47, 0x3a, > + 0xea, 0x25, 0x2d, 0xd6, 0x60, 0x38, 0x30, 0x07, > + 0x28, 0xdd, 0x1f, 0x0c, 0xdd, 0x7b, 0x7c, 0xd9, > + 0x35, 0x9d, 0x14, 0xaa, 0xc6, 0x35, 0xd1, 0x03, > + 0x38, 0xb1, 0xf5, > +}; > + > +static const uint8_t test_cksum_ipv4_tcp_multi_segs_len[] = { > + 66, /* the first seg contains all headers, including L2 to L4 */ > + 61, /* the second seg length is odd, test byte order independent */ > + 52, /* three segs are sufficient to test the most complex scenarios */ > +}; > + > /* test l3/l4 checksum api */ > static int > test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) > @@ -223,6 +259,70 @@ test_l4_cksum(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len) > return -1; > } > > +/* test l4 checksum api for a packet with multiple mbufs */ > +static int > +test_l4_cksum_multi_mbufs(struct rte_mempool *pktmbuf_pool, const char *pktdata, size_t len, > + const uint8_t *segs, size_t segs_len) > +{ > + struct rte_mbuf *m[NB_MBUF] = {0}; > + struct rte_mbuf *m_hdr = NULL; > + struct rte_net_hdr_lens hdr_lens; > + size_t i, off = 0; > + uint32_t packet_type, l3; > + void *l3_hdr; > + char *data; > + > + for (i = 0; i < segs_len; i++) { > + m[i] = rte_pktmbuf_alloc(pktmbuf_pool); > + if (m[i] == NULL) > + GOTO_FAIL("Cannot allocate mbuf"); > + > + data = rte_pktmbuf_append(m[i], segs[i]); > + if (data == NULL) > + GOTO_FAIL("Cannot append data"); > + > + rte_memcpy(data, pktdata + off, segs[i]); > + off += segs[i]; > + > + if (m_hdr) { > + if (rte_pktmbuf_chain(m_hdr, m[i])) > + GOTO_FAIL("Cannot chain mbuf"); > + } else { > + m_hdr = m[i]; > + } > + } > + > + if (off != len) > + GOTO_FAIL("Invalid segs"); > + > + packet_type = rte_net_get_ptype(m_hdr, &hdr_lens, RTE_PTYPE_ALL_MASK); > + l3 = packet_type & RTE_PTYPE_L3_MASK; > + > + l3_hdr = rte_pktmbuf_mtod_offset(m_hdr, void *, hdr_lens.l2_len); > + off = hdr_lens.l2_len + hdr_lens.l3_len; > + > + if (l3 == RTE_PTYPE_L3_IPV4 || l3 == RTE_PTYPE_L3_IPV4_EXT) { > + if (rte_ipv4_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) > + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); > + } else if (l3 == RTE_PTYPE_L3_IPV6 || l3 == RTE_PTYPE_L3_IPV6_EXT) { > + if (rte_ipv6_udptcp_cksum_mbuf_verify(m_hdr, l3_hdr, off) != 0) > + GOTO_FAIL("Invalid L4 checksum verification for multiple mbufs"); > + } > + > + for (i = 0; i < segs_len; i++) > + rte_pktmbuf_free(m[i]); > + > + return 0; > + > +fail: > + for (i = 0; i < segs_len; i++) { > + if (m[i]) > + rte_pktmbuf_free(m[i]); > + } > + > + return -1; > +} > + > static int > test_cksum(void) > { > @@ -256,6 +356,12 @@ test_cksum(void) > sizeof(test_cksum_ipv4_opts_udp)) < 0) > GOTO_FAIL("checksum error on ipv4_opts_udp"); > > + if (test_l4_cksum_multi_mbufs(pktmbuf_pool, test_cksum_ipv4_tcp_multi_segs, > + sizeof(test_cksum_ipv4_tcp_multi_segs), > + test_cksum_ipv4_tcp_multi_segs_len, > + sizeof(test_cksum_ipv4_tcp_multi_segs_len)) < 0) > + GOTO_FAIL("checksum error on multi mbufs check"); > + > rte_mempool_free(pktmbuf_pool); > > return 0; > diff --git a/lib/net/rte_cksum.h b/lib/net/rte_cksum.h > index a8e8927952..679ba82eb6 100644 > --- a/lib/net/rte_cksum.h > +++ b/lib/net/rte_cksum.h > @@ -80,6 +80,25 @@ __rte_raw_cksum_reduce(uint32_t sum) > return (uint16_t)sum; > } > > +/** > + * @internal Reduce a sum to the non-complemented checksum. > + * Helper routine for the rte_raw_cksum_mbuf(). > + * > + * @param sum > + * Value of the sum. > + * @return > + * The non-complemented checksum. > + */ > +static inline uint16_t > +__rte_raw_cksum_reduce_u64(uint64_t sum) > +{ > + uint32_t tmp; > + > + tmp = __rte_raw_cksum_reduce((uint32_t)sum); > + tmp += __rte_raw_cksum_reduce((uint32_t)(sum >> 32)); > + return __rte_raw_cksum_reduce(tmp); > +} > + > /** > * Process the non-complemented checksum of a buffer. > * > @@ -119,8 +138,8 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > { > const struct rte_mbuf *seg; > const char *buf; > - uint32_t sum, tmp; > - uint32_t seglen, done; > + uint32_t seglen, done, tmp; > + uint64_t sum; > > /* easy case: all data in the first segment */ > if (off + len <= rte_pktmbuf_data_len(m)) { > @@ -157,7 +176,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > for (;;) { > tmp = __rte_raw_cksum(buf, seglen, 0); > if (done & 1) > - tmp = rte_bswap16((uint16_t)tmp); > + tmp = rte_bswap32(tmp); > sum += tmp; > done += seglen; > if (done == len) > @@ -169,7 +188,7 @@ rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len, > seglen = len - done; > } > > - *cksum = __rte_raw_cksum_reduce(sum); > + *cksum = __rte_raw_cksum_reduce_u64(sum); > return 0; > } > > -- > 2.20.1 Reviewed-by: Marat Khalili <marat.khalili@huawei.com> ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [v3] net/cksum: compute raw cksum for several segments 2025-08-05 8:55 ` Marat Khalili @ 2026-02-20 15:49 ` Marat Khalili 2026-02-20 17:23 ` Thomas Monjalon 0 siblings, 1 reply; 24+ messages in thread From: Marat Khalili @ 2026-02-20 15:49 UTC (permalink / raw) To: stephen@networkplumber.org, Thomas Monjalon; +Cc: dev@dpdk.org, Su Sai Hi folks, What are we missing in order to merge this? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2026-02-20 15:49 ` Marat Khalili @ 2026-02-20 17:23 ` Thomas Monjalon 2026-02-20 18:17 ` Marat Khalili 0 siblings, 1 reply; 24+ messages in thread From: Thomas Monjalon @ 2026-02-20 17:23 UTC (permalink / raw) To: Marat Khalili; +Cc: stephen@networkplumber.org, dev@dpdk.org, Su Sai 20/02/2026 16:49, Marat Khalili: > Hi folks, > > What are we missing in order to merge this? I didn't take time to review the full explanation and code. Please could you review? https://inbox.dpdk.org/dev/CAFQeoKi8W8WAYUY1OuSn+qMgG1c2V7PjLjEH608m894TpLOVvw@mail.gmail.com/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [v3] net/cksum: compute raw cksum for several segments 2026-02-20 17:23 ` Thomas Monjalon @ 2026-02-20 18:17 ` Marat Khalili 2026-02-20 18:35 ` Marat Khalili 0 siblings, 1 reply; 24+ messages in thread From: Marat Khalili @ 2026-02-20 18:17 UTC (permalink / raw) To: Thomas Monjalon; +Cc: stephen@networkplumber.org, dev@dpdk.org, Su Sai > I didn't take time to review the full explanation and code. > > Please could you review? > > https://inbox.dpdk.org/dev/CAFQeoKi8W8WAYUY1OuSn+qMgG1c2V7PjLjEH608m894TpLOVvw@mail.gmail.com/ > I can suggest my condition to get a wrong checksum, which does not go into the details of how it was observed in the wild: Function rte_raw_cksum_mbuf may produce incorrect result when multiple segments are present in the checksummed range, and some of them have odd checksummed length. More precisely, if any segment except the last one begins at an odd offset from the start, and arithmetic sum of its 16-bit words overflows, the part that has overflown will be lost. Cases when there is only one segment, or all of the segment sizes and checksum offsets are even, are not affected. ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [v3] net/cksum: compute raw cksum for several segments 2026-02-20 18:17 ` Marat Khalili @ 2026-02-20 18:35 ` Marat Khalili 2026-02-27 6:36 ` su sai ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Marat Khalili @ 2026-02-20 18:35 UTC (permalink / raw) To: Marat Khalili, Thomas Monjalon Cc: stephen@networkplumber.org, dev@dpdk.org, Su Sai > I can suggest my condition to get a wrong checksum, which does not go into the details of how it was > observed in the wild: > > Function rte_raw_cksum_mbuf may produce incorrect result when multiple segments are present in the > checksummed range, and some of them have odd checksummed length. More precisely, if any segment except > the last one begins at an odd offset from the start, and arithmetic sum of its 16-bit words overflows, > the part that has overflown will be lost. Cases when there is only one segment, or all of the segment > sizes and checksum offsets are even, are not affected. Apologies, rushed to send. The "except the last one" condition is unnecessary. Corrected version: Function rte_raw_cksum_mbuf may produce incorrect result when multiple segments are present in the checksummed range, and some of them have odd checksummed length. More precisely, if any segment begins at an odd offset from the start, and arithmetic sum of its 16-bit words overflows, the part that has overflown will be lost. Cases when there is only one segment, or all of the segment sizes and checksum offsets are even, are not affected. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2026-02-20 18:35 ` Marat Khalili @ 2026-02-27 6:36 ` su sai 2026-02-27 7:31 ` su sai 2026-03-06 15:17 ` Marat Khalili 2 siblings, 0 replies; 24+ messages in thread From: su sai @ 2026-02-27 6:36 UTC (permalink / raw) To: Marat Khalili, Thomas Monjalon; +Cc: stephen@networkplumber.org, dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 1517 bytes --] As noted in Marat Khalili's analysis, we identified this issue in the production environment. Meanwhile, I simplified the problematic scenario into corresponding test cases and added them to app/test/test_cksum.c On Sat, Feb 21, 2026 at 2:35 AM Marat Khalili <marat.khalili@huawei.com> wrote: > > I can suggest my condition to get a wrong checksum, which does not go > into the details of how it was > > observed in the wild: > > > > Function rte_raw_cksum_mbuf may produce incorrect result when multiple > segments are present in the > > checksummed range, and some of them have odd checksummed length. More > precisely, if any segment except > > the last one begins at an odd offset from the start, and arithmetic sum > of its 16-bit words overflows, > > the part that has overflown will be lost. Cases when there is only one > segment, or all of the segment > > sizes and checksum offsets are even, are not affected. > > Apologies, rushed to send. The "except the last one" condition is > unnecessary. Corrected version: > > Function rte_raw_cksum_mbuf may produce incorrect result when multiple > segments are present in the checksummed range, and some of them have odd > checksummed length. More precisely, if any segment begins at an odd offset > from the start, and arithmetic sum of its 16-bit words overflows, the part > that has overflown will be lost. Cases when there is only one segment, or > all of the segment sizes and checksum offsets are even, are not affected. > [-- Attachment #2: Type: text/html, Size: 3410 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2026-02-20 18:35 ` Marat Khalili 2026-02-27 6:36 ` su sai @ 2026-02-27 7:31 ` su sai 2026-03-06 15:17 ` Marat Khalili 2 siblings, 0 replies; 24+ messages in thread From: su sai @ 2026-02-27 7:31 UTC (permalink / raw) To: Marat Khalili, Thomas Monjalon; +Cc: stephen@networkplumber.org, dev@dpdk.org [-- Attachment #1: Type: text/plain, Size: 1517 bytes --] As noted in Marat Khalili's analysis, we identified this issue in the production environment. Meanwhile, I simplified the problematic scenario into corresponding test cases and added them to app/test/test_cksum.c On Sat, Feb 21, 2026 at 2:35 AM Marat Khalili <marat.khalili@huawei.com> wrote: > > I can suggest my condition to get a wrong checksum, which does not go > into the details of how it was > > observed in the wild: > > > > Function rte_raw_cksum_mbuf may produce incorrect result when multiple > segments are present in the > > checksummed range, and some of them have odd checksummed length. More > precisely, if any segment except > > the last one begins at an odd offset from the start, and arithmetic sum > of its 16-bit words overflows, > > the part that has overflown will be lost. Cases when there is only one > segment, or all of the segment > > sizes and checksum offsets are even, are not affected. > > Apologies, rushed to send. The "except the last one" condition is > unnecessary. Corrected version: > > Function rte_raw_cksum_mbuf may produce incorrect result when multiple > segments are present in the checksummed range, and some of them have odd > checksummed length. More precisely, if any segment begins at an odd offset > from the start, and arithmetic sum of its 16-bit words overflows, the part > that has overflown will be lost. Cases when there is only one segment, or > all of the segment sizes and checksum offsets are even, are not affected. > [-- Attachment #2: Type: text/html, Size: 1850 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [v3] net/cksum: compute raw cksum for several segments 2026-02-20 18:35 ` Marat Khalili 2026-02-27 6:36 ` su sai 2026-02-27 7:31 ` su sai @ 2026-03-06 15:17 ` Marat Khalili 2 siblings, 0 replies; 24+ messages in thread From: Marat Khalili @ 2026-03-06 15:17 UTC (permalink / raw) To: Thomas Monjalon; +Cc: stephen@networkplumber.org, dev@dpdk.org, Su Sai > Function rte_raw_cksum_mbuf may produce incorrect result when multiple segments are present in the > checksummed range, and some of them have odd checksummed length. More precisely, if any segment begins > at an odd offset from the start, and arithmetic sum of its 16-bit words overflows, the part that has > overflown will be lost. Cases when there is only one segment, or all of the segment sizes and checksum > offsets are even, are not affected. Hi again folks, anything else needed here? There is a test, a fix, and conditions for the issue. (No problem if it only needs time.) BTW at least Gemini Pro is able to see this bug (maybe primed by our conversation) and reason about it. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2025-08-04 3:54 ` Su Sai 2025-08-05 8:55 ` Marat Khalili @ 2025-08-11 14:42 ` Thomas Monjalon 2025-08-12 3:03 ` su sai 2026-02-20 17:49 ` Stephen Hemminger 2 siblings, 1 reply; 24+ messages in thread From: Thomas Monjalon @ 2025-08-11 14:42 UTC (permalink / raw) To: Su Sai; +Cc: stephen, dev, Marat Khalili Hello, 04/08/2025 05:54, Su Sai: > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. Thank you for the fix and the associated test. Please could describe the exact condition to get a wrong checksum? Does it happen with all multiseg packets? 3 segments is a minimum? Any other constraint to reproduce? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2025-08-11 14:42 ` Thomas Monjalon @ 2025-08-12 3:03 ` su sai 0 siblings, 0 replies; 24+ messages in thread From: su sai @ 2025-08-12 3:03 UTC (permalink / raw) To: Thomas Monjalon; +Cc: stephen, dev, Marat Khalili [-- Attachment #1: Type: text/plain, Size: 3691 bytes --] Hi Thomas Monjalon, First, let's describe the scenario where we discovered the problem at that time as follows: This error can be reproduced as follows: 1. In the client ECS with an MTU of 1500, initiate traffic using the command "iperf3 -c {dst ip} -b 1m -M 125 -t 8000". It will trigger TCP segmentation. 2. On the host machine, TCP segmentation is performed through the 'rte_gso_segment' function. 3. After the gso, a packet in one mbuf will be split into multiple segments. 4. When calculating the TCP checksum using the 'rte_raw_cksum_mbuf' function, it will enter the 'hard case: process checksum of several segments' of the function. At this point, a calculation error may occur. 5. In the destination ECS, the InCsumErrors statistic can be viewed using the command "netstat -st | grep -i csum". The erroneous packets can also be confirmed via the tcpdump command. The following is a detailed description of a captured erroneous packet. The hex stream of the packet is as follows: 00163e0b6bd2eeffffffffff0800450000a50d7a40004006b94bc0a8f91dc0a8f91ed5d2145146f9d990e10d6a2d8010020040a200000101080a95ac86ba091145d3ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff This is a packet in the format of Eth + IPv4 + TCP + Payload. Taking the above-mentioned packet as an example, the calculation process of 'rte_raw_cksum_mbuf' is as follows: 1. Due to the small MSS, TSO fragmentation was triggered, generating 3 mbufs. 2. The data_len of the first mbuf is 66 bytes, containing the Ethernet header, IPv4 header, and TCP header. 3. The data_len of the second mbuf is 61 bytes. 4. The data_len of the third mbuf is 52 bytes. 5. When calculating the checksum of the TCP header for such an mbuf chain using the rte_raw_cksum_mbuf function, the 'tmp' value obtained during the calculation of the third mbuf is 0x19FFE6. 6. After applying rte_bswap16, tmp becomes 0xE6FF, with 0x19 discarded. This results in an incorrect final checksum. Second, Not all multiseg packets will cause calculation errors in the 'rte_raw_cksum_mbuf' function. There are two cases that can lead to incorrect final results. 1. If the value of 'tmp' is greater than 0xFFFF, 'tmp = rte_bswap16((uint16_t)tmp)' will drop high 16 bit. 2. Both 'tmp' and 'sum' is uint32_t, if the value of 'sum' is greater than 0xFFFFFFFF, 'sum += tmp' will drop the carry when overflow. Third, in our online cloud network, we found that the problem only occurs when there are 3 or more segments. I believe that the aforementioned issue may be triggered when there are 3 or more segments, but a test case with 3 segments is sufficient to detect this problem. On Mon, Aug 11, 2025 at 10:42 PM Thomas Monjalon <thomas@monjalon.net> wrote: > Hello, > > 04/08/2025 05:54, Su Sai: > > The rte_raw_cksum_mbuf function is used to compute > > the raw checksum of a packet. > > If the packet payload stored in multi mbuf, the function > > will goto the hard case. In hard case, > > the variable 'tmp' is a type of uint32_t, > > so rte_bswap16 will drop high 16 bit. > > Meanwhile, the variable 'sum' is a type of uint32_t, > > so 'sum += tmp' will drop the carry when overflow. > > Both drop will make cksum incorrect. > > This commit fixes the above bug. > > Thank you for the fix and the associated test. > > Please could describe the exact condition to get a wrong checksum? > Does it happen with all multiseg packets? > 3 segments is a minimum? Any other constraint to reproduce? > > > [-- Attachment #2: Type: text/html, Size: 4247 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [v3] net/cksum: compute raw cksum for several segments 2025-08-04 3:54 ` Su Sai 2025-08-05 8:55 ` Marat Khalili 2025-08-11 14:42 ` Thomas Monjalon @ 2026-02-20 17:49 ` Stephen Hemminger 2 siblings, 0 replies; 24+ messages in thread From: Stephen Hemminger @ 2026-02-20 17:49 UTC (permalink / raw) To: Su Sai; +Cc: dev On Mon, 4 Aug 2025 11:54:30 +0800 Su Sai <spiderdetective.ss@gmail.com> wrote: > The rte_raw_cksum_mbuf function is used to compute > the raw checksum of a packet. > If the packet payload stored in multi mbuf, the function > will goto the hard case. In hard case, > the variable 'tmp' is a type of uint32_t, > so rte_bswap16 will drop high 16 bit. > Meanwhile, the variable 'sum' is a type of uint32_t, > so 'sum += tmp' will drop the carry when overflow. > Both drop will make cksum incorrect. > This commit fixes the above bug. > > Signed-off-by: Su Sai <spiderdetective.ss@gmail.com> > --- AI review found some issues. Summary of Findings Severity Issue Error Double-free of chained mbuf segments in test_l4_cksum_multi_mbufs() — success path frees m[0] (the whole chain) then also frees m[1], m[2] Error Missing Fixes: tag referencing the commit that introduced rte_raw_cksum_mbuf() Warning Missing Cc: stable@dpdk.org for backport Warning Double-free in the fail: error path if some segments were already chained --- ## Deep Dive Review: `[v3] net/cksum: compute raw cksum for several segments` ### Summary This patch fixes a real bug in `rte_raw_cksum_mbuf()` where multi-segment checksum computation loses data due to two issues: (1) `rte_bswap16()` on a `uint32_t` drops the upper 16 bits, and (2) a `uint32_t` accumulator `sum` can overflow and lose carries. The fix widens `sum` to `uint64_t`, uses `rte_bswap32()` instead of `rte_bswap16()`, and adds a new `__rte_raw_cksum_reduce_u64()` reduction function. A test case is also added. --- ### Correctness Analysis **The core fix is correct.** The original code had two bugs in the "hard case" (multi-segment path): 1. **`rte_bswap16((uint16_t)tmp)`** — `tmp` is the result of `__rte_raw_cksum()` which returns a value in the range [0, 0x1FFFE] (a folded-once 32-bit sum). Casting to `uint16_t` before byte-swapping silently drops the carry bit. The fix to use `rte_bswap32(tmp)` preserves all bits. 2. **`uint32_t sum` overflow** — After the `bswap32`, `tmp` can be up to 0xFFFF0000. Accumulating multiple such values in a `uint32_t` can overflow and lose carries. Widening `sum` to `uint64_t` prevents this. **`__rte_raw_cksum_reduce_u64()` correctness check:** The function folds a 64-bit sum into 16 bits by first reducing the low 32 bits, then reducing the high 32 bits, adding them, and reducing again. This is correct — it handles the full range of a 64-bit accumulator. The intermediate `tmp` is `uint32_t`, and the sum of two `uint16_t` values fits in `uint32_t`, so the final `__rte_raw_cksum_reduce(tmp)` correctly folds any carry. **No correctness bugs found in the fix itself.** --- ### Potential Issues **Error: Missing `Fixes:` tag and `Cc: stable@dpdk.org`** This patch fixes a real bug in existing code (`rte_raw_cksum_mbuf` producing incorrect checksums for multi-segment mbufs). Per DPDK process, bug fixes should include a `Fixes:` tag referencing the commit that introduced the bug, and `Cc: stable@dpdk.org` for backport consideration. Both are missing. **Warning: Commit message body starts with "The"** While the guidelines specifically forbid starting with "It", the commit message could be improved. More importantly, the body could more clearly describe the *impact* — that multi-segment packets get incorrect checksums, potentially causing packet drops or silent data corruption. The current description is somewhat mechanical ("drop high 16 bit") without stating the user-visible consequence. **Warning: Double-free risk in `test_l4_cksum_multi_mbufs()` error path** Lines 228-229 (the success path) and lines 234-237 (the `fail:` path) both iterate and free `m[i]`. However, after a successful `rte_pktmbuf_chain(m_hdr, m[i])` call, `m[i]` becomes a segment of the chained mbuf `m_hdr` (which is `m[0]`). Freeing `m[0]` via `rte_pktmbuf_free()` will walk the chain and free all chained segments. Then freeing `m[1]`, `m[2]`, etc. individually would be a **double-free**. On the **success path** (lines 228-229): `rte_pktmbuf_free(m[0])` frees the entire chain including `m[1]` and `m[2]`. Then `rte_pktmbuf_free(m[1])` and `rte_pktmbuf_free(m[2])` double-free those segments. This is a bug. On the **error path** (lines 234-237): The same double-free occurs if `rte_pktmbuf_chain` succeeded for some segments before a later failure. **Suggested fix for the success path:** ```c rte_pktmbuf_free(m_hdr); /* frees entire chain */ return 0; ``` **Suggested fix for the error path:** ```c fail: if (m_hdr) rte_pktmbuf_free(m_hdr); /* frees chained segments */ /* Free any remaining unchained segments */ for (i = (m_hdr ? 1 : 0); i < segs_len; i++) { /* Only free if not yet chained */ if (m[i] && m[i] != m_hdr && m[i]->next == NULL) rte_pktmbuf_free(m[i]); } ``` Or more simply, since chain failures are unlikely, set `m[i] = NULL` after a successful chain: ```c if (rte_pktmbuf_chain(m_hdr, m[i]) == 0) m[i] = NULL; /* now owned by chain */ else GOTO_FAIL("Cannot chain mbuf"); ``` Then both the success and error cleanup paths can safely iterate and free all non-NULL `m[i]`. **Warning: `off` variable type mismatch** `off` is declared as `size_t` on line 186 but is later reused on line 218 as `off = hdr_lens.l2_len + hdr_lens.l3_len` for a different purpose (L4 offset passed to the cksum verify functions). This reuse is not a bug per se, but a different variable name (e.g., `l4_off`) would improve clarity and avoid confusion. Minor style point. **Info: Test coverage is good** The test data uses three segments with an odd-length middle segment (61 bytes), which exercises the byte-swap alignment correction. This is exactly the scenario that triggers the bug. The test validates via `rte_ipv4_udptcp_cksum_mbuf_verify()`, confirming the checksum matches end-to-end. --- ### Summary of Findings | Severity | Issue | |----------|-------| | **Error** | Double-free of chained mbuf segments in `test_l4_cksum_multi_mbufs()` — success path frees `m[0]` (the whole chain) then also frees `m[1]`, `m[2]` | | **Error** | Missing `Fixes:` tag referencing the commit that introduced `rte_raw_cksum_mbuf()` | | **Warning** | Missing `Cc: stable@dpdk.org` for backport | | **Warning** | Double-free in the `fail:` error path if some segments were already chained | ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2026-03-06 15:17 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-31 9:55 [PATCH] net/cksum: compute raw cksum for several segments 苏赛 2025-07-31 10:51 ` Marat Khalili 2025-07-31 11:03 ` Marat Khalili 2025-07-31 11:31 ` [External] " Su Sai 2025-07-31 11:43 ` Marat Khalili 2025-07-31 11:46 ` Marat Khalili 2025-07-31 12:22 ` zhoumin 2025-08-01 7:26 ` Su Sai 2025-08-01 15:28 ` [v2] " Su Sai 2025-08-01 16:39 ` Marat Khalili 2025-08-02 11:08 ` [v3] " Su Sai 2025-08-03 16:08 ` Stephen Hemminger 2025-08-04 3:54 ` Su Sai 2025-08-05 8:55 ` Marat Khalili 2026-02-20 15:49 ` Marat Khalili 2026-02-20 17:23 ` Thomas Monjalon 2026-02-20 18:17 ` Marat Khalili 2026-02-20 18:35 ` Marat Khalili 2026-02-27 6:36 ` su sai 2026-02-27 7:31 ` su sai 2026-03-06 15:17 ` Marat Khalili 2025-08-11 14:42 ` Thomas Monjalon 2025-08-12 3:03 ` su sai 2026-02-20 17:49 ` Stephen Hemminger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox