* Confused about ip_summed member in sk_buff
From: J.J. Mars @ 2022-10-21 6:29 UTC
To: netdev

Hi everyone, I'm new here and I hope this mail won't disturb you :)

Recently I was working on something involving ip_summed, and I'm really
confused about what ip_summed exactly means.
This member is defined with the comment 'Driver fed us an IP checksum', so
I guessed it is about the IP/L3 checksum status.
But the possible values of ip_summed, like CHECKSUM_UNNECESSARY, are about L4.

What confuses me most is that, judging by its name, ip_summed seems to
tell us whether the IP/L3 checksum is available, but judging by its
values it seems to tell us the checksum status of L4.

Besides, in ip_rcv() ip_summed does not seem to be consulted before the
checksum of the IP header is calculated.

So does ip_summed indicate the L3 checksum status or the L4 checksum
status?
If L4, why is it named like that?

Best regards,
Mars
* Re: Confused about ip_summed member in sk_buff
From: Cong Wang @ 2022-10-22 19:51 UTC
To: J.J. Mars; +Cc: netdev

On Fri, Oct 21, 2022 at 02:29:26PM +0800, J.J. Mars wrote:
> Hi everyone, I'm new here and I hope this mail won't disturb you :)
>
> Recently I was working with something about ip_summed, and I'm really
> confused about the question what does ip_summed exactly mean?
> This member is defined with comment 'Driver fed us an IP checksum'. So
> I guess it's about IP/L3 checksum status.
> But the possible value of ip_summed like CHECKSUM_UNNECESSARY is about L4.
>
> What confused me a lot is ip_summed seems to tell us the checksum of
> IP/L3 layer is available from its name.
> But it seems to tell us the checksum status of L4 layer from its value.
>
> Besides, in ip_rcv() it seems the ip_summed is not used before
> calculating the checksum of IP header.
>
> So does ip_summed indicate the status of L3 checksum status or L4
> checksum status?
> If L4, why is it named like that?

The name itself is indeed confusing; however, there are some good
explanations in the code, at the beginning of include/linux/skbuff.h. I
think they could help clear up your confusion here.

Thanks.
* Re: Confused about ip_summed member in sk_buff
From: J.J. Mars @ 2022-11-08 12:32 UTC
To: Cong Wang; +Cc: netdev

Thanks for your reply. I've been busy lately, so I couldn't reply sooner.
I've read the comments about ip_summed in skbuff.h many times but they
still puzzle me, so I'll write my questions here directly.

First of all, I focus on the receive direction only.

Q1: The section on 'CHECKSUM_COMPLETE' says 'The device supplied
checksum of the _whole_ packet as seen by netif_rx() and fills out in
skb->csum. Meaning, the hardware doesn't need to parse L3/L4 headers
to implement this.' So I assume the 'device' is a NIC or something
like that which supplies the checksum, while the 'hardware' doesn't
need to parse L3/L4 headers. What's the difference between 'device' and
'hardware'? Which one is the NIC?

Q2: Which layer does the checksum in the 'CHECKSUM_COMPLETE' section
refer to, given that it says 'The device supplied checksum of the
_whole_ packet'? I assume it refers to both the L3 and L4 checksums
because of the word 'whole'.

Q3: The full checksum is not calculated when 'CHECKSUM_UNNECESSARY' is
set. What does the word 'full' mean? Does it refer to both L3 and L4?
Since it says 'CHECKSUM_UNNECESSARY' is set for some L4 packets, what
is the status of the L3 checksum then? MUST the L3 checksum be correct
when 'CHECKSUM_UNNECESSARY' is set?

Q4: The section on 'CHECKSUM_PARTIAL' describes a state where SOME part
of the checksum is valid. Since it says this value is set in the GRO
path, does it refer to L4 only?

Q5: Of 'CHECKSUM_COMPLETE' and 'CHECKSUM_UNNECESSARY', which one gives
the most complete checksum status? I assume it's CHECKSUM_UNNECESSARY.

Q6: Does ip_summed describe the status of both L3 and L4, or just L4?

I hope to hear from you all.

Best wishes.
Cong Wang <xiyou.wangcong@gmail.com> wrote on Sun, Oct 23, 2022 at 03:51:
>
> On Fri, Oct 21, 2022 at 02:29:26PM +0800, J.J. Mars wrote:
> > Hi everyone, I'm new here and I hope this mail won't disturb you :)
> >
> > Recently I was working with something about ip_summed, and I'm really
> > confused about the question what does ip_summed exactly mean?
> > This member is defined with comment 'Driver fed us an IP checksum'. So
> > I guess it's about IP/L3 checksum status.
> > But the possible value of ip_summed like CHECKSUM_UNNECESSARY is about L4.
> >
> > What confused me a lot is ip_summed seems to tell us the checksum of
> > IP/L3 layer is available from its name.
> > But it seems to tell us the checksum status of L4 layer from its value.
> >
> > Besides, in ip_rcv() it seems the ip_summed is not used before
> > calculating the checksum of IP header.
> >
> > So does ip_summed indicate the status of L3 checksum status or L4
> > checksum status?
> > If L4, why is it named like that?
>
> The name itself is indeed confusing, however, there are some good
> explanations in the code, at the beginning of include/linux/skbuff.h. I
> think that could help you to clear your confusions here.
>
> Thanks.
* Re: Confused about ip_summed member in sk_buff
From: Edward Cree @ 2022-11-08 17:12 UTC
To: J.J. Mars, Cong Wang; +Cc: netdev

On 08/11/2022 12:32, J.J. Mars wrote:
> Thanks for your reply. I've been busy these days so that I can't reply on time.
> I've read the annotation about ip_summed in skbuff.h many times but it
> still puzzles me so I write my questions here directly.
>
> First of all, I focus on the receive direction only.
>
> Q1: In section 'CHECKSUM_COMPLETE' it said 'The device supplied
> checksum of the _whole_ packet as seen by netif_rx() and fills out in
> skb->csum. Meaning, the hardware doesn't need to parse L3/L4 headers
> to implement this.' So I assume the 'device' is a nic or something
> like that which supplied checksum, but the 'hardware' doesn't need to
> parse L3/L4 headers. So what's the difference between 'device' and
> 'hardware'? Which one is the nic?

Both.
To implement this feature, the NIC is supposed to treat the packet data
as an unstructured array of 16-bit integers, and compute their
(ones-complement) sum.
When the kernel parses the packet headers, it will subtract out from
this sum the headers it consumes, and then check that what's left over
matches the sum of the L4 pseudo header (as it should for a correctly
checksummed packet).
Note that this design means protocol parsing happens only in software,
with the NIC completely protocol-agnostic; thus upgrades to support
new protocols only require a kernel upgrade and not a new NIC.

> Q2: Which layer does the checksum refer in section 'CHECKSUM_COMPLETE'
> as it said 'The device supplied checksum of the _whole_ packet'. I
> assume it refers to both L3 and L4 checksum because of the word
> 'whole'.

See above - the device is not supposed to know or care where L3 or L4
headers start or where their checksum fields live, it just sums the
whole thing, and the kernel mathematically derives the sum of the L4
payload from that.

> Q3: The full checksum is not calculated when 'CHECKSUM_UNNECESSARY' is
> set. What does the word 'full' mean? Does it refer to both L3 and L4?
> As it said 'CHECKSUM_UNNECESSARY' is set for some L4 packets, what's
> the status of L3 checksum now? Does L3 checksum MUST be right when
> 'CHECKSUM_UNNECESSARY' is set?

'full' here refers to the CHECKSUM_COMPLETE sum described above.
CHECKSUM_UNNECESSARY refers to the L4 checksum, and may be set by the
driver when the hardware has determined that the L4 checksum is
correct. This is an inferior hardware design because it can only
support those specific protocols the hardware understands; but we
handle it in the kernel because lots of hardware like that exists :(
L3 checksums are never offloaded to hardware (neither by
CHECKSUM_COMPLETE nor by CHECKSUM_UNNECESSARY); because they only
sum over the L3 header (not its payload), they are cheap to compute
in software (the costly bit is actually bringing the data into cache,
and we have to do that anyway to parse the header, so summing it at
the same time is almost free).
AFAIK a driver may set CHECKSUM_UNNECESSARY even if the L3 checksum is
incorrect, because it only covers the L4 sum; but I'm not 100% sure.

> Q4: In section 'CHECKSUM_PARTIAL' it described status of SOME part of
> the checksum is valid. As it said this value is set in GRO path, does
> it refer to L4 only?

Drivers should not use CHECKSUM_PARTIAL on the RX side; only on TX
(for which see [1] for additional documentation).

> Q5: 'CHECKSUM_COMPLETE' and 'CHECKSUM_UNNECESSARY', which one supplies
> the most complete status of checksum? I assume it's
> CHECKSUM_UNNECESSARY.

CHECKSUM_COMPLETE is preferred, as per above remarks about protocols.

> Q6: The name ip_summed doesn't describe the status of L3 only but also
> L4? Or just L4?

Just L4. It's called "ip_summed" because the "16-bit ones-complement
sum" style of checksum is also known as the "Internet checksum"
since it is used repeatedly in the Internet protocol suite, such as
in TCP and UDP as well as IPv4. Yes, this is confusing, but it's
too late to rename it now.

HTH,
-ed

[1] https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html#tx-checksum-offload
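
The CHECKSUM_COMPLETE flow described in the answers to Q1 and Q2 can be
sketched in user-space C. This is a minimal illustration under stated
assumptions, not kernel code: all helper names (csum_add_buf,
csum_fold16, csum_sub16, device_sum, ipv4_hdr_csum_ok, l4_csum_ok) are
hypothetical, the packet is assumed to be IPv4 and to start at the IP
header (no L2 header), and the device sum is assumed to cover exactly
those bytes. The "device" half only adds up 16-bit words; every bit of
protocol knowledge lives in the "kernel" half.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a buffer to a running sum as big-endian 16-bit words.
 * An odd trailing byte is padded with zero. The result is not folded. */
static uint32_t csum_add_buf(uint32_t sum, const uint8_t *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    return sum;
}

/* Fold a 32-bit accumulator to 16 bits with end-around carry. */
static uint16_t csum_fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones-complement subtraction: a - b is a + ~b. */
static uint16_t csum_sub16(uint16_t a, uint16_t b)
{
    return csum_fold16((uint32_t)a + (uint16_t)~b);
}

/* "Device" half: protocol-agnostic sum over the whole packet. */
static uint16_t device_sum(const uint8_t *pkt, size_t len)
{
    return csum_fold16(csum_add_buf(0, pkt, len));
}

/* L3 check, always done in software: a valid IPv4 header sums to 0xffff. */
static int ipv4_hdr_csum_ok(const uint8_t *ip, size_t ip_hlen)
{
    return csum_fold16(csum_add_buf(0, ip, ip_hlen)) == 0xffff;
}

/* "Kernel" half: subtract the consumed IP header from the device sum,
 * add the IPv4 pseudo header, and expect ones-complement zero (0xffff). */
static int l4_csum_ok(uint16_t dev_sum, const uint8_t *ip, size_t ip_hlen,
                      size_t l4_len, uint8_t proto)
{
    uint16_t hdr = csum_fold16(csum_add_buf(0, ip, ip_hlen));
    uint16_t l4 = csum_sub16(dev_sum, hdr);
    /* Pseudo header: source and destination address, protocol, L4 length. */
    uint32_t pseudo = csum_add_buf(0, ip + 12, 8) + proto + (uint32_t)l4_len;

    return csum_fold16((uint32_t)l4 + csum_fold16(pseudo)) == 0xffff;
}
```

For a buffer pkt holding a len-byte IPv4/UDP packet with a 20-byte IP
header, l4_csum_ok(device_sum(pkt, len), pkt, 20, len - 20, 17) is
nonzero when the UDP checksum verifies. The header subtraction relies on
the IP header length being even, which always holds for IPv4.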
* Re: Confused about ip_summed member in sk_buff
From: J.J. Mars @ 2022-11-09 7:22 UTC
To: Edward Cree; +Cc: Cong Wang, netdev

Thank you Edward, your reply really helps me a lot. But I have some new
questions.

Q1: From your reply, it seems the NIC does not need to detect the
PROTOCOL of a packet. But on the RX side, some NIC drivers read packet
status from their rx descriptors, and that includes some checksum
status. Does this mean the NIC deals with PROTOCOL in some sense?
BTW, don't some advanced features like RSS need to detect the PROTOCOL
of a packet as well?
I have poor knowledge of NICs and drivers, so if my question is stupid
or bad please forgive me :)

Here are some rx descriptor status bits defined in e1000_hw.h:

#define E1000_RXD_STAT_UDPCS 0x10 /* UDP xsum calculated */
#define E1000_RXD_STAT_TCPCS 0x20 /* TCP xsum calculated */
#define E1000_RXD_STAT_IPCS 0x40 /* IP xsum calculated */

And in e1000_rx_checksum the driver uses the PROTOCOL status bits to
decide whether to set CHECKSUM_UNNECESSARY or not.

Q2: It seems CHECKSUM_COMPLETE covers more data than
CHECKSUM_UNNECESSARY. But when the stack handles the packet, e.g. in
tcp_v4_rcv->skb_checksum_init, if CHECKSUM_UNNECESSARY is set the stack
does not have to calculate any checksum, while there is still some work
to do when CHECKSUM_COMPLETE is set. Does that mean
CHECKSUM_UNNECESSARY is more useful for reducing stack overhead on
certain protocols?

Best wishes.

Edward Cree <ecree.xilinx@gmail.com> wrote on Wed, Nov 9, 2022 at 01:13:
>
> On 08/11/2022 12:32, J.J. Mars wrote:
> > Thanks for your reply. I've been busy these days so that I can't reply on time.
> > I've read the annotation about ip_summed in skbuff.h many times but it
> > still puzzles me so I write my questions here directly.
> >
> > First of all, I focus on the receive direction only.
> >
> > Q1: In section 'CHECKSUM_COMPLETE' it said 'The device supplied
> > checksum of the _whole_ packet as seen by netif_rx() and fills out in
> > skb->csum. Meaning, the hardware doesn't need to parse L3/L4 headers
> > to implement this.' So I assume the 'device' is a nic or something
> > like that which supplied checksum, but the 'hardware' doesn't need to
> > parse L3/L4 headers. So what's the difference between 'device' and
> > 'hardware'? Which one is the nic?
>
> Both.
> To implement this feature, the NIC is supposed to treat the packet data
> as an unstructured array of 16-bit integers, and compute their
> (ones-complement) sum.
> When the kernel parses the packet headers, it will subtract out from
> this sum the headers it consumes, and then check that what's left over
> matches the sum of the L4 pseudo header (as it should for a correctly
> checksummed packet).
> Note that this design means protocol parsing happens only in software,
> with the NIC completely protocol-agnostic; thus upgrades to support
> new protocols only require a kernel upgrade and not a new NIC.
>
> > Q2: Which layer does the checksum refer in section 'CHECKSUM_COMPLETE'
> > as it said 'The device supplied checksum of the _whole_ packet'. I
> > assume it refers to both L3 and L4 checksum because of the word
> > 'whole'.
>
> See above - the device is not supposed to know or care where L3 or L4
> headers start or where their checksum fields live, it just sums the
> whole thing, and the kernel mathematically derives the sum of the L4
> payload from that.
>
> > Q3: The full checksum is not calculated when 'CHECKSUM_UNNECESSARY' is
> > set. What does the word 'full' mean? Does it refer to both L3 and L4?
> > As it said 'CHECKSUM_UNNECESSARY' is set for some L4 packets, what's
> > the status of L3 checksum now? Does L3 checksum MUST be right when
> > 'CHECKSUM_UNNECESSARY' is set?
>
> 'full' here refers to the CHECKSUM_COMPLETE sum described above.
> CHECKSUM_UNNECESSARY refers to the L4 checksum, and may be set by the
> driver when the hardware has determined that the L4 checksum is
> correct. This is an inferior hardware design because it can only
> support those specific protocols the hardware understands; but we
> handle it in the kernel because lots of hardware like that exists :(
> L3 checksums are never offloaded to hardware (neither by
> CHECKSUM_COMPLETE nor by CHECKSUM_UNNECESSARY); because they only
> sum over the L3 header (not its payload), they are cheap to compute
> in software (the costly bit is actually bringing the data into cache,
> and we have to do that anyway to parse the header, so summing it at
> the same time is almost free).
> AFAIK a driver may set CHECKSUM_UNNECESSARY even if the L3 checksum is
> incorrect, because it only covers the L4 sum; but I'm not 100% sure.
>
> > Q4: In section 'CHECKSUM_PARTIAL' it described status of SOME part of
> > the checksum is valid. As it said this value is set in GRO path, does
> > it refer to L4 only?
>
> Drivers should not use CHECKSUM_PARTIAL on the RX side; only on TX
> (for which see [1] for additional documentation).
>
> > Q5: 'CHECKSUM_COMPLETE' and 'CHECKSUM_UNNECESSARY', which one supplies
> > the most complete status of checksum? I assume it's
> > CHECKSUM_UNNECESSARY.
>
> CHECKSUM_COMPLETE is preferred, as per above remarks about protocols.
>
> > Q6: The name ip_summed doesn't describe the status of L3 only but also
> > L4? Or just L4?
>
> Just L4. It's called "ip_summed" because the "16-bit ones-complement
> sum" style of checksum is also known as the "Internet checksum"
> since it is used repeatedly in the Internet protocol suite, such as
> in TCP and UDP as well as IPv4. Yes, this is confusing, but it's
> too late to rename it now.
>
> HTH,
> -ed
>
> [1] https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html#tx-checksum-offload
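
The driver-side decision raised in Q1 of the last message, and the
stack-side cost raised in its Q2, might look roughly like the following.
This is a simplified user-space mock and not the real
e1000_rx_checksum(): MOCK_RXD_ERR_L4, mock_rx_checksum() and
mock_l4_verify_work() are hypothetical names introduced only for
illustration, and only the three status bits quoted from e1000_hw.h come
from the thread. The CHECKSUM_* values mirror include/linux/skbuff.h.

```c
#include <stdint.h>

/* Status bits as quoted from e1000_hw.h in the thread. */
#define E1000_RXD_STAT_UDPCS 0x10 /* UDP xsum calculated */
#define E1000_RXD_STAT_TCPCS 0x20 /* TCP xsum calculated */
#define E1000_RXD_STAT_IPCS  0x40 /* IP xsum calculated */

/* ip_summed values relevant on RX, as in include/linux/skbuff.h. */
#define CHECKSUM_NONE        0
#define CHECKSUM_UNNECESSARY 1
#define CHECKSUM_COMPLETE    2

/* Hypothetical error flag: "hardware found a bad L4 checksum". */
#define MOCK_RXD_ERR_L4 0x20

/* Driver side: a protocol-aware NIC reports what it parsed and checked.
 * The IP status bit does not influence ip_summed, which is about L4. */
static int mock_rx_checksum(uint8_t status, uint8_t errors)
{
    if (!(status & (E1000_RXD_STAT_TCPCS | E1000_RXD_STAT_UDPCS)))
        return CHECKSUM_NONE;        /* hardware did not look at L4 */
    if (errors & MOCK_RXD_ERR_L4)
        return CHECKSUM_NONE;        /* let the stack verify and drop */
    return CHECKSUM_UNNECESSARY;     /* hardware vouches for L4 */
}

/* Stack side: how much L4 verification work remains per ip_summed. */
enum l4_work { L4_SKIP, L4_FOLD_PSEUDO, L4_SUM_ALL };

static enum l4_work mock_l4_verify_work(int ip_summed)
{
    switch (ip_summed) {
    case CHECKSUM_UNNECESSARY:
        return L4_SKIP;              /* nothing left to compute */
    case CHECKSUM_COMPLETE:
        return L4_FOLD_PSEUDO;       /* add pseudo header, compare */
    default:
        return L4_SUM_ALL;           /* sum the whole segment in software */
    }
}
```

In this mock, a descriptor with only E1000_RXD_STAT_IPCS set still yields
CHECKSUM_NONE, matching the point that ip_summed describes L4 only. The
remaining work for CHECKSUM_COMPLETE is a handful of additions over the
pseudo header, whereas CHECKSUM_NONE costs a pass over the whole segment.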