From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Federico Brasili <federico.brasili@gmail.com>, netdev@vger.kernel.org
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Florian Westphal <fw@strlen.de>,
netfilter-devel@vger.kernel.org
Subject: Re: [net] AF_PACKET PACKET_VNET_HDR CHECKSUM_PARTIAL packets bypass ct invalid classification
Date: Tue, 19 May 2026 21:29:23 -0400
Message-ID: <willemdebruijn.kernel.fc94e94f7ffd@gmail.com>
In-Reply-To: <20260519132535.3806659-1-federico.brasili@gmail.com>

Federico Brasili wrote:
> Hello,
>
> I would like to ask for feedback on a possible checksum/conntrack inconsistency in the AF_PACKET PACKET_VNET_HDR transmit path.
>
> A locally injected IPv4/UDP packet with an invalid raw UDP checksum is classified as ct state invalid when sent as a normal AF_PACKET raw frame. However, an otherwise equivalent packet sent through AF_PACKET with PACKET_VNET_HDR and VIRTIO_NET_HDR_F_NEEDS_CSUM is not classified as invalid and is delivered to a UDP socket, even though packet sockets still observe the UDP checksum field unchanged and report CSUMNOTREADY.
>
> Minimal behavior observed:
>
> 1. RAW_BAD
>
> AF_PACKET raw frame
> UDP checksum field: 0x1111
>
> nft:
> ct state invalid counter packets 1 drop
> udp dport 12345 counter packets 0 accept
>
> UDP socket:
> no packet received
>
> 2. VNET_BAD
>
> AF_PACKET + PACKET_VNET_HDR
> VIRTIO_NET_HDR_F_NEEDS_CSUM
> csum_start = 34
> csum_offset = 6
> UDP checksum field: 0x1111
>
> packet socket:
> PACKET_AUXDATA reports CSUMNOTREADY
> UDP header still contains checksum 0x1111
>
> nft:
> ct state invalid counter packets 0 drop
> udp dport 12345 counter packets 1 accept
>
> UDP socket:
> packet received
>
> A trace of the VNET case shows the packet being converted to CHECKSUM_PARTIAL and reaching conntrack/UDP in that state:
>
> skb_partial_csum_set(... arg_start=34 arg_off=6) = 1
> XMIT ip_summed=3 csum_start=36 csum_offset=6
> NF_CT_UDP ip_summed=3 csum_start=36 csum_offset=6
> UDP_RCV ip_summed=3 csum_start=36 csum_offset=6
> UDP_QUEUE ip_summed=3 csum_start=36 csum_offset=6
>
> The relevant path appears to be:
>
> net/packet/af_packet.c
> packet_snd()
> tpacket_snd()
> __packet_snd_vnet_parse()
> virtio_net_hdr_to_skb()
>
> include/linux/virtio_net.h
> __virtio_net_hdr_to_skb()
> skb_partial_csum_set()
>
> The same behavior was also reproduced through PACKET_TX_RING + PACKET_VNET_HDR.
>
> An explicit nftables rule such as udp dport 12345 drop still works correctly, so this is not a general firewall bypass. The observed difference is specifically around checksum-invalid classification: raw invalid packets are treated as ct state invalid, while PACKET_VNET_HDR/NEEDS_CSUM packets with the same invalid raw checksum are not.
>
> My question is whether this is considered intended behavior for locally injected CHECKSUM_PARTIAL skbs, or whether AF_PACKET should reject or normalize this case before the packet reaches conntrack/UDP.

This is expected.

The VIRTIO_NET_HDR_F_NEEDS_CSUM flag on transmit indicates that
checksum offload (CHECKSUM_PARTIAL) is to be programmed. The sender
supplies the checksum start and offset as instructions for that offload.

Handling of CHECKSUM_PARTIAL skbuffs inside the kernel is described at
the top of skbuff.h. Note the section on receive processing, in
particular the point about "are considered verified":

 * - %CHECKSUM_PARTIAL
 *
 *   A checksum is set up to be offloaded to a device as described in the
 *   output description for CHECKSUM_PARTIAL. This may occur on a packet
 *   received directly from another Linux OS, e.g., a virtualized Linux kernel
 *   on the same host, or it may be set in the input path in GRO or remote
 *   checksum offload. For the purposes of checksum verification, the checksum
 *   referred to by skb->csum_start + skb->csum_offset and any preceding
 *   checksums in the packet are considered verified. Any checksums in the
 *   packet that are after the checksum being offloaded are not considered to
 *   be verified.
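
To make the rule above concrete, here is a minimal userspace sketch. It is
not kernel code: struct toy_skb, struct toy_udphdr, toy_needs_sw_verify() and
toy_demo() are hypothetical names made up for illustration, offsets are taken
relative to the start of the frame rather than skb->head, and
CHECKSUM_UNNECESSARY is simplified (csum_level is ignored). Only the numeric
CHECKSUM_* values and the 34/6 offsets come from skbuff.h and the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Numeric values as defined in include/linux/skbuff.h. */
enum {
	CHECKSUM_NONE        = 0,
	CHECKSUM_UNNECESSARY = 1,
	CHECKSUM_COMPLETE    = 2,
	CHECKSUM_PARTIAL     = 3,	/* the ip_summed=3 seen in the trace */
};

/* UDP header layout (RFC 768); hypothetical local copy for offsetof(). */
struct toy_udphdr {
	uint16_t source;
	uint16_t dest;
	uint16_t len;
	uint16_t check;
};

#define TOY_ETH_HLEN   14	/* Ethernet header, no VLAN tag */
#define TOY_IPV4_HLEN  20	/* IPv4 header, no options */

/* Hypothetical stand-in for the checksum-related skb fields. Offsets are
 * relative to the start of the frame, not to skb->head as in the kernel. */
struct toy_skb {
	int ip_summed;
	size_t csum_start;
	size_t csum_offset;
};

/* Toy version of the quoted rule: must software still verify the
 * checksum stored at frame offset field_off? */
static bool toy_needs_sw_verify(const struct toy_skb *skb, size_t field_off)
{
	switch (skb->ip_summed) {
	case CHECKSUM_UNNECESSARY:
		/* Simplification: csum_level is not modelled. */
		return false;
	case CHECKSUM_PARTIAL:
		/* The offloaded checksum and any preceding ones count as verified. */
		return field_off > skb->csum_start + skb->csum_offset;
	default:
		/* CHECKSUM_NONE / CHECKSUM_COMPLETE: the stack still has to check. */
		return true;
	}
}

/* Replays the two cases from the report against the toy model. */
static void toy_demo(void)
{
	size_t udp_check = TOY_ETH_HLEN + TOY_IPV4_HLEN +
			   offsetof(struct toy_udphdr, check);	/* 14 + 20 + 6 */
	struct toy_skb raw_bad  = { CHECKSUM_NONE, 0, 0 };
	struct toy_skb vnet_bad = { CHECKSUM_PARTIAL, 34, 6 };

	assert(udp_check == 40);
	assert(toy_needs_sw_verify(&raw_bad, udp_check));	/* RAW_BAD: field is checked */
	assert(!toy_needs_sw_verify(&vnet_bad, udp_check));	/* VNET_BAD: field is skipped */
}
```

In this model the 0x1111 value in the RAW_BAD case is checked and fails, while
in the VNET_BAD case the same field sits exactly at csum_start + csum_offset
and is therefore treated as verified, which matches the difference observed
in the nft counters.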