Netdev List
From: Ren Wei <n05ec@lzu.edu.cn>
To: netdev@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: jhs@mojatatu.com, jiri@resnulli.us, kuba@kernel.org,
	paulb@nvidia.com, victor@mojatatu.com, yuantan098@gmail.com,
	yifanwucs@gmail.com, tomapufckgml@gmail.com, bird@lzu.edu.cn,
	xizh2024@lzu.edu.cn, n05ec@lzu.edu.cn
Subject: [PATCH net v2 0/2] net/sched: act_ct: preserve tc_skb_cb across defragmentation
Date: Sun, 14 Jun 2026 01:42:38 +0800
Message-ID: <cover.1781358691.git.xizh2024@lzu.edu.cn>

From: Zihan Xi <xizh2024@lzu.edu.cn>

Hi Linux kernel maintainers,

We found and validated a bug in net/sched/act_ct.c. It is reachable
when act_ct is configured through TC on a netdev, which requires
CAP_NET_ADMIN. We have tested the fix and do not expect it to affect
other functionality.

We provide bug details, a PoC, and a crash log below.

v2 adds a tc-testing (TDC) selftest case in patch 2, per maintainer
feedback.

---- details below ----

Bug details:

tcf_ct_handle_fragments() calls nf_ct_handle_fragments() without
saving and restoring skb->cb. The defrag helper clears IPCB/IP6CB,
which share storage with the tc_skb_cb/qdisc_skb_cb control buffer
defined in include/net/sch_generic.h. Fragmented traffic passing
through act_ct therefore loses qdisc metadata such as pkt_segs.
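
As a userspace illustration of the aliasing described above, the
following sketch models skb->cb as a 48-byte buffer that two layers
interpret differently. The field layout (a u32 pkt_len followed by a
u16 pkt_segs at offset 0) and the helper names are hypothetical and
chosen only for the example; the real layouts are in
include/net/sch_generic.h.

```python
import struct

CB_SIZE = 48  # skb->cb is a 48-byte scratch area shared by all layers

# Hypothetical layout, for illustration only: u32 pkt_len, u16 pkt_segs.
QDISC_FMT = "<IH"


def set_qdisc_meta(cb, pkt_len, pkt_segs):
    """Write qdisc metadata into the start of the control buffer."""
    struct.pack_into(QDISC_FMT, cb, 0, pkt_len, pkt_segs)


def get_pkt_segs(cb):
    """Read pkt_segs back through the qdisc view of the buffer."""
    return struct.unpack_from(QDISC_FMT, cb, 0)[1]


def clear_ip_cb(cb):
    """Stand-in for the defrag helper zeroing IPCB/IP6CB (same bytes)."""
    cb[:] = bytes(len(cb))


cb = bytearray(CB_SIZE)
set_qdisc_meta(cb, pkt_len=1514, pkt_segs=1)
before = get_pkt_segs(cb)
clear_ip_cb(cb)
after = get_pkt_segs(cb)
print(before, after)  # prints "1 0"
```

Because both views address the same bytes, zeroing one of them is
enough to make pkt_segs read back as 0.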

Later qdisc dequeue paths call qdisc_bstats_update() ->
qdisc_pkt_segs(). For a non-GSO skb, the clobbered pkt_segs value of 0
trips DEBUG_NET_WARN_ON_ONCE() in qdisc_pkt_segs(). With
panic_on_warn=1, the kernel panics.

Unlike ovs_ct_handle_fragments() in net/openvswitch/conntrack.c, the
act_ct caller restores only mru after defrag, not the full control
buffer. Patch 1 saves and restores struct tc_skb_cb around
nf_ct_handle_fragments(), matching the OVS pattern.
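
The save/restore pattern can be sketched in userspace as follows. This
is a model, not the patch: clobbering_defrag() is a hypothetical
stand-in for nf_ct_handle_fragments(), the control buffer is a plain
bytearray, and restoring only on success is a choice made for this
sketch.

```python
CB_SIZE = 48  # size of skb->cb


def clobbering_defrag(cb):
    """Hypothetical stand-in for the defrag helper: zeroes the buffer."""
    cb[:] = bytes(len(cb))
    return 0  # 0 means success in this model


def handle_fragments_preserving_cb(cb):
    """Snapshot the control buffer, run defrag, then put it back."""
    saved = bytes(cb)            # immutable copy of the whole buffer
    err = clobbering_defrag(cb)
    if err:
        return err               # sketch assumption: no restore on error
    cb[:] = saved                # restore every byte, not just one field
    return 0


cb = bytearray(CB_SIZE)
cb[0:2] = (1).to_bytes(2, "little")  # pretend these bytes hold pkt_segs
ret = handle_fragments_preserving_cb(cb)
segs = int.from_bytes(cb[0:2], "little")
print(ret, segs)  # prints "0 1"
```

Copying the whole buffer rather than a single field means that any
metadata added to the structure later is preserved as well.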

Reproducer:

Run as root in the guest (QEMU bullseye image, eth0):

    chmod +x ./poc.sh
    ./poc.sh eth0 10.0.2.2 100

The script installs a root prio qdisc and a clsact egress filter with
"action ct", then sends oversized UDP datagrams with PMTUD disabled to
force IPv4 fragmentation through the act_ct defrag path.
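
To see why the 4000-byte payload used by the script is fragmented,
here is a small worked calculation. It assumes a 1500-byte MTU and a
20-byte IPv4 header without options; neither value is stated in this
report, and ipv4_fragment_count() is a hypothetical helper.

```python
def ipv4_fragment_count(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Number of IPv4 packets needed to carry one UDP datagram."""
    ip_payload = udp_payload + udp_header
    if ip_payload <= mtu - ip_header:
        return 1  # fits in a single packet, no fragmentation
    # Every fragment except the last carries a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return (ip_payload + per_fragment - 1) // per_fragment  # ceiling division


print(ipv4_fragment_count(4000))  # prints 3
```

Under these assumptions each sendto() produces three fragments
(1480 + 1480 + 1048 bytes of IP payload).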

We ran the PoC in a 2 vCPU, 2 GB RAM x86 QEMU environment.

------BEGIN poc.sh------

#!/bin/sh
set -eu

IFACE="${1:-eth0}"
DST="${2:-10.0.2.2}"
COUNT="${3:-100}"

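# Turn the warning into a panic so the bug is easy to observe.
sysctl -w kernel.panic_on_warn=1 >/dev/null

# Remove qdiscs left over from a previous run.
tc qdisc del dev "$IFACE" clsact 2>/dev/null || true
tc qdisc del dev "$IFACE" root 2>/dev/null || true

# Root prio qdisc plus a clsact egress filter that matches every IPv4
# packet and passes it through act_ct.
tc qdisc add dev "$IFACE" root handle 1: prio
tc qdisc add dev "$IFACE" clsact
tc filter add dev "$IFACE" egress protocol ip pref 1 u32 \
	match u32 0 0 action ct zone 1 pipe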

python3 - "$DST" "$COUNT" <<'PY'
import socket
import sys
import time

dst = sys.argv[1]
count = int(sys.argv[2])

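# Linux values from <linux/in.h>.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DONT = 0

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Disable PMTUD so the stack fragments oversized datagrams locally.
s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)
# Larger than a typical 1500-byte MTU, so each datagram is fragmented.
payload = b"A" * 4000

for _ in range(count):
    s.sendto(payload, (dst, 9))
    time.sleep(0.01)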
PY

------END poc.sh------

----BEGIN crash log----

[  549.900801][T10210] Kernel panic - not syncing: kernel: panic_on_warn set ...
[  549.901406][T10210] CPU: 2 UID: 0 PID: 10210 Comm: python3 Not tainted 7.1.0-rc1 #2 PREEMPT(full)
[  549.902720][T10210] Call Trace:
[  549.903756][T10210]  ? qdisc_dequeue_head+0x287/0x370
[  549.904713][T10210]  check_panic_on_warn+0x61/0x80
[  549.905053][T10210]  __warn+0xe8/0x330
[  549.905345][T10210]  ? qdisc_dequeue_head+0x287/0x370
[  549.909442][T10210] RIP: 0010:qdisc_dequeue_head+0x287/0x370
[  549.914217][T10210]  prio_dequeue+0x40c/0x6a0
[  549.914539][T10210]  __qdisc_run+0x170/0x1b30
[  549.915561][T10210]  __dev_queue_xmit+0x25e6/0x3ac0
[  549.920352][T10210]  ip_do_fragment+0x1188/0x19a0
[  549.924214][T10210]  udp_send_skb+0x885/0x1270
[  549.924556][T10210]  udp_sendmsg+0x13f3/0x20a0

-----END crash log-----

Best regards,
Zihan Xi

Zihan Xi (2):
  net/sched: act_ct: preserve tc_skb_cb across defragmentation
  selftests/tc-testing: act_ct: add TDC test for skb cb preservation
    across defrag

 net/sched/act_ct.c                            |  7 ++--
 .../tc-testing/tc-tests/actions/ct.json       | 38 +++++++++++++++++++
 2 files changed, 42 insertions(+), 3 deletions(-)

-- 
2.43.0

