From: Stephen Hemminger
To: netdev@vger.kernel.org
Cc: Stephen Hemminger, stable@vger.kernel.org
Subject: [PATCH 04/12] net/sched: netem: restructure dequeue to avoid re-entrancy with child qdisc
Date: Fri, 13 Mar 2026 14:15:04 -0700
Message-ID: <20260313211646.12549-5-stephen@networkplumber.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20260313211646.12549-1-stephen@networkplumber.org>
References: <20260313211646.12549-1-stephen@networkplumber.org>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

netem_dequeue() currently enqueues time-ready packets into the child
qdisc during the dequeue call path. This creates several problems:

1. Parent qdiscs like HFSC track class active/inactive state based
   on qlen transitions. The child enqueue during netem's dequeue
   can cause qlen to increase while the parent is mid-dequeue,
   leading to double-insertion in HFSC's eltree (CVE-2025-37890,
   CVE-2025-38001).

2. If the child qdisc is non-work-conserving (e.g., TBF), it may
   refuse to release packets during its dequeue even though they
   were just enqueued. The parent then sees netem returning NULL
   despite having backlog, violating the work-conserving contract
   and causing stalls with parents like DRR that deactivate classes
   in this case.

Restructure netem_dequeue so that when a child qdisc is present,
all time-ready packets are transferred from the tfifo to the child
in a batch before asking the child for output. This ensures the
child only receives packets whose delay has already elapsed.
The no-child path (tfifo direct dequeue) is unchanged.

Fixes: 50612537e9ab ("netem: fix classful handling")
Cc: stable@vger.kernel.org
Signed-off-by: Stephen Hemminger
---
 net/sched/sch_netem.c | 82 +++++++++++++++++++++++++++++--------------
 1 file changed, 56 insertions(+), 26 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 085fa3ad6f83..08006a60849e 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -726,7 +726,6 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 	struct netem_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *skb;
 
-tfifo_dequeue:
 	skb = __qdisc_dequeue_head(&sch->q);
 	if (skb) {
 deliver:
@@ -734,24 +733,28 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 		qdisc_bstats_update(sch, skb);
 		return skb;
 	}
-	skb = netem_peek(q);
-	if (skb) {
-		u64 time_to_send;
+
+	/* If we have a child qdisc, transfer all time-ready packets
+	 * from the tfifo into the child, then dequeue from the child.
+	 * This avoids enqueueing into the child during the parent's
+	 * dequeue callback, which can confuse parents that track
+	 * active/inactive state based on qlen transitions (HFSC).
+	 */
+	if (q->qdisc) {
 		u64 now = ktime_get_ns();
 
-		/* if more time remaining? */
-		time_to_send = netem_skb_cb(skb)->time_to_send;
-		if (q->slot.slot_next && q->slot.slot_next < time_to_send)
-			get_slot_next(q, now);
+		while ((skb = netem_peek(q)) != NULL) {
+			u64 t = netem_skb_cb(skb)->time_to_send;
+
+			if (t > now)
+				break;
+			if (q->slot.slot_next && q->slot.slot_next > now)
+				break;
 
-		if (time_to_send <= now && q->slot.slot_next <= now) {
 			netem_erase_head(q, skb);
 			q->t_len--;
 			skb->next = NULL;
 			skb->prev = NULL;
-			/* skb->dev shares skb->rbnode area,
-			 * we need to restore its value.
-			 */
 			skb->dev = qdisc_dev(sch);
 
 			if (q->slot.slot_next) {
@@ -762,7 +765,7 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 					get_slot_next(q, now);
 			}
 
-			if (q->qdisc) {
+			{
 				unsigned int pkt_len = qdisc_pkt_len(skb);
 				struct sk_buff *to_free = NULL;
 				int err;
@@ -774,34 +777,61 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 						qdisc_qstats_drop(sch);
 					sch->qstats.backlog -= pkt_len;
 					sch->q.qlen--;
-					qdisc_tree_reduce_backlog(sch, 1, pkt_len);
+					qdisc_tree_reduce_backlog(sch,
+								  1, pkt_len);
 				}
-				goto tfifo_dequeue;
 			}
+		}
+
+		skb = q->qdisc->ops->dequeue(q->qdisc);
+		if (skb) {
 			sch->q.qlen--;
 			goto deliver;
 		}
-
-		if (q->qdisc) {
-			skb = q->qdisc->ops->dequeue(q->qdisc);
-			if (skb) {
+	} else {
+		/* No child qdisc: dequeue directly from tfifo */
+		skb = netem_peek(q);
+		if (skb) {
+			u64 time_to_send;
+			u64 now = ktime_get_ns();
+
+			time_to_send = netem_skb_cb(skb)->time_to_send;
+			if (q->slot.slot_next &&
+			    q->slot.slot_next < time_to_send)
+				get_slot_next(q, now);
+
+			if (time_to_send <= now &&
+			    q->slot.slot_next <= now) {
+				netem_erase_head(q, skb);
+				q->t_len--;
+				skb->next = NULL;
+				skb->prev = NULL;
+				skb->dev = qdisc_dev(sch);
+
+				if (q->slot.slot_next) {
+					q->slot.packets_left--;
+					q->slot.bytes_left -=
+						qdisc_pkt_len(skb);
+					if (q->slot.packets_left <= 0 ||
+					    q->slot.bytes_left <= 0)
+						get_slot_next(q, now);
+				}
 				sch->q.qlen--;
 				goto deliver;
 			}
 		}
+	}
+
+	/* Schedule watchdog for next time-ready packet */
+	skb = netem_peek(q);
+	if (skb) {
+		u64 time_to_send = netem_skb_cb(skb)->time_to_send;
 
 		qdisc_watchdog_schedule_ns(&q->watchdog,
 					   max(time_to_send,
 					       q->slot.slot_next));
 	}
 
-	if (q->qdisc) {
-		skb = q->qdisc->ops->dequeue(q->qdisc);
-		if (skb) {
-			sch->q.qlen--;
-			goto deliver;
-		}
-	}
 	return NULL;
 }
 
-- 
2.51.0
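
The batch transfer described in the commit message can be illustrated
outside the kernel. What follows is a minimal userspace sketch under
stated assumptions, not the kernel code: struct pkt, struct fifo,
transfer_ready() and model_dequeue() are hypothetical names invented
for this illustration. The tfifo is modelled as an array that is
already sorted by time_to_send, the child is a plain FIFO, and slot
handling, statistics and the drop-on-enqueue-failure path of the real
patch are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: only the release time matters here. */
struct pkt {
	uint64_t time_to_send;
	int id;
};

#define QCAP 16

/* Hypothetical fixed-size ring used for both the delay queue and the child. */
struct fifo {
	struct pkt slots[QCAP];
	size_t head, len;
};

static int fifo_push(struct fifo *f, struct pkt p)
{
	if (f->len == QCAP)
		return -1;
	f->slots[(f->head + f->len) % QCAP] = p;
	f->len++;
	return 0;
}

static const struct pkt *fifo_peek(const struct fifo *f)
{
	return f->len ? &f->slots[f->head] : NULL;
}

static struct pkt fifo_pop(struct fifo *f)
{
	struct pkt p;

	assert(f->len > 0);
	p = f->slots[f->head];
	f->head = (f->head + 1) % QCAP;
	f->len--;
	return p;
}

/* Move every packet whose delay has elapsed into the child, in one batch.
 * Assumes delayq is sorted by time_to_send, so the loop can stop at the
 * first packet that is not ready yet. Returns the number of packets moved.
 */
static size_t transfer_ready(struct fifo *delayq, struct fifo *child,
			     uint64_t now)
{
	const struct pkt *p;
	size_t moved = 0;

	while ((p = fifo_peek(delayq)) != NULL) {
		if (p->time_to_send > now)
			break;
		/* Simplification: a full child leaves the packet queued here,
		 * whereas the real patch accounts it as a drop.
		 */
		if (fifo_push(child, *p) != 0)
			break;
		fifo_pop(delayq);
		moved++;
	}
	return moved;
}

/* Dequeue in two phases: batch transfer first, then ask the child once.
 * Returns 1 and fills *out when a packet is released, 0 otherwise.
 */
static int model_dequeue(struct fifo *delayq, struct fifo *child,
			 uint64_t now, struct pkt *out)
{
	transfer_ready(delayq, child, now);
	if (child->len == 0)
		return 0;
	*out = fifo_pop(child);
	return 1;
}
```

In this model the child never holds a packet whose time_to_send is
still in the future, and each model_dequeue() call asks the child for
output exactly once, after the transfer loop has finished.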