From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, kuba@kernel.org, edumazet@google.com,
	pabeni@redhat.com, horms@kernel.org,
	Willem de Bruijn <willemb@google.com>
Subject: [PATCH net-next 2/3] net_sched: sch_fq: convert skb->tstamp if not monotonic
Date: Wed,  3 Jun 2026 15:01:29 -0400
Message-ID: <20260603190243.2789335-3-willemdebruijn.kernel@gmail.com>
In-Reply-To: <20260603190243.2789335-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemb@google.com>

FQ currently assumes skb->tstamp holds monotonic time, as used by TCP.

Since the commit named in the Fixes tag below, users with
CAP_NET_ADMIN (checked with ns_capable) can transmit skbs using
SO_TXTIME with a CLOCK_MONOTONIC, CLOCK_REALTIME or CLOCK_TAI clockid.

More recently, skbs also gained tstamp_type to explicitly communicate
the clockid of skb->tstamp.

Detect other clocks and convert to monotonic for use in FQ. That is,
convert only fq_skb_cb(skb)->time_to_send. Do not convert skb->tstamp
itself: network device clocks are more commonly synchronized to TAI.

Conversion may be imprecise due to clock adjustment (e.g., adjfreq)
between when SCM_TXTIME is set and when it is converted in fq_enqueue.
The common codepath is short, so skew will be well below the timescale
of common pacing operation. Even in edge cases, packets sent in bursts
(too soon) or beyond the horizon (too late) are indistinguishable from
network conditions, to which senders must be robust as long as such
events are infrequent.

Avoid overflow: a negative converted time becomes a huge value when
cast from signed ktime_t to u64 time_to_send. Bound the converted time
below by monotonic time 1 and above by now + q->horizon. This protects
against bad input, e.g., from BPF programs.
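
The bounding described above can be sketched in plain userspace C. This
is only an illustration, not the kernel code: clamp_time_to_send() and
its parameters are hypothetical names, all values are nanoseconds, and
offset_ns stands in for the offset between the source clock and
monotonic time. The sketch models the capping path only (not the
horizon_drop case).

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a signed timestamp to an unsigned
 * monotonic send time, bounded to [1, now_ns + horizon_ns].
 * Without the lower bound, a negative result cast to uint64_t
 * would wrap to a value far beyond any horizon.
 */
uint64_t clamp_time_to_send(int64_t tstamp_ns, int64_t offset_ns,
                            uint64_t now_ns, uint64_t horizon_ns)
{
    uint64_t upper_ns = now_ns + horizon_ns;
    int64_t mono_ns;

    /* Guard the subtraction itself against signed overflow. */
    if (offset_ns > 0 && tstamp_ns < INT64_MIN + offset_ns)
        return 1;
    if (offset_ns < 0 && tstamp_ns > INT64_MAX + offset_ns)
        return upper_ns;

    mono_ns = tstamp_ns - offset_ns;

    /* Lower bound: never hand a zero or negative time to the scheduler. */
    if (mono_ns < 1)
        return 1;

    /* Upper bound: cap at the horizon. */
    if ((uint64_t)mono_ns > upper_ns)
        return upper_ns;

    return (uint64_t)mono_ns;
}
```

For example, a timestamp of 10 with an offset of 37 converts to -27,
which the sketch bounds to 1 rather than letting it wrap.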

Detect legacy BPF programs that set skb->tstamp without setting
skb->tstamp_type. In that case tstamp_type is zero
(SKB_CLOCK_REALTIME), but the value is unrealistic for realtime in the
21st century. Use the existing TIME_UPTIME_SEC_MAX as the boundary
between mono and realtime.
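
As a rough userspace illustration of this heuristic: the sketch below
treats an untyped timestamp as monotonic when it does not exceed a
maximum plausible uptime. The function name and the _SKETCH macros are
hypothetical, and the 30-year value is an assumption about how the
kernel defines TIME_UPTIME_SEC_MAX, so check include/linux/time64.h
before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Assumption: mirrors the kernel's TIME_UPTIME_SEC_MAX (30 years, in seconds). */
#define UPTIME_SEC_MAX_SKETCH (30LL * 365 * 24 * 3600)

/*
 * Hypothetical helper: returns true when a timestamp that carries no
 * clock type is small enough that it is more plausibly time since
 * boot than wall-clock time since 1970.
 */
bool untyped_tstamp_looks_monotonic(int64_t tstamp_ns)
{
    return tstamp_ns <= UPTIME_SEC_MAX_SKETCH * NSEC_PER_SEC_SKETCH;
}
```

Under that assumption the boundary falls in late 1999 when read as
realtime, so any current wall-clock value lands above it, while an
uptime-sized value lands below it.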

Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/sched/sch_fq.c | 43 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index 33783c9f8e16..7cae082a9847 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -537,10 +537,10 @@ static void flow_queue_add(struct fq_flow *flow, struct sk_buff *skb)
 	rb_insert_color(&skb->rbnode, &flow->t_root);
 }
 
-static bool fq_packet_beyond_horizon(const struct sk_buff *skb,
+static bool fq_packet_beyond_horizon(ktime_t time_to_send,
 				     const struct fq_sched_data *q, u64 now)
 {
-	return unlikely((s64)skb->tstamp > (s64)(now + q->horizon));
+	return unlikely((s64)time_to_send > (s64)(now + q->horizon));
 }
 
 static void fq_flow_adjust_timer(struct fq_sched_data *q, struct fq_flow *flow,
@@ -561,6 +561,36 @@ static void fq_flow_adjust_timer(struct fq_sched_data *q, struct fq_flow *flow,
 	}
 }
 
+static ktime_t fq_skb_tstamp_to_mono(struct sk_buff *skb)
+{
+	const ktime_t mono_max = NSEC_PER_SEC * TIME_UPTIME_SEC_MAX;
+
+	if (likely(skb->tstamp_type == SKB_CLOCK_MONOTONIC))
+		return max(skb->tstamp, 1);
+
+	if (skb->tstamp_type == SKB_CLOCK_TAI)
+		return max(ktime_sub(skb->tstamp, ktime_mono_to_any(0, TK_OFFS_TAI)), 1);
+
+	if (likely(skb->tstamp > mono_max))
+		return max(ktime_sub(skb->tstamp, ktime_mono_to_real(0)), 1);
+
+	/* Handle BPF programs setting skb->tstamp but not tstamp_type */
+	net_warn_ratelimited("fq: likely mono tstamp with tstamp_type 0\n");
+
+	skb->tstamp_type = SKB_CLOCK_MONOTONIC;
+	return max(skb->tstamp, 1);
+}
+
+static void fq_mono_to_skb_tstamp(struct sk_buff *skb, ktime_t time_to_send)
+{
+	if (skb->tstamp_type == SKB_CLOCK_MONOTONIC)
+		skb->tstamp = time_to_send;
+	else if (skb->tstamp_type == SKB_CLOCK_REALTIME)
+		skb->tstamp = ktime_mono_to_real(time_to_send);
+	else
+		skb->tstamp = ktime_mono_to_any(time_to_send, TK_OFFS_TAI);
+}
+
 static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 		      struct sk_buff **to_free)
 {
@@ -579,17 +609,20 @@ static int fq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 	if (!skb->tstamp) {
 		fq_skb_cb(skb)->time_to_send = now;
 	} else {
+		ktime_t time_to_send = fq_skb_tstamp_to_mono(skb);
+
 		/* Check if packet timestamp is too far in the future. */
-		if (fq_packet_beyond_horizon(skb, q, now)) {
+		if (fq_packet_beyond_horizon(time_to_send, q, now)) {
 			if (q->horizon_drop) {
 				q->stat_horizon_drops++;
 				return qdisc_drop_reason(skb, sch, to_free,
 							 QDISC_DROP_HORIZON_LIMIT);
 			}
 			q->stat_horizon_caps++;
-			skb->tstamp = now + q->horizon;
+			time_to_send = now + q->horizon;
+			fq_mono_to_skb_tstamp(skb, time_to_send);
 		}
-		fq_skb_cb(skb)->time_to_send = skb->tstamp;
+		fq_skb_cb(skb)->time_to_send = (u64)time_to_send;
 	}
 
 	f = fq_classify(sch, skb, now);
-- 
2.54.0.1032.g2f8565e1d1-goog


Thread overview: 8+ messages
2026-06-03 19:01 [PATCH net-next 0/3] SO_TXTIME improvements Willem de Bruijn
2026-06-03 19:01 ` [PATCH net-next 1/3] net: ensure SCM_TXTIME delivery time is no older than system boot Willem de Bruijn
2026-06-03 22:11   ` Jakub Kicinski
2026-06-03 19:01 ` Willem de Bruijn [this message]
2026-06-03 22:22   ` [PATCH net-next 2/3] net_sched: sch_fq: convert skb->tstamp if not monotonic Jakub Kicinski
2026-06-03 22:59     ` Willem de Bruijn
2026-06-03 23:27       ` Jakub Kicinski
2026-06-03 19:01 ` [PATCH net-next 3/3] selftests: drv-net: extend so_txtime with FQ with other clocks Willem de Bruijn
