From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Neal Cardwell <ncardwell@google.com>,
	Ingemar Johansson S <ingemar.s.johansson@ericsson.com>,
	Tom Henderson <tomh@tomh.org>, Bob Briscoe <in@bobbriscoe.net>
Subject: Re: [PATCH net-next 2/2] fq_codel: implement L4S style ce_threshold_ect1 marking
Date: Fri, 15 Oct 2021 01:24:49 +0200
Message-ID: <87mtnb196m.fsf@toke.dk>
In-Reply-To: <CANn89iLbJL2Jzot5fy7m07xDhP_iCf8ro8SBzXx1hd0EYVvHcA@mail.gmail.com>

Eric Dumazet <edumazet@google.com> writes:

> On Thu, Oct 14, 2021 at 12:54 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Eric Dumazet <eric.dumazet@gmail.com> writes:
>>
>> > From: Eric Dumazet <edumazet@google.com>
>> >
>> > Add TCA_FQ_CODEL_CE_THRESHOLD_ECT1 boolean option to select Low Latency,
>> > Low Loss, Scalable Throughput (L4S) style marking, along with ce_threshold.
>> >
>> > If enabled, only packets with ECT(1) can be transformed to CE
>> > if their sojourn time is above the ce_threshold.
>> >
>> > Note that this new option does not change rules for codel law.
>> > In particular, if TCA_FQ_CODEL_ECN is left enabled (this is
>> > the default when fq_codel qdisc is created), ECT(0) packets can
>> > still get CE if codel law (as governed by limit/target) decides so.
>>
>> The ability to have certain packets receive a shallow marking threshold
>> and others regular ECN semantics is no doubt useful. However, given that
>> it is by no means certain how the L4S experiment will pan out (and I for
>> one remain sceptical that the real-world benefits will turn out to match
>> the tech demos), I think it's premature to bake the ECT(1) semantics
>> into UAPI.
>
> Chicken and egg problem.
> We had fq_codel in linux kernel years before RFC after all :)

Sure, but fq_codel is a self-contained algorithm; it doesn't add new
meanings to bits of the IP header... :)

>> So how about tying this behaviour to a configurable skb->mark instead?
>> That way users can get the shallow marking behaviour for any subset of
>> packets they want, simply by installing a suitable filter on the
>> qdisc...
>
> This seems an idea, but do you really expect users installing a sophisticated
> filter ? Please provide more details, and cost analysis.

I'm not sure it's that sophisticated; it's pretty simple to do with tc-u32
(although it's complicated a bit by having to restore the default
hashing behaviour of fq_codel with an additional flow filter). Something like:

# tc qdisc replace dev $DEV handle 1: fq_codel
# tc filter add dev $DEV parent 1: pref 1 protocol ipv6 u32 match u32 0x00100000 0x00100000 at 0 action skbedit mark 2 continue
# tc filter add dev $DEV parent 1: pref 2 protocol ip u32 match ip dsfield 1 1 action skbedit mark 2 continue
# tc filter add dev $DEV parent 1: handle 1 pref 3 protocol all flow hash keys src,dst,proto,proto-src,proto-dst divisor 1024

or one could write a single BPF program that combines all three to save
some cycles walking the filter chain.
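
To make the bit tests concrete, here is a rough Python model of what the
two u32 matches above select. The function names are hypothetical and
this only models the header checks, not how tc or a BPF program would
evaluate them. Note that testing only the low ECN bit matches both
ECT(1) (0b01) and CE (0b11).

```python
import struct

ECT1_BIT = 0x01            # low bit of the 2-bit ECN field
IPV6_ECT1_BIT = 0x00100000  # same bit, as seen in the first 32-bit word of IPv6
SHALLOW_MARK = 2            # the skb mark value used in the tc example


def ipv4_ect1_bit_set(header: bytes) -> bool:
    """Model of 'match ip dsfield 1 1': test bit 0 of the TOS byte (byte 1)."""
    return bool(header[1] & ECT1_BIT)


def ipv6_ect1_bit_set(header: bytes) -> bool:
    """Model of the IPv6 u32 match (value/mask 0x00100000 on the first word).

    The first word is version (4 bits), traffic class (8 bits), flow label
    (20 bits), so the low ECN bit sits at bit 20.
    """
    (first_word,) = struct.unpack("!I", header[:4])
    return (first_word & IPV6_ECT1_BIT) == IPV6_ECT1_BIT


def mark_for(header: bytes) -> int:
    """Return the mark the filters would set, or 0 if neither filter matches."""
    version = header[0] >> 4
    if version == 4:
        matched = ipv4_ect1_bit_set(header)
    elif version == 6:
        matched = ipv6_ect1_bit_set(header)
    else:
        matched = False
    return SHALLOW_MARK if matched else 0
```

The third filter in the tc example is not modelled here; it only restores
flow hashing and does not affect which packets get the mark.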

> (Having to install a filter is probably more expensive than testing a
> boolean, after the sojourn time has exceeded the threshold)

No doubt, all other things being equal. But odds are they're not: if
you're already running a BPF filter somewhere in the path, adding the
logic above to an existing filter reduces it back down to a couple of
boolean comparisons, for instance.

But even if it does add a bit of overhead, IMO the flexibility makes up
for this. We can always revisit it if L4S becomes a standards-track RFC
at some point :)

> Given that INET_ECN_set_ce(skb) only operates on ECT(1) and ECT(0),
> I guess we could  use a bitmask of two bits so that users can decide
> which code points can become CE.

That would be an improvement. But if we're doing bitmasks, and since the
code is reading the whole dsfield anyway, why not extend that bitmask to
the whole dsfield?
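
As a minimal sketch of that idea in Python, assuming hypothetical
parameter names (`selector` and `mask` are illustrative, not taken from
the patch): a packet becomes eligible for threshold marking when its
sojourn time is above ce_threshold and its masked dsfield equals the
selector. Since INET_ECN_set_ce() only operates on ECT(0) and ECT(1),
other packets are excluded.

```python
ECN_MASK = 0x03
ECN_ECT1 = 0x01
ECN_ECT0 = 0x02


def should_ce_mark(dsfield: int, sojourn_us: int, ce_threshold_us: int,
                   selector: int, mask: int) -> bool:
    """Decide whether a packet is eligible for ce_threshold marking.

    selector/mask are hypothetical knobs covering the whole 8-bit dsfield.
    """
    if sojourn_us <= ce_threshold_us:
        return False  # threshold not exceeded, nothing to do
    if (dsfield & ECN_MASK) not in (ECN_ECT0, ECN_ECT1):
        return False  # only ECT(0)/ECT(1) can be transformed to CE
    return (dsfield & mask) == selector
```

With mask 0x03 and selector 0x01 this reduces to the ECT(1)-only
behaviour; with mask 0xfc and a DSCP value as selector, the same test
picks out a traffic class instead.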

-Toke

