netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jörn-Thorben Hinz" <jthinz@mailbox.tu-berlin.de>
To: Martin KaFai Lau <kafai@fb.com>
Cc: <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>, <netdev@vger.kernel.org>
Subject: Re: [PATCH bpf-next 2/2] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC
Date: Tue, 14 Jun 2022 12:51:23 +0200	[thread overview]
Message-ID: <d87b06a38f6a8fa61abc36b06f108568b82bfa21.camel@mailbox.tu-berlin.de> (raw)
In-Reply-To: <20220609185551.ptn2htxmk4fsr5p2@kafai-mbp>

On Thu, 2022-06-09 at 11:55 -0700, Martin KaFai Lau wrote:
> On Thu, Jun 09, 2022 at 10:55:25AM +0200, Jörn-Thorben Hinz wrote:
> > Thanks for the feedback, Martin.
> > 
> > On Wed, 2022-06-08 at 11:33 -0700, Martin KaFai Lau wrote:
> > > On Wed, Jun 08, 2022 at 07:48:43PM +0200, Jörn-Thorben Hinz
> > > wrote:
> > > > When a CC implements tcp_congestion_ops.cong_control(), the
> > > > alternate
> > > > cong_avoid() is not in use in the TCP stack. Do not force a BPF
> > > > CC
> > > > to
> > > > implement cong_avoid() as a no-op by always requiring it.
> > > > 
> > > > An incomplete BPF CC implementing neither cong_avoid() nor
> > > > cong_control() will still get rejected by
> > > > tcp_register_congestion_control().
> > > > 
> > > > Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
> > > > ---
> > > >  net/ipv4/bpf_tcp_ca.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
> > > > index 1f5c53ede4e5..37290d0bf134 100644
> > > > --- a/net/ipv4/bpf_tcp_ca.c
> > > > +++ b/net/ipv4/bpf_tcp_ca.c
> > > > @@ -17,6 +17,7 @@ extern struct bpf_struct_ops
> > > > bpf_tcp_congestion_ops;
> > > >  static u32 optional_ops[] = {
> > > >         offsetof(struct tcp_congestion_ops, init),
> > > >         offsetof(struct tcp_congestion_ops, release),
> > > > +       offsetof(struct tcp_congestion_ops, cong_avoid),
> > > At least one of the cong_avoid() or cong_control() is needed.
> > > It is better to remove is_optional(moff) check and its
> > > optional_ops[]
> > > here.  Only depends on the tcp_register_congestion_control()
> > > which
> > > does a similar check at the beginning.
> > You mean completely remove this part of the validation from
> > bpf_tcp_ca.c and just rely on tcp_register_congestion_control()?
> > True,
> Yes.
> 
> > that would be even easier to maintain at this point, make
> > tcp_register_congestion_control() the one-and-only place that has
> > to
> > know about required and optional functions.
> > 
> > Will rework the second patch.
> > 
> > > 
> > > Patch 1 looks good.  tcp_bbr.c also needs the sk_pacing fields.
> > > 
> > > A selftest is needed.  Can you share your bpf tcp-cc and
> > > use it as a selftest to exercise the change in this patch
> > > set ?
> > I cannot do that just now, unfortunately. It’s still earlier work
> > in
> > progress. Also, it will have an additional, external dependency
> > which
> > might make it unfit to be included here/as a selftest. I will keep
> > it
> > in mind for later this year, though.
> What is the external dependency ?  Could you share some high level
> of the CC you are developing ?
> The reason for this question is to see if there is something
> missing from the kernel side to write the tcp-cc in bpf that you
> are developing.
The algorithm is PowerTCP[1], I’m currently implementing it with eBPF.
It requires telemetry from the network.

The mentioned dependency (not developed by us, not yet released) will
provide such telemetry. Its end-host part is implemented with eBPF
itself and does not require any additional kernel patches.

At the moment I’m not aware of any other bits missing from the kernel
side. Will propose another patch if that changes or at least report it.

[1] https://www.usenix.org/system/files/nsdi22-paper-addanki_3.pdf

> 
> > In the meantime, I could look into adding a more naive/trivial
> > test,
> > that implements cong_control() without cong_avoid() and relies on
> > sk_pacing_* being writable, if you would prefer that? Would that be
> > fine as a follow-up patch (might take me a moment) or better be
> > included in this series?
> Yeah, it will do and the test should be submitted together in
> this series.
Please see v3 of the series.


  reply	other threads:[~2022-06-14 11:01 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-08 17:48 [PATCH bpf-next 0/2] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
2022-06-08 17:48 ` [PATCH bpf-next 1/2] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
2022-06-08 17:48 ` [PATCH bpf-next 2/2] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC Jörn-Thorben Hinz
2022-06-08 18:33   ` Martin KaFai Lau
2022-06-09  8:55     ` Jörn-Thorben Hinz
2022-06-09 18:55       ` Martin KaFai Lau
2022-06-14 10:51         ` Jörn-Thorben Hinz [this message]
2022-06-09 20:47 ` [PATCH bpf-next v2 0/2] Align BPF TCP CCs implementing cong_control() with non-BPF CCs Jörn-Thorben Hinz
2022-06-09 20:47   ` [PATCH bpf-next v2 1/2] bpf: Allow a TCP CC to write sk_pacing_rate and sk_pacing_status Jörn-Thorben Hinz
2022-06-09 20:47   ` [PATCH bpf-next v2 2/2] bpf: Require only one of cong_avoid() and cong_control() from a TCP CC Jörn-Thorben Hinz
2022-06-10  0:52     ` kernel test robot
2022-06-10 13:26     ` kernel test robot
2022-06-13 17:17     ` Martin KaFai Lau
2022-06-14 10:52       ` Jörn-Thorben Hinz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d87b06a38f6a8fa61abc36b06f108568b82bfa21.camel@mailbox.tu-berlin.de \
    --to=jthinz@mailbox.tu-berlin.de \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).