From: Kui-Feng Lee <sinquersw@gmail.com>
To: Kui-Feng Lee <kuifeng@meta.com>,
bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev,
song@kernel.org, kernel-team@meta.com, andrii@kernel.org,
sdf@google.com
Subject: Re: [PATCH bpf-next v6 0/8] Transit between BPF TCP congestion controls.
Date: Fri, 10 Mar 2023 08:28:59 -0800
Message-ID: <7e0b5974-0518-fe8d-0485-a8b2b73059cb@gmail.com>
In-Reply-To: <20230310043812.3087672-1-kuifeng@meta.com>
On 3/9/23 20:38, Kui-Feng Lee wrote:
> Major changes:
>
> - Create bpf_links in the kernel for BPF struct_ops to register and
> unregister it.
>
> - Enables switching between implementations of bpf-tcp-cc under a
> name instantly by replacing the backing struct_ops map of a
> bpf_link.
>
> Previously, a BPF struct_ops stayed registered even after the user
> program that created it terminated, although it was never pinned.
> For instance, the TCP congestion control subsystem indirectly
> maintains a reference count on the struct_ops of any registered BPF
> implemented algorithm. Thus, the algorithm won't be deactivated until
> someone deliberately unregisters it. For compatibility with other BPF
> programs, bpf_links have been created to work in coordination with
> struct_ops maps. This ensures that the registration and unregistration
> of these respective maps is carried out at the start and end of the
> bpf_link.
>
> We also faced complications when attempting to replace an existing TCP
> congestion control algorithm with a new implementation on the fly. A
> struct_ops map was used to register a TCP congestion control algorithm
> with a unique name. We had to either register the alternative
> implementation with a new name and move over or unregister the current
> one before being able to reregister with the same name. To fix
> this problem, we add an option to migrate the registration of the
> algorithm from struct_ops maps to bpf_links. By modifying the backing
> map of a bpf_link, it suddenly becomes possible to replace an existing
> TCP congestion control algorithm with ease.
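To make the idea above concrete, here is a minimal userspace sketch of
the lifecycle: creating a link registers the ops, updating the link
swaps the backing ops in place under the same name, and destroying the
link unregisters it. This is a toy model only, not kernel or libbpf
code. Every identifier in it (toy_ops, toy_link, toy_registry,
toy_link_create() and so on) is hypothetical, and the rule that an
update must keep the same name is an assumption taken from the "under
a name" wording above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a struct_ops map: a named set of callbacks. */
struct toy_ops {
	const char *name;            /* algorithm name seen by users */
	int (*next_cwnd)(int cwnd);  /* hypothetical callback */
};

/* Toy stand-in for a bpf_link: it owns one registration. */
struct toy_link {
	const struct toy_ops **slot; /* registry slot held by this link */
};

#define TOY_MAX_REG 4
static const struct toy_ops *toy_registry[TOY_MAX_REG];

/* Look up a registered algorithm by name; NULL if absent. */
static const struct toy_ops *toy_find(const char *name)
{
	for (size_t i = 0; i < TOY_MAX_REG; i++)
		if (toy_registry[i] && strcmp(toy_registry[i]->name, name) == 0)
			return toy_registry[i];
	return NULL;
}

/* Creating the link registers the ops; duplicate names are rejected. */
static int toy_link_create(struct toy_link *link, const struct toy_ops *ops)
{
	if (toy_find(ops->name))
		return -1;
	for (size_t i = 0; i < TOY_MAX_REG; i++) {
		if (!toy_registry[i]) {
			toy_registry[i] = ops;
			link->slot = &toy_registry[i];
			return 0;
		}
	}
	return -1; /* registry full */
}

/* Updating the link swaps the backing ops in place under the same name. */
static int toy_link_update(struct toy_link *link, const struct toy_ops *new_ops)
{
	if (!link->slot || strcmp((*link->slot)->name, new_ops->name) != 0)
		return -1;
	*link->slot = new_ops;
	return 0;
}

/* Destroying the link unregisters whatever ops it currently backs. */
static void toy_link_destroy(struct toy_link *link)
{
	if (link->slot)
		*link->slot = NULL;
	link->slot = NULL;
}

static int toy_additive(int cwnd) { return cwnd + 1; }
static int toy_doubling(int cwnd) { return cwnd * 2; }

/* Walk through create, rejected re-register, update, destroy. */
static int toy_demo(void)
{
	static const struct toy_ops v1 = { "toy_cc", toy_additive };
	static const struct toy_ops v2 = { "toy_cc", toy_doubling };
	struct toy_link link = { NULL }, second = { NULL };

	assert(toy_link_create(&link, &v1) == 0);
	assert(toy_find("toy_cc")->next_cwnd(10) == 11);
	assert(toy_link_create(&second, &v2) == -1); /* name already taken */
	assert(toy_link_update(&link, &v2) == 0);    /* swap in place */
	assert(toy_find("toy_cc")->next_cwnd(10) == 20);
	toy_link_destroy(&link);
	assert(toy_find("toy_cc") == NULL);
	return 0;
}
```

Calling toy_demo() from a main() runs the whole sequence. The rejected
second toy_link_create() corresponds to the old situation, where a new
implementation could not be registered under a name that was already
in use, and toy_link_update() corresponds to replacing the backing map
of the link.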
The major differences from v5:
- Add a new step to bpf_object__load() to prepare vdata.
- Accept BPF_F_REPLACE.
- Check section IDs in find_struct_ops_map_by_offset()
- Add a test case to check mixing w/ & w/o link struct_ops.
- Add a test case of using struct_ops w/o link to update a link.
- Improve bpf_link__detach_struct_ops() to handle the w/ link case.
>
> The major differences from v4:
>
> - Rebase.
>
> - Reorder patches and merge part 4 to part 2 of the v4.
>
> The major differences from v3:
>
> - Remove bpf_struct_ops_map_free_rcu(), and use synchronize_rcu().
>
> - Improve the commit log of the part 1.
>
> - Before transitioning to the READY state, we conduct a value check
> to ensure that struct_ops can be successfully utilized and links
> created later.
>
> The major differences from v2:
>
> - Simplify states
>
> - Remove TOBEUNREG.
>
> - Rename UNREG to READY.
>
> - Stop using the refcnt of the kvalue of a struct_ops. Explicitly
> increase and decrease the refcount of struct_ops.
>
> - Prepare kernel vdata during the load phase of libbpf.
>
> The major differences from v1:
>
> - Added bpf_struct_ops_link to replace the previous union-based
> approach.
>
> - Added UNREG and TOBEUNREG to the state of bpf_struct_ops_map.
>
> - bpf_struct_ops_transit_state() maintains state transitions.
>
> - Fixed synchronization issue.
>
> - Prepare kernel vdata of struct_ops during the loading phase of
> bpf_object.
>
> - Merged previous patch 3 to patch 1.
>
v5: https://lore.kernel.org/all/20230308005050.255859-1-kuifeng@meta.com/
> v4: https://lore.kernel.org/all/20230307232913.576893-1-andrii@kernel.org/
> v3: https://lore.kernel.org/all/20230303012122.852654-1-kuifeng@meta.com/
> v2: https://lore.kernel.org/bpf/20230223011238.12313-1-kuifeng@meta.com/
> v1: https://lore.kernel.org/bpf/20230214221718.503964-1-kuifeng@meta.com/
>
> Kui-Feng Lee (8):
> bpf: Retire the struct_ops map kvalue->refcnt.
> net: Update an existing TCP congestion control algorithm.
> bpf: Create links for BPF struct_ops maps.
> libbpf: Create a bpf_link in bpf_map__attach_struct_ops().
> bpf: Update the struct_ops of a bpf_link.
> libbpf: Update a bpf_link with another struct_ops.
> libbpf: Use .struct_ops.link section to indicate a struct_ops with a
> link.
> selftests/bpf: Test switching TCP Congestion Control algorithms.
>
> include/linux/bpf.h | 10 +
> include/net/tcp.h | 3 +
> include/uapi/linux/bpf.h | 20 +-
> kernel/bpf/bpf_struct_ops.c | 229 +++++++++++++++---
> kernel/bpf/syscall.c | 49 +++-
> net/bpf/bpf_dummy_struct_ops.c | 6 +
> net/ipv4/bpf_tcp_ca.c | 14 +-
> net/ipv4/tcp_cong.c | 60 ++++-
> tools/include/uapi/linux/bpf.h | 20 +-
> tools/lib/bpf/libbpf.c | 180 +++++++++++---
> tools/lib/bpf/libbpf.h | 1 +
> tools/lib/bpf/libbpf.map | 1 +
> .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 91 +++++++
> .../selftests/bpf/progs/tcp_ca_update.c | 80 ++++++
> 14 files changed, 671 insertions(+), 93 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/progs/tcp_ca_update.c
>