From: Matthieu Baerts <matttbe@kernel.org>
To: Geliang Tang <geliang@kernel.org>
Cc: mptcp@lists.linux.dev
Subject: Re: [PATCH bpf-next v4 2/2] selftests/bpf: Add mptcp subflow subtest
Date: Thu, 15 Aug 2024 23:00:05 +0200
Message-ID: <b0f25eeb-e644-4cd7-a719-b9dcd1b57480@kernel.org>
In-Reply-To: <716cbd56-4a44-4451-a6f3-5bacef3e0729@linux.dev>

Hi Geliang,

(only the MPTCP list)

On 15/08/2024 00:37, Martin KaFai Lau wrote:
> On 8/14/24 3:04 AM, Matthieu Baerts wrote:
>> Hi Martin,
>>
>> Thank you for your reply!
>>
>> On 14/08/2024 03:12, Martin KaFai Lau wrote:
>>> On 8/5/24 2:52 AM, Matthieu Baerts (NGI0) wrote:
(...)
>>>> + ASSERT_OK(ss_search(ADDR_1, "fwmark:0x1"), "ss_search fwmark:0x1");
>>>> + ASSERT_OK(ss_search(ADDR_2, "fwmark:0x2"), "ss_search fwmark:0x2");
>>>> + ASSERT_OK(ss_search(ADDR_1, new), "ss_search new cc");
>>>> + ASSERT_OK(ss_search(ADDR_2, cc), "ss_search default cc");
>>>
>>> Is there a getsockopt way instead of ss + grep?
>>
>> No there isn't: from the userspace, the app communicates with the MPTCP
>> socket, which can have multiple paths (subflows, a TCP socket). To keep
>> the compatibility with TCP, [gs]etsockopt() will look at/modify the
>> whole MPTCP connection. For example, in some cases, a setsockopt() will
>> propagate the option to all the subflows. Depending on the option, the
>> modification might only apply to the first subflow, or to the
>> user-facing socket.
>>
>> For advanced users who want to have different options set to the
>> different subflows of an MPTCP connection, they can use BPF: that's what
>> is being validated here. In other words, doing a 'getsockopt()' from the
>> userspace program here will not show all the different marks and TCP CC
>> that can be set per subflow with BPF. We can see that in the test: a
>> getsockopt() is done on the MPTCP socket to retrieve the default TCP CC
>> ('cc' which is certainly 'cubic'), but we expect to find another one
>> ('new' which is 'reno'), set by the BPF program from patch 1/2. I guess
>> we could use bpf to do a getsockopt() per subflow, but that seems a
>> bit like cheating to have the BPF test program set something and then
>> check if it is set. Here, it is an external way. Because it is done from a
>
> I think the result is valid by having a bpf prog to inspect the value of
> a sock. Inspecting socket is an existing use case. There are many
> existing bpf tests covering this inspection use case to ensure the
> result is legit. A separate cgroup/getsockopt program should help here
> (more on this below).
>
>> dedicated netns, it sounds OK to do that, no?
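
To make the limitation above concrete, here is a minimal userspace sketch
(not taken from the patch). It opens an MPTCP socket and reads
TCP_CONGESTION with getsockopt(). The helper name read_default_cc() is
hypothetical, and so is the fallback to plain TCP when MPTCP is not
available. Whatever it returns is the connection-level value: a CC set per
subflow by a BPF program would not be visible this way.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/* Matches the kernel's limit for CC names; defined here if the libc
 * headers do not provide it. */
#ifndef TCP_CA_NAME_MAX
#define TCP_CA_NAME_MAX 16
#endif

/*
 * Hypothetical helper: copy the TCP CC name reported for the user-facing
 * socket into 'name' (NUL-terminated).
 * Returns 0 on success, -1 on error with errno set.
 */
static int read_default_cc(char *name, socklen_t len)
{
	socklen_t optlen = len;
	int fd, err;

	if (len == 0) {
		errno = EINVAL;
		return -1;
	}

	fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
	if (fd < 0)
		/* Assumption: fall back to TCP if MPTCP is not available */
		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (fd < 0)
		return -1;

	memset(name, 0, len);
	err = getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, &optlen);
	name[len - 1] = '\0';	/* make sure the result is terminated */
	close(fd);

	return err;
}
```

A caller would pass a buffer of TCP_CA_NAME_MAX bytes and compare the
result with what is expected for the whole connection, e.g. the system
default. Checking the per-subflow value still needs another source such as
'ss' or a BPF program.
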
>
> Thanks for the explanation. I was hoping there is a way to get to the
> underlying subflow fd. It seems impossible.
>
> Being in the netns does help here. It is not only about whether ss
> iterates over a lot of connections. My preference is to not depend on
> external tools/shell-ing if possible, e.g. to avoid package update
> discussions like the iproute2 one here. The uapi from the kernel under
> test is always up-to-date. ss is another binary but arguably in the same
> iproute2 package. There is now an extra "grep" and pipe here. We have been
> bitten by different shell behaviors, and some archs have different shells, etc.
>
> I think it is ok to take this set as is if you (and Geliang?) are ok to
> follow up with a "cgroup/getsockopt" way to inspect the subflow as the very
> next patch to the mptcp selftest. It seems inspecting subflow will be a
> common test going forward for mptcp, so it will be beneficial to have a
> "cgroup/getsockopt" way to inspect the subflow directly.
>
> Take a look at a recent example [0]. The mptcp test is under a cgroup
> already and has the cgroup setup. An extra "cgroup/getsockopt" prog
> should be enough. That prog can walk the msk->conn_list and use
> bpf_rdonly_cast (or the bpf_core_cast macro in libbpf) to cast a pointer
> to tcp_sock for read-only access. That will allow inspecting all the
> fields in a tcp_sock.

Do you think this is something you could do, to replace the 'ss'-based
validations and maybe help with future validations for the BPF packet
schedulers?

> Something needs a fix in patch 2 (replied separately), so a re-spin is
> needed.
>
> pw-bot: cr
>
> [0]: https://lore.kernel.org/all/20240808150558.1035626-3-alan.maguire@oracle.com/
>
>
>
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
Thread overview: 13+ messages
2024-08-05 9:52 [PATCH bpf-next v4 0/2] selftests/bpf: new MPTCP subflow subtest Matthieu Baerts (NGI0)
2024-08-05 9:52 ` [PATCH bpf-next v4 1/2] selftests/bpf: Add mptcp subflow example Matthieu Baerts (NGI0)
2024-08-05 9:52 ` [PATCH bpf-next v4 2/2] selftests/bpf: Add mptcp subflow subtest Matthieu Baerts (NGI0)
2024-08-14 1:12 ` Martin KaFai Lau
2024-08-14 10:04 ` Matthieu Baerts
2024-08-14 22:37 ` Martin KaFai Lau
2024-08-15 20:57 ` Matthieu Baerts
2024-08-15 21:00 ` Matthieu Baerts [this message]
2024-08-18 2:13 ` Geliang Tang
2024-08-19 23:28 ` Martin KaFai Lau
2024-08-21 20:32 ` Manu Bretelle
2024-08-22 9:13 ` Matthieu Baerts
2024-08-14 22:16 ` Martin KaFai Lau