* Re: mptcp: fix data re-injection from stale subflow: Tests Results
2024-01-23 21:03 [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Paolo Abeni
@ 2024-01-23 22:01 ` MPTCP CI
2024-01-23 22:32 ` MPTCP CI
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: MPTCP CI @ 2024-01-23 22:01 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp
Hi Paolo,
Thank you for your modifications, that's great!
Our CI (GitHub Action) did some validations and here is its report:
- KVM Validation: normal:
- Unstable: 1 failed test(s): packetdrill_regressions 🔴:
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7631907725
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/035781441dd7
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: mptcp: fix data re-injection from stale subflow: Tests Results
2024-01-23 21:03 [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Paolo Abeni
2024-01-23 22:01 ` mptcp: fix data re-injection from stale subflow: Tests Results MPTCP CI
@ 2024-01-23 22:32 ` MPTCP CI
2024-01-24 1:14 ` [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Mat Martineau
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: MPTCP CI @ 2024-01-23 22:32 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp
Hi Paolo,
Thank you for your modifications, that's great!
Our CI (Cirrus) did some validations with a debug kernel and here is its report:
- KVM Validation: debug (except selftest_mptcp_join):
- Unstable: 2 failed test(s): packetdrill_mp_join selftest_diag - Critical: 1 Call Trace(s) ❌:
- Task: https://cirrus-ci.com/task/6235859205226496
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6235859205226496/summary/summary.txt
- KVM Validation: debug (only selftest_mptcp_join):
- Critical: 1 Call Trace(s) ❌:
- Task: https://cirrus-ci.com/task/4828484321673216
- Summary: https://api.cirrus-ci.com/v1/artifact/task/4828484321673216/summary/summary.txt
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/035781441dd7
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-debug
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow
2024-01-23 21:03 [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Paolo Abeni
2024-01-23 22:01 ` mptcp: fix data re-injection from stale subflow: Tests Results MPTCP CI
2024-01-23 22:32 ` MPTCP CI
@ 2024-01-24 1:14 ` Mat Martineau
2024-01-24 2:25 ` mptcp: fix data re-injection from stale subflow: Tests Results MPTCP CI
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Mat Martineau @ 2024-01-24 1:14 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp
On Tue, 23 Jan 2024, Paolo Abeni wrote:
> When the MPTCP PM detects that a subflow is stale, all the packet
> scheduler must re-inject all the mptcp-level unacked data. To avoid
> acquiring unneeded locks, it first try to check if any unacked data
> is present at all in the RTX queue, but such check is currently
> broken, as it uses TCP-specific helper on an MPTCP socket.
>
> Funnily enough fuzzers and static checkers are happy, as the accessed
> memory still belongs to the mptcp_sock struct, and even from a
> functional perspective the recovery completed successfully, as
> the short-cut test always failed.
>
> A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
> tcp_sock fast path variables") - exposed the issue, as the tcp field
> reorganization makes the mptcp code always skip the re-inection.
>
> Fix the issue dropping the bogus call: we are on a slow path, the early
> optimization proved once again to be evil.
>
Thanks for tracking this down, Paolo. Fix LGTM:
Reviewed-by: Mat Martineau <martineau@kernel.org>
> Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> net/mptcp/protocol.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 53d6c5544900..a8a94b34a51e 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2339,9 +2339,6 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
> if (__mptcp_check_fallback(msk))
> return false;
>
> - if (tcp_rtx_and_write_queues_empty(sk))
> - return false;
> -
> /* the closing socket has some data untransmitted and/or unacked:
> * some data in the mptcp rtx queue has not really xmitted yet.
> * keep it simple and re-inject the whole mptcp level rtx queue
> --
> 2.43.0
>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: mptcp: fix data re-injection from stale subflow: Tests Results
2024-01-23 21:03 [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Paolo Abeni
` (2 preceding siblings ...)
2024-01-24 1:14 ` [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Mat Martineau
@ 2024-01-24 2:25 ` MPTCP CI
2024-01-24 2:33 ` MPTCP CI
2024-01-30 11:59 ` [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Matthieu Baerts
5 siblings, 0 replies; 11+ messages in thread
From: MPTCP CI @ 2024-01-24 2:25 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp
Hi Paolo,
Thank you for your modifications, that's great!
Our CI (GitHub Action) did some validations and here is its report:
- KVM Validation: normal:
- Unstable: 2 failed test(s): packetdrill_regressions selftest_mptcp_join 🔴:
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7634219212
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/333cd7ebfec0
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: mptcp: fix data re-injection from stale subflow: Tests Results
2024-01-23 21:03 [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Paolo Abeni
` (3 preceding siblings ...)
2024-01-24 2:25 ` mptcp: fix data re-injection from stale subflow: Tests Results MPTCP CI
@ 2024-01-24 2:33 ` MPTCP CI
2024-01-30 11:59 ` [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Matthieu Baerts
5 siblings, 0 replies; 11+ messages in thread
From: MPTCP CI @ 2024-01-24 2:33 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp
Hi Paolo,
Thank you for your modifications, that's great!
Our CI (Cirrus) did some validations with a debug kernel and here is its report:
- KVM Validation: debug (except selftest_mptcp_join):
- Critical: 1 Call Trace(s) ❌:
- Task: https://cirrus-ci.com/task/6638955139956736
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6638955139956736/summary/summary.txt
- KVM Validation: debug (only selftest_mptcp_join):
- Critical: 1 Call Trace(s) ❌:
- Task: https://cirrus-ci.com/task/4598261558804480
- Summary: https://api.cirrus-ci.com/v1/artifact/task/4598261558804480/summary/summary.txt
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/333cd7ebfec0
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-debug
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow
2024-01-23 21:03 [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Paolo Abeni
` (4 preceding siblings ...)
2024-01-24 2:33 ` MPTCP CI
@ 2024-01-30 11:59 ` Matthieu Baerts
2024-01-30 15:12 ` Paolo Abeni
5 siblings, 1 reply; 11+ messages in thread
From: Matthieu Baerts @ 2024-01-30 11:59 UTC (permalink / raw)
To: Paolo Abeni, mptcp
Hi Paolo, Mat,
On 23/01/2024 22:03, Paolo Abeni wrote:
> When the MPTCP PM detects that a subflow is stale, all the packet
> scheduler must re-inject all the mptcp-level unacked data. To avoid
> acquiring unneeded locks, it first try to check if any unacked data
> is present at all in the RTX queue, but such check is currently
> broken, as it uses TCP-specific helper on an MPTCP socket.
>
> Funnily enough fuzzers and static checkers are happy, as the accessed
> memory still belongs to the mptcp_sock struct, and even from a
> functional perspective the recovery completed successfully, as
> the short-cut test always failed.
>
> A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
> tcp_sock fast path variables") - exposed the issue, as the tcp field
> reorganization makes the mptcp code always skip the re-inection.
>
> Fix the issue dropping the bogus call: we are on a slow path, the early
> optimization proved once again to be evil.
>
> Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Thank you for the patch and the review!
Any ideas how to prevent such issues? Changing the signature of the
TCP-specific functions? Adding a custom check for our CI on our side?
New patches for t/upstream-net and t/upstream:
- 359966180a29: mptcp: fix data re-injection from stale subflow
- Results: d2cf5db9b409..ca82f8ccb3cc (export-net)
- Results: 171302925a80..4b8f1ec22243 (export)
Tests are now in progress:
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20240130T115340
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20240130T115340
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow
2024-01-30 11:59 ` [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow Matthieu Baerts
@ 2024-01-30 15:12 ` Paolo Abeni
2024-01-30 17:52 ` Matthieu Baerts
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2024-01-30 15:12 UTC (permalink / raw)
To: Matthieu Baerts, mptcp
On Tue, 2024-01-30 at 12:59 +0100, Matthieu Baerts wrote:
> Hi Paolo, Mat,
>
> On 23/01/2024 22:03, Paolo Abeni wrote:
> > When the MPTCP PM detects that a subflow is stale, all the packet
> > scheduler must re-inject all the mptcp-level unacked data. To avoid
> > acquiring unneeded locks, it first try to check if any unacked data
> > is present at all in the RTX queue, but such check is currently
> > broken, as it uses TCP-specific helper on an MPTCP socket.
> >
> > Funnily enough fuzzers and static checkers are happy, as the accessed
> > memory still belongs to the mptcp_sock struct, and even from a
> > functional perspective the recovery completed successfully, as
> > the short-cut test always failed.
> >
> > A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
> > tcp_sock fast path variables") - exposed the issue, as the tcp field
> > reorganization makes the mptcp code always skip the re-inection.
> >
> > Fix the issue dropping the bogus call: we are on a slow path, the early
> > optimization proved once again to be evil.
> >
> > Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>
> Thank you for the patch and the review!
>
> Any ideas how to prevent such issues? Changing the signature of the
> TCP-specific functions? Adding a custom check for our CI on our side?
I guess we could add debug-only code in mptcp_sk() and in tcp_sk()
checking that the argument pointer belongs to the relevant slab, see
kfree():
https://elixir.bootlin.com/linux/v6.8-rc2/source/mm/slub.c#L4407
we could fetch slab->slab_cache and check it vs
mptcp_v6_prot.slab/mptcp_prot.slab/tcp_prot.slab/tcpv6_prot.slab.
The main downside is that such code is heavily mm-dependent and may
change/break when upstream progresses.
Cheers,
Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow
2024-01-30 15:12 ` Paolo Abeni
@ 2024-01-30 17:52 ` Matthieu Baerts
2024-01-30 18:53 ` Paolo Abeni
0 siblings, 1 reply; 11+ messages in thread
From: Matthieu Baerts @ 2024-01-30 17:52 UTC (permalink / raw)
To: Paolo Abeni, mptcp
Hi Paolo,
Thank you for your reply!
On 30/01/2024 16:12, Paolo Abeni wrote:
> On Tue, 2024-01-30 at 12:59 +0100, Matthieu Baerts wrote:
>> Hi Paolo, Mat,
>>
>> On 23/01/2024 22:03, Paolo Abeni wrote:
>>> When the MPTCP PM detects that a subflow is stale, all the packet
>>> scheduler must re-inject all the mptcp-level unacked data. To avoid
>>> acquiring unneeded locks, it first try to check if any unacked data
>>> is present at all in the RTX queue, but such check is currently
>>> broken, as it uses TCP-specific helper on an MPTCP socket.
>>>
>>> Funnily enough fuzzers and static checkers are happy, as the accessed
>>> memory still belongs to the mptcp_sock struct, and even from a
>>> functional perspective the recovery completed successfully, as
>>> the short-cut test always failed.
>>>
>>> A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
>>> tcp_sock fast path variables") - exposed the issue, as the tcp field
>>> reorganization makes the mptcp code always skip the re-inection.
>>>
>>> Fix the issue dropping the bogus call: we are on a slow path, the early
>>> optimization proved once again to be evil.
>>>
>>> Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>
>> Thank you for the patch and the review!
>>
>> Any ideas how to prevent such issues? Changing the signature of the
>> TCP-specific functions? Adding a custom check for our CI on our side?
>
> I guess we could add debug-only code in mptcp_sk() and in tcp_sk()
> checking that the argument pointer belongs to the relevant slab, see
> kfree():
>
> https://elixir.bootlin.com/linux/v6.8-rc2/source/mm/slub.c#L4407
>
> we could fetch slab->slab_cache and check it vs
> mptcp_v6_prot.slab/mptcp_prot.slab/tcp_prot.slab/tcpv6_prot.slab.
>
> The main downside is that such code is heavily mm-dependent and may
> change/break when upstream progresses.
Good idea!
If you think it is too fragile, maybe we could add a new entry in
"struct inet_connection_sock" (or "struct sock"?) -- e.g. is_msk --, set
it in mptcp_init_sock(), and check it in mptcp_sk() and in tcp_sk().
We can do that only if KASAN kconfig is set (or another one?).
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow
2024-01-30 17:52 ` Matthieu Baerts
@ 2024-01-30 18:53 ` Paolo Abeni
2024-01-31 11:12 ` Matthieu Baerts
0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2024-01-30 18:53 UTC (permalink / raw)
To: Matthieu Baerts, mptcp
On Tue, 2024-01-30 at 18:52 +0100, Matthieu Baerts wrote:
> Hi Paolo,
>
> Thank you for your reply!
>
> On 30/01/2024 16:12, Paolo Abeni wrote:
> > On Tue, 2024-01-30 at 12:59 +0100, Matthieu Baerts wrote:
> > > Hi Paolo, Mat,
> > >
> > > On 23/01/2024 22:03, Paolo Abeni wrote:
> > > > When the MPTCP PM detects that a subflow is stale, all the packet
> > > > scheduler must re-inject all the mptcp-level unacked data. To avoid
> > > > acquiring unneeded locks, it first try to check if any unacked data
> > > > is present at all in the RTX queue, but such check is currently
> > > > broken, as it uses TCP-specific helper on an MPTCP socket.
> > > >
> > > > Funnily enough fuzzers and static checkers are happy, as the accessed
> > > > memory still belongs to the mptcp_sock struct, and even from a
> > > > functional perspective the recovery completed successfully, as
> > > > the short-cut test always failed.
> > > >
> > > > A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
> > > > tcp_sock fast path variables") - exposed the issue, as the tcp field
> > > > reorganization makes the mptcp code always skip the re-inection.
> > > >
> > > > Fix the issue dropping the bogus call: we are on a slow path, the early
> > > > optimization proved once again to be evil.
> > > >
> > > > Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
> > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
> > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > >
> > > Thank you for the patch and the review!
> > >
> > > Any ideas how to prevent such issues? Changing the signature of the
> > > TCP-specific functions? Adding a custom check for our CI on our side?
> >
> > I guess we could add debug-only code in mptcp_sk() and in tcp_sk()
> > checking that the argument pointer belongs to the relevant slab, see
> > kfree():
> >
> > https://elixir.bootlin.com/linux/v6.8-rc2/source/mm/slub.c#L4407
> >
> > we could fetch slab->slab_cache and check it vs
> > mptcp_v6_prot.slab/mptcp_prot.slab/tcp_prot.slab/tcpv6_prot.slab.
> >
> > The main downside is that such code is heavily mm-dependent and may
> > change/break when upstream progresses.
>
> Good idea!
>
> If you think it is too fragile, maybe we could add a new entry in
> "struct inet_connection_sock" (or "struct sock"?) -- e.g. is_msk --, set
> it in mptcp_init_sock(), and check it in mptcp_sk() and in tcp_sk().
Nice! that would be much more robust! Thinking again about it, we could
simply check:
sk->sk_prot == mptcp_prot || sk->sk_prot == mptcp_v6_prot
> We can do that only if KASAN kconfig is set (or another one?).
A possible option could be DEBUG_NET, but such option currently enables
almost no-overhead things. I think we can use it, if we opt for
checking sk_prot.
Cheers,
Paolo
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH mptcp-net] mptcp: fix data re-injection from stale subflow
2024-01-30 18:53 ` Paolo Abeni
@ 2024-01-31 11:12 ` Matthieu Baerts
0 siblings, 0 replies; 11+ messages in thread
From: Matthieu Baerts @ 2024-01-31 11:12 UTC (permalink / raw)
To: Paolo Abeni, mptcp
Hi Paolo,
On 30/01/2024 19:53, Paolo Abeni wrote:
> On Tue, 2024-01-30 at 18:52 +0100, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> Thank you for your reply!
>>
>> On 30/01/2024 16:12, Paolo Abeni wrote:
>>> On Tue, 2024-01-30 at 12:59 +0100, Matthieu Baerts wrote:
>>>> Hi Paolo, Mat,
>>>>
>>>> On 23/01/2024 22:03, Paolo Abeni wrote:
>>>>> When the MPTCP PM detects that a subflow is stale, all the packet
>>>>> scheduler must re-inject all the mptcp-level unacked data. To avoid
>>>>> acquiring unneeded locks, it first try to check if any unacked data
>>>>> is present at all in the RTX queue, but such check is currently
>>>>> broken, as it uses TCP-specific helper on an MPTCP socket.
>>>>>
>>>>> Funnily enough fuzzers and static checkers are happy, as the accessed
>>>>> memory still belongs to the mptcp_sock struct, and even from a
>>>>> functional perspective the recovery completed successfully, as
>>>>> the short-cut test always failed.
>>>>>
>>>>> A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
>>>>> tcp_sock fast path variables") - exposed the issue, as the tcp field
>>>>> reorganization makes the mptcp code always skip the re-inection.
>>>>>
>>>>> Fix the issue dropping the bogus call: we are on a slow path, the early
>>>>> optimization proved once again to be evil.
>>>>>
>>>>> Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
>>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
>>>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>>>
>>>> Thank you for the patch and the review!
>>>>
>>>> Any ideas how to prevent such issues? Changing the signature of the
>>>> TCP-specific functions? Adding a custom check for our CI on our side?
>>>
>>> I guess we could add debug-only code in mptcp_sk() and in tcp_sk()
>>> checking that the argument pointer belongs to the relevant slab, see
>>> kfree():
>>>
>>> https://elixir.bootlin.com/linux/v6.8-rc2/source/mm/slub.c#L4407
>>>
>>> we could fetch slab->slab_cache and check it vs
>>> mptcp_v6_prot.slab/mptcp_prot.slab/tcp_prot.slab/tcpv6_prot.slab.
>>>
>>> The main downside is that such code is heavily mm-dependent and may
>>> change/break when upstream progresses.
>>
>> Good idea!
>>
>> If you think it is too fragile, maybe we could add a new entry in
>> "struct inet_connection_sock" (or "struct sock"?) -- e.g. is_msk --, set
>> it in mptcp_init_sock(), and check it in mptcp_sk() and in tcp_sk().
>
> Nice! that would be much more robust! Thinking again about it, we could
> simply check:
>
> sk->sk_prot == mptcp_prot || sk->sk_prot == mptcp_v6_prot
Nice!
What about adding this in tcp_sk():
WARN_ON(sk->sk_protocol != IPPROTO_TCP);
and this in mptcp_sk():
WARN_ON(sk->sk_protocol != IPPROTO_MPTCP);
But then I suppose we will need to have these two helpers inlined when
we want to use this debug option. They are no longer inlined since:
- e9d9da91548b ("tcp: preserve const qualifier in tcp_sk()")
- 403a40f2304d ("mptcp: preserve const qualifier in mptcp_sk()")
>> We can do that only if KASAN kconfig is set (or another one?).
>
> A possible option could be DEBUG_NET, but such option currently enables
> almost no-overhead things. I think we can use it, if we opt for
> checking sk_prot.
Sounds good to me!
Do you think this kind of patch could even be upstreamed? We can suggest
it and see.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
^ permalink raw reply [flat|nested] 11+ messages in thread