* [PATCH mptcp-next v9 01/23] Squash to "mptcp: drop last_snd and MPTCP_RESET_SCHEDULER"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 02/23] Squash to "mptcp: add struct mptcp_sched_ops" Geliang Tang
` (21 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Drop all uses of last_snd.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/protocol.c | 9 ---------
net/mptcp/protocol.h | 1 -
2 files changed, 10 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7ffb6709cacd..64343c664455 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1619,7 +1619,6 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
continue;
}
do_check_data_fin = true;
- msk->last_snd = ssk;
}
}
}
@@ -1660,7 +1659,6 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool
if (ret <= 0)
break;
copied += ret;
- msk->last_snd = ssk;
continue;
}
@@ -1673,7 +1671,6 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool
if (ret <= 0)
keep_pushing = false;
copied += ret;
- msk->last_snd = ssk;
}
mptcp_for_each_subflow(msk, subflow) {
@@ -2466,9 +2463,6 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
WRITE_ONCE(msk->first, NULL);
out:
- if (ssk == msk->last_snd)
- msk->last_snd = NULL;
-
if (need_push)
__mptcp_push_pending(sk, 0);
}
@@ -2648,8 +2642,6 @@ static void __mptcp_retrans(struct sock *sk)
}
release_sock(ssk);
-
- msk->last_snd = ssk;
}
}
@@ -3157,7 +3149,6 @@ static int mptcp_disconnect(struct sock *sk, int flags)
* subflow
*/
mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE);
- msk->last_snd = NULL;
WRITE_ONCE(msk->flags, 0);
msk->cb_flags = 0;
msk->push_pending = 0;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index d2e59cf33f57..35dd3683f735 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -269,7 +269,6 @@ struct mptcp_sock {
u64 rcv_data_fin_seq;
u64 bytes_retrans;
int rmem_fwd_alloc;
- struct sock *last_snd;
int snd_burst;
int old_wspace;
u64 recovery_snd_nxt; /* in recovery mode accept up to this seq;
--
2.35.3
* [PATCH mptcp-next v9 02/23] Squash to "mptcp: add struct mptcp_sched_ops"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 01/23] Squash to "mptcp: drop last_snd and MPTCP_RESET_SCHEDULER" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 03/23] Squash to "mptcp: add sched_data_set_contexts helper" Geliang Tang
` (20 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Use two tabs before reinject and add the subflows member.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
include/net/mptcp.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 828b10ddabee..f5cd43cff66b 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -100,7 +100,8 @@ struct mptcp_out_options {
#define MPTCP_SUBFLOWS_MAX 8
struct mptcp_sched_data {
- bool reinject;
+ bool reinject;
+ u8 subflows;
struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
};
--
2.35.3
* [PATCH mptcp-next v9 03/23] Squash to "mptcp: add sched_data_set_contexts helper"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 01/23] Squash to "mptcp: drop last_snd and MPTCP_RESET_SCHEDULER" Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 02/23] Squash to "mptcp: add struct mptcp_sched_ops" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 04/23] mptcp: add mptcp_subflow_ctx_by_pos helper Geliang Tang
` (19 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Set data->subflows too.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/sched.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index c7c167e48d72..ea4b06322d78 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -108,6 +108,7 @@ void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
mptcp_subflow_set_scheduled(subflow, false);
data->contexts[i++] = subflow;
}
+ data->subflows = i;
for (; i < MPTCP_SUBFLOWS_MAX; i++)
data->contexts[i] = NULL;
--
2.35.3
* [PATCH mptcp-next v9 04/23] mptcp: add mptcp_subflow_ctx_by_pos helper
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (2 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 03/23] Squash to "mptcp: add sched_data_set_contexts helper" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 05/23] Squash to "mptcp: add scheduler wrappers" Geliang Tang
` (18 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Add a new helper mptcp_subflow_ctx_by_pos() that returns the subflow
context at position pos in the contexts array of struct mptcp_sched_data,
or NULL if pos is out of range. BPF schedulers will invoke it to obtain
subflow contexts.
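The lookup can be illustrated with a small userspace model. This is only a
sketch: struct subflow_ctx, struct sched_data, sched_data_set_contexts() and
ctx_by_pos() are hypothetical stand-ins for the kernel types and helpers, and
the model takes the data struct directly instead of going through the msk.

```c
#include <assert.h>
#include <stddef.h>

#define SUBFLOWS_MAX 8	/* mirrors MPTCP_SUBFLOWS_MAX */

/* Hypothetical stand-in for struct mptcp_subflow_context. */
struct subflow_ctx {
	int id;
};

/* Hypothetical stand-in for struct mptcp_sched_data. */
struct sched_data {
	unsigned char subflows;				/* number of valid entries */
	struct subflow_ctx *contexts[SUBFLOWS_MAX];	/* unused slots are NULL */
};

/* Fill the array from n subflows, record the count, NULL the rest. */
static void sched_data_set_contexts(struct sched_data *data,
				    struct subflow_ctx *flows, size_t n)
{
	size_t i = 0;

	for (; i < n && i < SUBFLOWS_MAX; i++)
		data->contexts[i] = &flows[i];
	data->subflows = (unsigned char)i;
	for (; i < SUBFLOWS_MAX; i++)
		data->contexts[i] = NULL;
}

/* Bounds-checked lookup: out-of-range positions yield NULL. */
static struct subflow_ctx *ctx_by_pos(const struct sched_data *data,
				      unsigned int pos)
{
	if (pos >= SUBFLOWS_MAX)
		return NULL;
	return data->contexts[pos];
}
```

A caller still has to check the result for NULL, since an in-range position
past the last valid subflow returns an empty slot.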
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/protocol.h | 2 ++
net/mptcp/sched.c | 8 ++++++++
2 files changed, 10 insertions(+)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 35dd3683f735..4e42f52596d4 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -666,6 +666,8 @@ void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow,
bool scheduled);
void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
struct mptcp_sched_data *data);
+struct mptcp_subflow_context *
+mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos);
struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk);
struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk);
int mptcp_sched_get_send(struct mptcp_sock *msk);
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index ea4b06322d78..a15d7632abec 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -114,6 +114,14 @@ void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
data->contexts[i] = NULL;
}
+struct mptcp_subflow_context *
+mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos)
+{
+ if (pos >= MPTCP_SUBFLOWS_MAX)
+ return NULL;
+ return msk->sched_data.contexts[pos];
+}
+
int mptcp_sched_get_send(struct mptcp_sock *msk)
{
struct mptcp_subflow_context *subflow;
--
2.35.3
* [PATCH mptcp-next v9 05/23] Squash to "mptcp: add scheduler wrappers"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (3 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 04/23] mptcp: add mptcp_subflow_ctx_by_pos helper Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-17 0:54 ` Mat Martineau
2023-06-14 2:20 ` [PATCH mptcp-next v9 06/23] mptcp: add last_snd in sched_data Geliang Tang
` (17 subsequent siblings)
22 siblings, 1 reply; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Add sched_data into mptcp_sock too.
Use msk->sched_data instead of the local variable data.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/protocol.h | 1 +
net/mptcp/sched.c | 14 ++++++--------
2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 4e42f52596d4..bda91399be49 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -314,6 +314,7 @@ struct mptcp_sock {
*/
struct sock *first;
struct mptcp_pm_data pm;
+ struct mptcp_sched_data sched_data;
struct mptcp_sched_ops *sched;
struct {
u32 space; /* bytes copied in last measurement window */
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index a15d7632abec..ca4bd53cf5d9 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -125,7 +125,6 @@ mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos)
int mptcp_sched_get_send(struct mptcp_sock *msk)
{
struct mptcp_subflow_context *subflow;
- struct mptcp_sched_data data;
msk_owned_by_me(msk);
@@ -155,15 +154,14 @@ int mptcp_sched_get_send(struct mptcp_sock *msk)
return 0;
}
- data.reinject = false;
- msk->sched->data_init(msk, &data);
- return msk->sched->get_subflow(msk, &data);
+ msk->sched_data.reinject = false;
+ msk->sched->data_init(msk, &msk->sched_data);
+ return msk->sched->get_subflow(msk, &msk->sched_data);
}
int mptcp_sched_get_retrans(struct mptcp_sock *msk)
{
struct mptcp_subflow_context *subflow;
- struct mptcp_sched_data data;
msk_owned_by_me(msk);
@@ -186,7 +184,7 @@ int mptcp_sched_get_retrans(struct mptcp_sock *msk)
return 0;
}
- data.reinject = true;
- msk->sched->data_init(msk, &data);
- return msk->sched->get_subflow(msk, &data);
+ msk->sched_data.reinject = true;
+ msk->sched->data_init(msk, &msk->sched_data);
+ return msk->sched->get_subflow(msk, &msk->sched_data);
}
--
2.35.3
* Re: [PATCH mptcp-next v9 05/23] Squash to "mptcp: add scheduler wrappers"
2023-06-14 2:20 ` [PATCH mptcp-next v9 05/23] Squash to "mptcp: add scheduler wrappers" Geliang Tang
@ 2023-06-17 0:54 ` Mat Martineau
0 siblings, 0 replies; 28+ messages in thread
From: Mat Martineau @ 2023-06-17 0:54 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp, pabeni
On Wed, 14 Jun 2023, Geliang Tang wrote:
> Add sched_data into mptcp_sock too.
> Use msk->sched_data instead of the local variable data.
>
> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> ---
> net/mptcp/protocol.h | 1 +
> net/mptcp/sched.c | 14 ++++++--------
> 2 files changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 4e42f52596d4..bda91399be49 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -314,6 +314,7 @@ struct mptcp_sock {
> */
> struct sock *first;
> struct mptcp_pm_data pm;
> + struct mptcp_sched_data sched_data;
Hi Geliang -
If mptcp_sched_data is part of the mptcp_sock then
mptcp_sock->sched_data.contexts[] is stale once mptcp_sched_get_send() or
mptcp_sched_get_retrans() returns. It's also "wasted memory" while the
scheduler is not running.
Are you moving the struct here mainly to be able to add persistent
scheduler data (last_snd and snd_burst) in the next two patches?
I'm concerned that this design is setting up the BPF scheduler to require
custom additions to mptcp_sched_data for many new scheduler types. Is
there a way to leave mptcp_sched_data unchanged, but add a new member to
mptcp_sock just for bpf scheduler state?
Is it possible to create a BPF map and store scheduler-specific key-value
pairs in it that are accessible by the various BPF scheduler functions?
The basic idea is:
* struct bpf_map pointer is passed in to each BPF scheduler function
* If a BPF scheduler is configured for an MPTCP socket, allocate an empty
map. Store a struct bpf_map* in mptcp_sock (instead of moving sched_data).
* Scheduler sets up the map keys and initial values when configured
* The BPF get_send() functions access and update those values using bpf
helpers (https://man7.org/linux/man-pages/man7/bpf-helpers.7.html)
* When scheduler is reconfigured or socket is closed, free the map
This way sched_data remains temporary, and there's a new place to store
persistent scheduler-specific values.
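The lifecycle in the list above can be modelled in plain userspace C. This is
a sketch under stated assumptions, not the BPF map API: struct sched_state,
struct msk_model, KEY_LAST_SND and every function name here are hypothetical,
and a fixed-size key-value array stands in for the map.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define STATE_SLOTS 4

/* Hypothetical stand-in for a per-socket map of scheduler state. */
struct sched_state {
	uint32_t keys[STATE_SLOTS];
	uint64_t vals[STATE_SLOTS];
	int used;
};

/* Hypothetical stand-in for struct mptcp_sock. */
struct msk_model {
	struct sched_state *state;	/* NULL while no scheduler is configured */
};

enum { KEY_LAST_SND = 1 };

/* Scheduler configured: allocate an empty store. */
static int sched_configure(struct msk_model *msk)
{
	msk->state = calloc(1, sizeof(*msk->state));
	return msk->state ? 0 : -1;
}

/* Scheduler reconfigured or socket closed: free the store. */
static void sched_release(struct msk_model *msk)
{
	free(msk->state);
	msk->state = NULL;
}

static uint64_t *state_lookup(struct sched_state *st, uint32_t key)
{
	for (int i = 0; i < st->used; i++)
		if (st->keys[i] == key)
			return &st->vals[i];
	return NULL;
}

static int state_update(struct sched_state *st, uint32_t key, uint64_t val)
{
	uint64_t *slot = state_lookup(st, key);

	if (slot) {
		*slot = val;
		return 0;
	}
	if (st->used == STATE_SLOTS)
		return -1;
	st->keys[st->used] = key;
	st->vals[st->used] = val;
	st->used++;
	return 0;
}

/* Round-robin pick that keeps its "last sent" index in the store. */
static int rr_get_send(struct msk_model *msk, int nr_subflows)
{
	uint64_t *last;
	uint64_t next = 0;

	if (!msk->state || nr_subflows <= 0)
		return -1;
	last = state_lookup(msk->state, KEY_LAST_SND);
	if (last)
		next = (*last + 1) % (uint64_t)nr_subflows;
	if (state_update(msk->state, KEY_LAST_SND, next))
		return -1;
	return (int)next;
}
```

In this model the per-call scheduling data never needs to outlive a call; only
the small store attached to the socket persists between calls.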
What do you think?
- Mat
> struct mptcp_sched_ops *sched;
> struct {
> u32 space; /* bytes copied in last measurement window */
> diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
> index a15d7632abec..ca4bd53cf5d9 100644
> --- a/net/mptcp/sched.c
> +++ b/net/mptcp/sched.c
> @@ -125,7 +125,6 @@ mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos)
> int mptcp_sched_get_send(struct mptcp_sock *msk)
> {
> struct mptcp_subflow_context *subflow;
> - struct mptcp_sched_data data;
>
> msk_owned_by_me(msk);
>
> @@ -155,15 +154,14 @@ int mptcp_sched_get_send(struct mptcp_sock *msk)
> return 0;
> }
>
> - data.reinject = false;
> - msk->sched->data_init(msk, &data);
> - return msk->sched->get_subflow(msk, &data);
> + msk->sched_data.reinject = false;
> + msk->sched->data_init(msk, &msk->sched_data);
> + return msk->sched->get_subflow(msk, &msk->sched_data);
> }
>
> int mptcp_sched_get_retrans(struct mptcp_sock *msk)
> {
> struct mptcp_subflow_context *subflow;
> - struct mptcp_sched_data data;
>
> msk_owned_by_me(msk);
>
> @@ -186,7 +184,7 @@ int mptcp_sched_get_retrans(struct mptcp_sock *msk)
> return 0;
> }
>
> - data.reinject = true;
> - msk->sched->data_init(msk, &data);
> - return msk->sched->get_subflow(msk, &data);
> + msk->sched_data.reinject = true;
> + msk->sched->data_init(msk, &msk->sched_data);
> + return msk->sched->get_subflow(msk, &msk->sched_data);
> }
> --
> 2.35.3
>
>
>
* [PATCH mptcp-next v9 06/23] mptcp: add last_snd in sched_data
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (4 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 05/23] Squash to "mptcp: add scheduler wrappers" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 07/23] mptcp: add snd_burst " Geliang Tang
` (16 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch adds a last_snd member to struct mptcp_sched_data to make it
accessible to BPF schedulers such as bpf_rr.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
include/net/mptcp.h | 1 +
net/mptcp/sched.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index f5cd43cff66b..a9bd7d6cd177 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -100,6 +100,7 @@ struct mptcp_out_options {
#define MPTCP_SUBFLOWS_MAX 8
struct mptcp_sched_data {
+ struct sock *last_snd;
bool reinject;
u8 subflows;
struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index ca4bd53cf5d9..b59046e31457 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -81,6 +81,8 @@ void mptcp_release_sched(struct mptcp_sock *msk)
if (!sched)
return;
+ if (msk->sched_data.last_snd)
+ msk->sched_data.last_snd = NULL;
msk->sched = NULL;
if (sched->release)
sched->release(msk);
--
2.35.3
* [PATCH mptcp-next v9 07/23] mptcp: add snd_burst in sched_data
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (5 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 06/23] mptcp: add last_snd in sched_data Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 08/23] mptcp: register default scheduler Geliang Tang
` (15 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch moves the member snd_burst from struct mptcp_sock to struct
mptcp_sched_data to make it accessible to BPF schedulers.
To adapt mptcp_subflow_get_send() to the MPTCP scheduler API, its msk
parameter is made const and a new struct mptcp_sched_data parameter, data,
is added.
With this change, msk->snd_burst is replaced by
msk->sched_data.snd_burst.
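How the burst budget is consumed can be sketched with a userspace model. The
names below (struct sched_data, struct msk_model, get_send(), push_pending())
are hypothetical simplifications: the scheduler sets the budget, and the push
loop subtracts what was sent and stops once the budget is used up.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel structures. */
struct sched_data {
	int snd_burst;		/* remaining bytes the picked subflow may send */
};

struct msk_model {
	struct sched_data sched_data;
};

/* The scheduler records the burst allowance in the scheduling data. */
static void get_send(struct sched_data *data, int burst)
{
	data->snd_burst = burst;
}

/* Send 'pending' bytes in chunks; stop when the budget is exhausted. */
static int push_pending(struct msk_model *msk, int pending, int chunk)
{
	int copied = 0;

	if (chunk <= 0)
		return 0;

	while (pending > 0) {
		int sent = pending < chunk ? pending : chunk;

		msk->sched_data.snd_burst -= sent;
		copied += sent;
		pending -= sent;

		/* Budget used up: the caller must ask the scheduler again. */
		if (msk->sched_data.snd_burst <= 0)
			break;
	}
	return copied;
}
```

The budget may go negative on the last chunk, which is why the check is
"<= 0" rather than "== 0".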
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
include/net/mptcp.h | 1 +
net/mptcp/protocol.c | 11 ++++++-----
net/mptcp/protocol.h | 4 ++--
net/mptcp/sched.c | 2 +-
4 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index a9bd7d6cd177..ef8264281577 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -101,6 +101,7 @@ struct mptcp_out_options {
struct mptcp_sched_data {
struct sock *last_snd;
+ int snd_burst;
bool reinject;
u8 subflows;
struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 64343c664455..3ca15f8935b3 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1414,7 +1414,8 @@ bool mptcp_subflow_active(struct mptcp_subflow_context *subflow)
* returns the subflow that will transmit the next DSS
* additionally updates the rtx timeout
*/
-struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
+struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
{
struct subflow_send_info send_info[SSK_MODE_MAX];
struct mptcp_subflow_context *subflow;
@@ -1484,7 +1485,7 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
subflow->avg_pacing_rate = div_u64((u64)subflow->avg_pacing_rate * wmem +
READ_ONCE(ssk->sk_pacing_rate) * burst,
burst + wmem);
- msk->snd_burst = burst;
+ data->snd_burst = burst;
return ssk;
}
@@ -1502,7 +1503,7 @@ static void mptcp_update_post_push(struct mptcp_sock *msk,
dfrag->already_sent += sent;
- msk->snd_burst -= sent;
+ msk->sched_data.snd_burst -= sent;
snd_nxt_new += dfrag->already_sent;
@@ -1555,7 +1556,7 @@ static int __subflow_push_pending(struct sock *sk, struct sock *ssk,
}
WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
- if (msk->snd_burst <= 0 ||
+ if (msk->sched_data.snd_burst <= 0 ||
!sk_stream_memory_free(ssk) ||
!mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
err = copied;
@@ -2352,7 +2353,7 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
mptcp_data_unlock(sk);
msk->first_pending = rtx_head;
- msk->snd_burst = 0;
+ msk->sched_data.snd_burst = 0;
/* be sure to clear the "sent status" on all re-injected fragments */
list_for_each_entry(cur, &msk->rtx_queue, list) {
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index bda91399be49..f404a5e313ca 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -269,7 +269,6 @@ struct mptcp_sock {
u64 rcv_data_fin_seq;
u64 bytes_retrans;
int rmem_fwd_alloc;
- int snd_burst;
int old_wspace;
u64 recovery_snd_nxt; /* in recovery mode accept up to this seq;
* recovery related fields are under data_lock
@@ -669,7 +668,8 @@ void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
struct mptcp_sched_data *data);
struct mptcp_subflow_context *
mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos);
-struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk);
+struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data);
struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk);
int mptcp_sched_get_send(struct mptcp_sock *msk);
int mptcp_sched_get_retrans(struct mptcp_sock *msk);
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index b59046e31457..50ef71b17beb 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -149,7 +149,7 @@ int mptcp_sched_get_send(struct mptcp_sock *msk)
if (!msk->sched) {
struct sock *ssk;
- ssk = mptcp_subflow_get_send(msk);
+ ssk = mptcp_subflow_get_send(msk, &msk->sched_data);
if (!ssk)
return -EINVAL;
mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true);
--
2.35.3
* [PATCH mptcp-next v9 08/23] mptcp: register default scheduler
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (6 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 07/23] mptcp: add snd_burst " Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 09/23] Squash to "bpf: Add bpf_mptcp_sched_ops" Geliang Tang
` (14 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch defines the default packet scheduler mptcp_sched_default.
Register it in mptcp_sched_init(), which is invoked in mptcp_proto_init().
Skip deleting this default scheduler in mptcp_unregister_scheduler().
Set msk->sched to the default scheduler when the input parameter of
mptcp_init_sched() is NULL.
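The registration rules described above can be modelled in userspace as
follows. This is only a sketch: the fixed-size table, struct sched_ops,
struct msk_model and the function names are hypothetical, and locking, RCU
and module reference counting are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SCHEDS 4

/* Hypothetical stand-in for struct mptcp_sched_ops. */
struct sched_ops {
	const char *name;
};

/* Hypothetical stand-in for struct mptcp_sock. */
struct msk_model {
	struct sched_ops *sched;
};

static struct sched_ops sched_default = { .name = "default" };
static struct sched_ops *sched_list[MAX_SCHEDS];

static struct sched_ops *sched_find(const char *name)
{
	for (int i = 0; i < MAX_SCHEDS; i++)
		if (sched_list[i] && !strcmp(sched_list[i]->name, name))
			return sched_list[i];
	return NULL;
}

static int register_scheduler(struct sched_ops *sched)
{
	if (sched_find(sched->name))
		return -1;	/* name already taken */
	for (int i = 0; i < MAX_SCHEDS; i++) {
		if (!sched_list[i]) {
			sched_list[i] = sched;
			return 0;
		}
	}
	return -1;		/* table full */
}

static void unregister_scheduler(struct sched_ops *sched)
{
	/* The default scheduler is never removed. */
	if (sched == &sched_default)
		return;
	for (int i = 0; i < MAX_SCHEDS; i++)
		if (sched_list[i] == sched)
			sched_list[i] = NULL;
}

/* Called once at protocol init time. */
static void sched_init(void)
{
	register_scheduler(&sched_default);
}

/* A NULL scheduler falls back to the default one. */
static void init_sched(struct msk_model *msk, struct sched_ops *sched)
{
	if (!sched)
		sched = &sched_default;
	msk->sched = sched;
}
```

With the fallback in place, msk->sched is never NULL after init, so callers
no longer need a separate "no scheduler" path.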
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/protocol.c | 3 ++-
net/mptcp/protocol.h | 3 ++-
net/mptcp/sched.c | 38 ++++++++++++++++++++++++++++++++++++--
3 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3ca15f8935b3..cd96a0fb369c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2281,7 +2281,7 @@ static void mptcp_timeout_timer(struct timer_list *t)
*
* A backup subflow is returned only if that is the only kind available.
*/
-struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk)
+struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk)
{
struct sock *backup = NULL, *pick = NULL;
struct mptcp_subflow_context *subflow;
@@ -4025,6 +4025,7 @@ void __init mptcp_proto_init(void)
mptcp_subflow_init();
mptcp_pm_init();
+ mptcp_sched_init();
mptcp_token_init();
if (proto_register(&mptcp_prot, 1) != 0)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index f404a5e313ca..e41b76f16729 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -659,6 +659,7 @@ void mptcp_info2sockaddr(const struct mptcp_addr_info *info,
struct mptcp_sched_ops *mptcp_sched_find(const char *name);
int mptcp_register_scheduler(struct mptcp_sched_ops *sched);
void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched);
+void mptcp_sched_init(void);
int mptcp_init_sched(struct mptcp_sock *msk,
struct mptcp_sched_ops *sched);
void mptcp_release_sched(struct mptcp_sock *msk);
@@ -670,7 +671,7 @@ struct mptcp_subflow_context *
mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos);
struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
struct mptcp_sched_data *data);
-struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk);
+struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk);
int mptcp_sched_get_send(struct mptcp_sock *msk);
int mptcp_sched_get_retrans(struct mptcp_sock *msk);
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index 50ef71b17beb..ba3922a4fe62 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -16,6 +16,33 @@
static DEFINE_SPINLOCK(mptcp_sched_list_lock);
static LIST_HEAD(mptcp_sched_list);
+static void mptcp_sched_default_data_init(const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ data->snd_burst = 0;
+}
+
+static int mptcp_sched_default_get_subflow(const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ struct sock *ssk;
+
+ ssk = data->reinject ? mptcp_subflow_get_retrans(msk) :
+ mptcp_subflow_get_send(msk, data);
+ if (!ssk)
+ return -EINVAL;
+
+ mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true);
+ return 0;
+}
+
+static struct mptcp_sched_ops mptcp_sched_default = {
+ .data_init = mptcp_sched_default_data_init,
+ .get_subflow = mptcp_sched_default_get_subflow,
+ .name = "default",
+ .owner = THIS_MODULE,
+};
+
/* Must be called with rcu read lock held */
struct mptcp_sched_ops *mptcp_sched_find(const char *name)
{
@@ -50,16 +77,24 @@ int mptcp_register_scheduler(struct mptcp_sched_ops *sched)
void mptcp_unregister_scheduler(struct mptcp_sched_ops *sched)
{
+ if (sched == &mptcp_sched_default)
+ return;
+
spin_lock(&mptcp_sched_list_lock);
list_del_rcu(&sched->list);
spin_unlock(&mptcp_sched_list_lock);
}
+void mptcp_sched_init(void)
+{
+ mptcp_register_scheduler(&mptcp_sched_default);
+}
+
int mptcp_init_sched(struct mptcp_sock *msk,
struct mptcp_sched_ops *sched)
{
if (!sched)
- goto out;
+ sched = &mptcp_sched_default;
if (!bpf_try_module_get(sched, sched->owner))
return -EBUSY;
@@ -70,7 +105,6 @@ int mptcp_init_sched(struct mptcp_sock *msk,
pr_debug("sched=%s", msk->sched->name);
-out:
return 0;
}
--
2.35.3
* [PATCH mptcp-next v9 09/23] Squash to "bpf: Add bpf_mptcp_sched_ops"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (7 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 08/23] mptcp: register default scheduler Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 10/23] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set" Geliang Tang
` (13 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Add struct mptcp_sched_data write access.
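The offset-based write check can be illustrated with a userspace model. This
is a sketch, not the verifier callback itself: struct sched_data_model, the
local offsetofend() macro and check_write() are hypothetical, and the type
check on the BTF id is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mptcp_sched_data. */
struct sched_data_model {
	void *last_snd;
	int snd_burst;
	bool reinject;
	unsigned char subflows;
};

/* Userspace equivalent of the kernel's offsetofend(). */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Allow a write only if it starts at a writable member and stays inside it. */
static int check_write(size_t off, size_t size)
{
	size_t end;

	switch (off) {
	case offsetof(struct sched_data_model, last_snd):
		end = offsetofend(struct sched_data_model, last_snd);
		break;
	default:
		return -1;	/* member is not writable */
	}

	if (off + size > end)
		return -1;	/* write runs past the member */

	return 0;
}
```

Only last_snd is on the allow list in this model, so a write to any other
member, or one that overruns last_snd, is rejected.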
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/bpf.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index dd1208670c54..e2ed4223617a 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -18,8 +18,9 @@
#ifdef CONFIG_BPF_JIT
extern struct bpf_struct_ops bpf_mptcp_sched_ops;
extern struct btf *btf_vmlinux;
-static const struct btf_type *mptcp_sched_type __read_mostly;
-static u32 mptcp_sched_id;
+static const struct btf_type *mptcp_context_type __read_mostly;
+static const struct btf_type *mptcp_data_type __read_mostly;
+static u32 mptcp_context_id, mptcp_data_id;
static u32 optional_sched_ops[] = {
offsetof(struct mptcp_sched_ops, init),
@@ -41,8 +42,8 @@ static int bpf_mptcp_sched_btf_struct_access(struct bpf_verifier_log *log,
size_t end;
t = btf_type_by_id(reg->btf, reg->btf_id);
- if (t != mptcp_sched_type) {
- bpf_log(log, "only access to mptcp_subflow_context is supported\n");
+ if (t != mptcp_context_type && t != mptcp_data_type) {
+ bpf_log(log, "only access to subflow_context or sched_data is supported\n");
return -EACCES;
}
@@ -50,14 +51,18 @@ static int bpf_mptcp_sched_btf_struct_access(struct bpf_verifier_log *log,
case offsetof(struct mptcp_subflow_context, scheduled):
end = offsetofend(struct mptcp_subflow_context, scheduled);
break;
+ case offsetof(struct mptcp_sched_data, last_snd):
+ end = offsetofend(struct mptcp_sched_data, last_snd);
+ break;
default:
- bpf_log(log, "no write support to mptcp_subflow_context at off %d\n", off);
+ bpf_log(log, "no write support to %s at off %d\n",
+ t == mptcp_context_type ? "subflow_context" : "sched_data", off);
return -EACCES;
}
if (off + size > end) {
- bpf_log(log, "access beyond mptcp_subflow_context at off %u size %u ended at %zu",
- off, size, end);
+ bpf_log(log, "access beyond %s at off %u size %u ended at %zu",
+ t == mptcp_context_type ? "subflow_context" : "sched_data", off, size, end);
return -EACCES;
}
@@ -141,8 +146,15 @@ static int bpf_mptcp_sched_init(struct btf *btf)
BTF_KIND_STRUCT);
if (type_id < 0)
return -EINVAL;
- mptcp_sched_id = type_id;
- mptcp_sched_type = btf_type_by_id(btf, mptcp_sched_id);
+ mptcp_context_id = type_id;
+ mptcp_context_type = btf_type_by_id(btf, mptcp_context_id);
+
+ type_id = btf_find_by_name_kind(btf, "mptcp_sched_data",
+ BTF_KIND_STRUCT);
+ if (type_id < 0)
+ return -EINVAL;
+ mptcp_data_id = type_id;
+ mptcp_data_type = btf_type_by_id(btf, mptcp_data_id);
return 0;
}
--
2.35.3
* [PATCH mptcp-next v9 10/23] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (8 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 09/23] Squash to "bpf: Add bpf_mptcp_sched_ops" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 11/23] Squash to "selftests/bpf: Add mptcp sched structs" Geliang Tang
` (12 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Export mptcp_subflow_ctx_by_pos too.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/bpf.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index e2ed4223617a..0558e6a0e794 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -172,6 +172,7 @@ struct bpf_struct_ops bpf_mptcp_sched_ops = {
BTF_SET8_START(bpf_mptcp_sched_kfunc_ids)
BTF_ID_FLAGS(func, mptcp_subflow_set_scheduled)
BTF_ID_FLAGS(func, mptcp_sched_data_set_contexts)
+BTF_ID_FLAGS(func, mptcp_subflow_ctx_by_pos)
BTF_SET8_END(bpf_mptcp_sched_kfunc_ids)
static const struct btf_kfunc_id_set bpf_mptcp_sched_kfunc_set = {
--
2.35.3
* [PATCH mptcp-next v9 11/23] Squash to "selftests/bpf: Add mptcp sched structs"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (9 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 10/23] Squash to "bpf: Add bpf_mptcp_sched_kfunc_set" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:20 ` [PATCH mptcp-next v9 12/23] Squash to "selftests/bpf: Add bpf_first scheduler" Geliang Tang
` (11 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Use two tabs before reinject.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
tools/testing/selftests/bpf/bpf_tcp_helpers.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index 72c618037386..60b9ad35112b 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -239,8 +239,8 @@ struct mptcp_subflow_context {
} __attribute__((preserve_access_index));
struct mptcp_sched_data {
- bool reinject;
- struct mptcp_subflow_context *contexts[MPTCP_SUBFLOWS_MAX];
+ bool reinject;
+ __u8 subflows;
} __attribute__((preserve_access_index));
struct mptcp_sched_ops {
@@ -262,6 +262,7 @@ struct mptcp_sock {
struct sock *last_snd;
__u32 token;
struct sock *first;
+ struct mptcp_sched_data sched_data;
char ca_name[TCP_CA_NAME_MAX];
} __attribute__((preserve_access_index));
@@ -269,5 +270,7 @@ extern void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow,
bool scheduled) __ksym;
extern void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
struct mptcp_sched_data *data) __ksym;
+extern struct mptcp_subflow_context *
+mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos) __ksym;
#endif
--
2.35.3
* [PATCH mptcp-next v9 12/23] Squash to "selftests/bpf: Add bpf_first scheduler"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (10 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 11/23] Squash to "selftests/bpf: Add mptcp sched structs" Geliang Tang
@ 2023-06-14 2:20 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 13/23] Squash to "selftests/bpf: Add bpf_bkup scheduler" Geliang Tang
` (10 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:20 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Update API.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
tools/testing/selftests/bpf/progs/mptcp_bpf_first.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_first.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_first.c
index e4caa2dd8c6f..b9cc4a5a7125 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf_first.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_first.c
@@ -25,7 +25,7 @@ void BPF_STRUCT_OPS(bpf_first_data_init, const struct mptcp_sock *msk,
int BPF_STRUCT_OPS(bpf_first_get_subflow, const struct mptcp_sock *msk,
struct mptcp_sched_data *data)
{
- mptcp_subflow_set_scheduled(data->contexts[0], true);
+ mptcp_subflow_set_scheduled(mptcp_subflow_ctx_by_pos(msk, 0), true);
return 0;
}
--
2.35.3
* [PATCH mptcp-next v9 13/23] Squash to "selftests/bpf: Add bpf_bkup scheduler"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (11 preceding siblings ...)
2023-06-14 2:20 ` [PATCH mptcp-next v9 12/23] Squash to "selftests/bpf: Add bpf_first scheduler" Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 14/23] Squash to "selftests/bpf: Add bpf_rr scheduler" Geliang Tang
` (9 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Update API.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
tools/testing/selftests/bpf/progs/mptcp_bpf_bkup.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_bkup.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_bkup.c
index b2724426676e..a8c53f93bf9f 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf_bkup.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_bkup.c
@@ -27,17 +27,20 @@ int BPF_STRUCT_OPS(bpf_bkup_get_subflow, const struct mptcp_sock *msk,
{
int nr = 0;
- for (int i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
- if (!data->contexts[i])
+ for (int i = 0; i < data->subflows && i < MPTCP_SUBFLOWS_MAX; i++) {
+ struct mptcp_subflow_context *subflow;
+
+ subflow = mptcp_subflow_ctx_by_pos(msk, i);
+ if (!subflow)
break;
- if (!BPF_CORE_READ_BITFIELD_PROBED(data->contexts[i], backup)) {
+ if (!BPF_CORE_READ_BITFIELD_PROBED(subflow, backup)) {
nr = i;
break;
}
}
- mptcp_subflow_set_scheduled(data->contexts[nr], true);
+ mptcp_subflow_set_scheduled(mptcp_subflow_ctx_by_pos(msk, nr), true);
return 0;
}
--
2.35.3
* [PATCH mptcp-next v9 14/23] Squash to "selftests/bpf: Add bpf_rr scheduler"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (12 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 13/23] Squash to "selftests/bpf: Add bpf_bkup scheduler" Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 15/23] Squash to "selftests/bpf: Add bpf_red scheduler" Geliang Tang
` (8 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Use data->last_snd instead of msk->last_snd.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
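Reader note: the round-robin step that this patch moves onto
data->last_snd can be sketched in plain userspace C as below. This is
only an illustration under stated assumptions: fake_rr_next() and
FAKE_SUBFLOWS_MAX are hypothetical names, the subflow sockets are
reduced to opaque pointers, and the sketch wraps to position 0 when
the last sender is the final subflow or is no longer present.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_SUBFLOWS_MAX 8	/* hypothetical cap, mirrors MPTCP_SUBFLOWS_MAX */

/*
 * Return the position to schedule next. 'socks' holds one opaque pointer
 * per subflow; 'last_snd' is the pointer used for the previous send, or
 * NULL if nothing has been sent yet.
 */
int fake_rr_next(const void *const *socks, int n, const void *last_snd)
{
	assert(n > 0 && n <= FAKE_SUBFLOWS_MAX);

	if (!last_snd)
		return 0;

	for (int i = 0; i < n; i++) {
		if (socks[i] == last_snd)
			return i + 1 < n ? i + 1 : 0;	/* wrap to the first */
	}
	return 0;	/* last_snd no longer present: start over */
}
```

After picking a position, the caller would record socks[position] as
the new last_snd, which is what the final hunk of this patch does with
data->last_snd.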
tools/testing/selftests/bpf/bpf_tcp_helpers.h | 7 ++++++-
tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c | 14 +++++++++-----
2 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index 60b9ad35112b..07f007eb5cbb 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -239,6 +239,7 @@ struct mptcp_subflow_context {
} __attribute__((preserve_access_index));
struct mptcp_sched_data {
+ struct sock *last_snd;
bool reinject;
__u8 subflows;
} __attribute__((preserve_access_index));
@@ -259,7 +260,6 @@ struct mptcp_sched_ops {
struct mptcp_sock {
struct inet_connection_sock sk;
- struct sock *last_snd;
__u32 token;
struct sock *first;
struct mptcp_sched_data sched_data;
@@ -272,5 +272,10 @@ extern void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
struct mptcp_sched_data *data) __ksym;
extern struct mptcp_subflow_context *
mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos) __ksym;
+static inline struct sock *
+mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
+{
+ return subflow->tcp_sock;
+}
#endif
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c
index e101428e5906..7c4c5f3b1134 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_rr.c
@@ -25,14 +25,16 @@ void BPF_STRUCT_OPS(bpf_rr_data_init, const struct mptcp_sock *msk,
int BPF_STRUCT_OPS(bpf_rr_get_subflow, const struct mptcp_sock *msk,
struct mptcp_sched_data *data)
{
+ struct mptcp_subflow_context *subflow;
int nr = 0;
- for (int i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
- if (!msk->last_snd || !data->contexts[i])
+ for (int i = 0; i < data->subflows && i < MPTCP_SUBFLOWS_MAX; i++) {
+ subflow = mptcp_subflow_ctx_by_pos(msk, i);
+ if (!data->last_snd || !subflow)
break;
- if (data->contexts[i]->tcp_sock == msk->last_snd) {
- if (i + 1 == MPTCP_SUBFLOWS_MAX || !data->contexts[i + 1])
+ if (mptcp_subflow_tcp_sock(subflow) == data->last_snd) {
+ if (i + 1 == MPTCP_SUBFLOWS_MAX || !mptcp_subflow_ctx_by_pos(msk, i + 1))
break;
nr = i + 1;
@@ -40,7 +42,9 @@ int BPF_STRUCT_OPS(bpf_rr_get_subflow, const struct mptcp_sock *msk,
}
}
- mptcp_subflow_set_scheduled(data->contexts[nr], true);
+ subflow = mptcp_subflow_ctx_by_pos(msk, nr);
+ mptcp_subflow_set_scheduled(subflow, true);
+ data->last_snd = mptcp_subflow_tcp_sock(subflow);
return 0;
}
--
2.35.3
* [PATCH mptcp-next v9 15/23] Squash to "selftests/bpf: Add bpf_red scheduler"
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (13 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 14/23] Squash to "selftests/bpf: Add bpf_rr scheduler" Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 16/23] mptcp: add two wrappers needed by bpf_burst Geliang Tang
` (7 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Update API.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
tools/testing/selftests/bpf/progs/mptcp_bpf_red.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_red.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_red.c
index 30dd6f521b7f..320d46d78b2e 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_bpf_red.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_red.c
@@ -25,11 +25,11 @@ void BPF_STRUCT_OPS(bpf_red_data_init, const struct mptcp_sock *msk,
int BPF_STRUCT_OPS(bpf_red_get_subflow, const struct mptcp_sock *msk,
struct mptcp_sched_data *data)
{
- for (int i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
- if (!data->contexts[i])
+ for (int i = 0; i < data->subflows && i < MPTCP_SUBFLOWS_MAX; i++) {
+ if (!mptcp_subflow_ctx_by_pos(msk, i))
break;
- mptcp_subflow_set_scheduled(data->contexts[i], true);
+ mptcp_subflow_set_scheduled(mptcp_subflow_ctx_by_pos(msk, i), true);
}
return 0;
--
2.35.3
* [PATCH mptcp-next v9 16/23] mptcp: add two wrappers needed by bpf_burst
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (14 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 15/23] Squash to "selftests/bpf: Add bpf_red scheduler" Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 17/23] bpf: Add bpf_burst write accesses Geliang Tang
` (6 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
sk_stream_memory_free() and tcp_rtx_and_write_queues_empty() need to be
exported to the BPF context for the bpf_burst scheduler, but both are
inline functions. This patch adds a wrapper for each of them and
exports the wrappers to the BPF context instead.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
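Reader note: the wrapper pattern described above can be illustrated
with a small userspace sketch. An inline helper may be folded into its
callers and leave no symbol behind, whereas an ordinary function always
has an address that a lookup table can reference. All names below
(struct fake_sock, memory_free_inline(), memory_free_wrapper(),
exported_pred()) are hypothetical stand-ins, and the predicate is a
simplification, not the kernel's sk_stream_memory_free().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; not the kernel definition. */
struct fake_sock {
	int wmem_queued;
	int sndbuf;
};

/* Inline helper: may be folded into callers, leaving no symbol to export. */
static inline bool memory_free_inline(const struct fake_sock *sk)
{
	return sk->wmem_queued < sk->sndbuf;
}

/* Out-of-line wrapper: an ordinary function with external linkage. */
bool memory_free_wrapper(const struct fake_sock *sk)
{
	assert(sk != NULL);
	return memory_free_inline(sk);
}

/* Code that needs a function address (as an export table does) uses the wrapper. */
typedef bool (*sock_pred_t)(const struct fake_sock *sk);

sock_pred_t exported_pred(void)
{
	return memory_free_wrapper;
}
```

The wrapper adds one call of overhead but keeps the inline fast path
unchanged for existing in-kernel callers.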
net/mptcp/bpf.c | 10 ++++++++++
net/mptcp/protocol.h | 2 ++
2 files changed, 12 insertions(+)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index 0558e6a0e794..066a02bb26bd 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -169,6 +169,16 @@ struct bpf_struct_ops bpf_mptcp_sched_ops = {
.name = "mptcp_sched_ops",
};
+bool bpf_sk_stream_memory_free(const struct sock *sk)
+{
+ return sk_stream_memory_free(sk);
+}
+
+bool bpf_tcp_rtx_and_write_queues_empty(const struct sock *sk)
+{
+ return tcp_rtx_and_write_queues_empty(sk);
+}
+
BTF_SET8_START(bpf_mptcp_sched_kfunc_ids)
BTF_ID_FLAGS(func, mptcp_subflow_set_scheduled)
BTF_ID_FLAGS(func, mptcp_sched_data_set_contexts)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index e41b76f16729..784f4643831c 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -669,6 +669,8 @@ void mptcp_sched_data_set_contexts(const struct mptcp_sock *msk,
struct mptcp_sched_data *data);
struct mptcp_subflow_context *
mptcp_subflow_ctx_by_pos(const struct mptcp_sock *msk, unsigned int pos);
+bool bpf_sk_stream_memory_free(const struct sock *sk);
+bool bpf_tcp_rtx_and_write_queues_empty(const struct sock *sk);
struct sock *mptcp_subflow_get_send(const struct mptcp_sock *msk,
struct mptcp_sched_data *data);
struct sock *mptcp_subflow_get_retrans(const struct mptcp_sock *msk);
--
2.35.3
* [PATCH mptcp-next v9 17/23] bpf: Add bpf_burst write accesses
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (15 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 16/23] mptcp: add two wrappers needed by bpf_burst Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 18/23] bpf: Export more bpf_burst related functions Geliang Tang
` (5 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Add write access to the avg_pacing_rate field of struct
mptcp_subflow_context and the snd_burst field of struct mptcp_sched_data
in .btf_struct_access. Both will be used in the bpf_burst selftests.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
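Reader note: the offset whitelist that this patch extends can be
sketched in userspace as below. It is a minimal illustration, assuming
a made-up struct fake_sched_data and a made-up fake_write_access()
helper; the return values are simplified and do not reproduce the
verifier's conventions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same idea as the kernel's offsetofend(): first byte past a member. */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical, simplified layout; not the real struct mptcp_sched_data. */
struct fake_sched_data {
	void *last_snd;
	int snd_burst;
	int read_only;
};

/* Return 0 if a write of 'size' bytes at 'off' is allowed, else -EACCES. */
int fake_write_access(size_t off, size_t size)
{
	size_t end;

	assert(size > 0);

	if (off == offsetof(struct fake_sched_data, last_snd))
		end = OFFSETOFEND(struct fake_sched_data, last_snd);
	else if (off == offsetof(struct fake_sched_data, snd_burst))
		end = OFFSETOFEND(struct fake_sched_data, snd_burst);
	else
		return -EACCES;	/* no write support at this offset */

	/* The write must not run past the end of the whitelisted member. */
	return off + size <= end ? 0 : -EACCES;
}
```

Adding a writable field then amounts to adding one more offset/end
pair, which is the shape of the two new cases in the hunk below.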
net/mptcp/bpf.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index 066a02bb26bd..97fc0dc1f3d9 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -51,9 +51,15 @@ static int bpf_mptcp_sched_btf_struct_access(struct bpf_verifier_log *log,
case offsetof(struct mptcp_subflow_context, scheduled):
end = offsetofend(struct mptcp_subflow_context, scheduled);
break;
+ case offsetof(struct mptcp_subflow_context, avg_pacing_rate):
+ end = offsetofend(struct mptcp_subflow_context, avg_pacing_rate);
+ break;
case offsetof(struct mptcp_sched_data, last_snd):
end = offsetofend(struct mptcp_sched_data, last_snd);
break;
+ case offsetof(struct mptcp_sched_data, snd_burst):
+ end = offsetofend(struct mptcp_sched_data, snd_burst);
+ break;
default:
bpf_log(log, "no write support to %s at off %d\n",
t == mptcp_context_type ? "subflow_context" : "sched_data", off);
--
2.35.3
* [PATCH mptcp-next v9 18/23] bpf: Export more bpf_burst related functions
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (16 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 17/23] bpf: Add bpf_burst write accesses Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 19/23] selftests/bpf: Add bpf_burst scheduler Geliang Tang
` (4 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Add more bpf_burst-related functions to bpf_mptcp_sched_kfunc_set so that
these helpers can be called from the BPF context. mptcp_wnd_end() and
mptcp_set_timeout() are made non-static for this purpose.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
net/mptcp/bpf.c | 6 ++++++
net/mptcp/protocol.c | 4 ++--
net/mptcp/protocol.h | 2 ++
3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index 97fc0dc1f3d9..15f399317ac2 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -189,6 +189,12 @@ BTF_SET8_START(bpf_mptcp_sched_kfunc_ids)
BTF_ID_FLAGS(func, mptcp_subflow_set_scheduled)
BTF_ID_FLAGS(func, mptcp_sched_data_set_contexts)
BTF_ID_FLAGS(func, mptcp_subflow_ctx_by_pos)
+BTF_ID_FLAGS(func, mptcp_subflow_active)
+BTF_ID_FLAGS(func, mptcp_set_timeout)
+BTF_ID_FLAGS(func, mptcp_wnd_end)
+BTF_ID_FLAGS(func, bpf_sk_stream_memory_free)
+BTF_ID_FLAGS(func, bpf_tcp_rtx_and_write_queues_empty)
+BTF_ID_FLAGS(func, mptcp_pm_subflow_chk_stale)
BTF_SET8_END(bpf_mptcp_sched_kfunc_ids)
static const struct btf_kfunc_id_set bpf_mptcp_sched_kfunc_set = {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index cd96a0fb369c..0093128a8c75 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -50,7 +50,7 @@ DEFINE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions);
static struct net_device mptcp_napi_dev;
/* Returns end sequence number of the receiver's advertised window */
-static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
+u64 mptcp_wnd_end(const struct mptcp_sock *msk)
{
return READ_ONCE(msk->wnd_end);
}
@@ -498,7 +498,7 @@ static long mptcp_timeout_from_subflow(const struct mptcp_subflow_context *subfl
inet_csk(ssk)->icsk_timeout - jiffies : 0;
}
-static void mptcp_set_timeout(struct sock *sk)
+void mptcp_set_timeout(struct sock *sk)
{
struct mptcp_subflow_context *subflow;
long tout = 0;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 784f4643831c..88a8c72ecdcf 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -638,6 +638,8 @@ void __mptcp_subflow_send_ack(struct sock *ssk);
void mptcp_subflow_reset(struct sock *ssk);
void mptcp_subflow_queue_clean(struct sock *sk, struct sock *ssk);
void mptcp_sock_graft(struct sock *sk, struct socket *parent);
+u64 mptcp_wnd_end(const struct mptcp_sock *msk);
+void mptcp_set_timeout(struct sock *sk);
struct socket *__mptcp_nmpc_socket(struct mptcp_sock *msk);
bool __mptcp_close(struct sock *sk, long timeout);
void mptcp_cancel_work(struct sock *sk);
--
2.35.3
* [PATCH mptcp-next v9 19/23] selftests/bpf: Add bpf_burst scheduler
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (17 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 18/23] bpf: Export more bpf_burst related functions Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 20/23] selftests/bpf: Add bpf_burst test Geliang Tang
` (3 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch implements the burst BPF MPTCP scheduler, named bpf_burst,
which mirrors the default scheduler in protocol.c. bpf_burst_get_send()
uses the same logic as mptcp_subflow_get_send(), and
bpf_burst_get_retrans() uses the same logic as
mptcp_subflow_get_retrans().
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
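Reader note: the core arithmetic of the send path can be sketched in
userspace as below. It is a simplified illustration, not the scheduler
itself: struct fake_subflow, fake_pick_subflow() and
fake_avg_pacing_rate() are hypothetical names, and the sketch keeps a
single best candidate, whereas the patch tracks active and backup
subflows separately.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_SUBFLOWS_MAX 8	/* hypothetical cap, mirrors MPTCP_SUBFLOWS_MAX */

/* Hypothetical per-subflow snapshot; field names are illustrative only. */
struct fake_subflow {
	uint32_t wmem_queued;	/* bytes already queued on the subflow */
	uint32_t pacing_rate;	/* bytes per second, 0 means unknown */
};

/*
 * Return the index of the subflow with the smallest estimated linger time,
 * or FAKE_SUBFLOWS_MAX if none has a usable pacing rate.
 */
int fake_pick_subflow(const struct fake_subflow *sf, int n)
{
	uint64_t best = UINT64_MAX;
	int pick = FAKE_SUBFLOWS_MAX;

	assert(n >= 0 && n <= FAKE_SUBFLOWS_MAX);

	for (int i = 0; i < n; i++) {
		uint64_t linger;

		if (!sf[i].pacing_rate)
			continue;	/* avoid dividing by zero */

		/* Scale by 2^32 so the integer division keeps precision. */
		linger = ((uint64_t)sf[i].wmem_queued << 32) / sf[i].pacing_rate;
		if (linger < best) {
			best = linger;
			pick = i;
		}
	}
	return pick;
}

/* Weighted average: old rate weighted by wmem, new rate weighted by burst. */
uint64_t fake_avg_pacing_rate(uint64_t avg, uint32_t wmem,
			      uint64_t rate, uint32_t burst)
{
	assert((uint64_t)burst + wmem > 0);
	return (avg * wmem + rate * burst) / ((uint64_t)burst + wmem);
}
```

The linger time is queued bytes divided by pacing rate, so the subflow
expected to drain its queue soonest wins.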
tools/testing/selftests/bpf/bpf_tcp_helpers.h | 5 +
.../selftests/bpf/progs/mptcp_bpf_burst.c | 189 ++++++++++++++++++
2 files changed, 194 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c
diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index 07f007eb5cbb..a0d4d05c0642 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -36,6 +36,7 @@ enum sk_pacing {
struct sock {
struct sock_common __sk_common;
#define sk_state __sk_common.skc_state
+ int sk_wmem_queued;
unsigned long sk_pacing_rate;
__u32 sk_pacing_status; /* see enum sk_pacing */
} __attribute__((preserve_access_index));
@@ -234,12 +235,15 @@ extern void tcp_cong_avoid_ai(struct tcp_sock *tp, __u32 w, __u32 acked) __ksym;
#define MPTCP_SUBFLOWS_MAX 8
struct mptcp_subflow_context {
+ unsigned long avg_pacing_rate;
__u32 backup : 1;
+ __u8 stale_count;
struct sock *tcp_sock; /* tcp sk backpointer */
} __attribute__((preserve_access_index));
struct mptcp_sched_data {
struct sock *last_snd;
+ int snd_burst;
bool reinject;
__u8 subflows;
} __attribute__((preserve_access_index));
@@ -260,6 +264,7 @@ struct mptcp_sched_ops {
struct mptcp_sock {
struct inet_connection_sock sk;
+ __u64 snd_nxt;
__u32 token;
struct sock *first;
struct mptcp_sched_data sched_data;
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c
new file mode 100644
index 000000000000..7ff27c528bcb
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_burst.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023, SUSE. */
+
+#include <linux/bpf.h>
+#include <limits.h>
+#include "bpf_tcp_helpers.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define MPTCP_SEND_BURST_SIZE 65428
+
+struct subflow_send_info {
+ __u8 subflow_id;
+ __u64 linger_time;
+};
+
+static inline __u64 div_u64_rem(__u64 dividend, __u32 divisor, __u32 *remainder)
+{
+ *remainder = dividend % divisor;
+ return dividend / divisor;
+}
+
+static inline __u64 div_u64(__u64 dividend, __u32 divisor)
+{
+ __u32 remainder;
+
+ return div_u64_rem(dividend, divisor, &remainder);
+}
+
+extern bool mptcp_subflow_active(struct mptcp_subflow_context *subflow) __ksym;
+extern void mptcp_set_timeout(struct sock *sk) __ksym;
+extern __u64 mptcp_wnd_end(const struct mptcp_sock *msk) __ksym;
+extern bool bpf_sk_stream_memory_free(const struct sock *sk) __ksym;
+extern bool bpf_tcp_rtx_and_write_queues_empty(const struct sock *sk) __ksym;
+extern void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk) __ksym;
+
+#define SSK_MODE_ACTIVE 0
+#define SSK_MODE_BACKUP 1
+#define SSK_MODE_MAX 2
+
+SEC("struct_ops/mptcp_sched_burst_init")
+void BPF_PROG(mptcp_sched_burst_init, const struct mptcp_sock *msk)
+{
+}
+
+SEC("struct_ops/mptcp_sched_burst_release")
+void BPF_PROG(mptcp_sched_burst_release, const struct mptcp_sock *msk)
+{
+}
+
+void BPF_STRUCT_OPS(bpf_burst_data_init, const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ mptcp_sched_data_set_contexts(msk, data);
+}
+
+static int bpf_burst_get_send(const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ struct subflow_send_info send_info[SSK_MODE_MAX];
+ struct mptcp_subflow_context *subflow;
+ struct sock *sk = (struct sock *)msk;
+ __u32 pace, burst, wmem;
+ __u64 linger_time;
+ struct sock *ssk;
+ int i;
+
+ /* pick the subflow with the lower wmem/wspace ratio */
+ for (i = 0; i < SSK_MODE_MAX; ++i) {
+ send_info[i].subflow_id = MPTCP_SUBFLOWS_MAX;
+ send_info[i].linger_time = -1;
+ }
+
+ for (i = 0; i < data->subflows && i < MPTCP_SUBFLOWS_MAX; i++) {
+ subflow = mptcp_subflow_ctx_by_pos(msk, i);
+ if (!subflow)
+ break;
+
+ ssk = mptcp_subflow_tcp_sock(subflow);
+ if (!mptcp_subflow_active(subflow))
+ continue;
+
+ pace = subflow->avg_pacing_rate;
+ if (!pace) {
+ /* init pacing rate from socket */
+ subflow->avg_pacing_rate = ssk->sk_pacing_rate;
+ pace = subflow->avg_pacing_rate;
+ if (!pace)
+ continue;
+ }
+
+ linger_time = div_u64((__u64)ssk->sk_wmem_queued << 32, pace);
+ if (linger_time < send_info[subflow->backup].linger_time) {
+ send_info[subflow->backup].subflow_id = i;
+ send_info[subflow->backup].linger_time = linger_time;
+ }
+ }
+ mptcp_set_timeout(sk);
+
+ /* pick the best backup if no other subflow is active */
+ if (send_info[SSK_MODE_ACTIVE].subflow_id == MPTCP_SUBFLOWS_MAX)
+ send_info[SSK_MODE_ACTIVE].subflow_id = send_info[SSK_MODE_BACKUP].subflow_id;
+
+ subflow = mptcp_subflow_ctx_by_pos(msk, send_info[SSK_MODE_ACTIVE].subflow_id);
+ if (!subflow)
+ return -1;
+ ssk = mptcp_subflow_tcp_sock(subflow);
+ if (!ssk || !bpf_sk_stream_memory_free(ssk))
+ return -1;
+
+ burst = min(MPTCP_SEND_BURST_SIZE, mptcp_wnd_end(msk) - msk->snd_nxt);
+ wmem = ssk->sk_wmem_queued;
+ if (!burst)
+ goto out;
+
+ subflow->avg_pacing_rate = div_u64((__u64)subflow->avg_pacing_rate * wmem +
+ ssk->sk_pacing_rate * burst,
+ burst + wmem);
+ data->snd_burst = burst;
+
+out:
+ mptcp_subflow_set_scheduled(subflow, true);
+ return 0;
+}
+
+static int bpf_burst_get_retrans(const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ int backup = MPTCP_SUBFLOWS_MAX, pick = MPTCP_SUBFLOWS_MAX, subflow_id;
+ struct mptcp_subflow_context *subflow;
+ int min_stale_count = INT_MAX;
+ struct sock *ssk;
+
+ for (int i = 0; i < data->subflows && i < MPTCP_SUBFLOWS_MAX; i++) {
+ subflow = mptcp_subflow_ctx_by_pos(msk, i);
+ if (!subflow)
+ break;
+
+ if (!mptcp_subflow_active(subflow))
+ continue;
+
+ ssk = mptcp_subflow_tcp_sock(subflow);
+ /* still data outstanding at TCP level? skip this */
+ if (!bpf_tcp_rtx_and_write_queues_empty(ssk)) {
+ mptcp_pm_subflow_chk_stale(msk, ssk);
+ min_stale_count = min(min_stale_count, subflow->stale_count);
+ continue;
+ }
+
+ if (subflow->backup) {
+ if (backup == MPTCP_SUBFLOWS_MAX)
+ backup = i;
+ continue;
+ }
+
+ if (pick == MPTCP_SUBFLOWS_MAX)
+ pick = i;
+ }
+
+ if (pick < MPTCP_SUBFLOWS_MAX) {
+ subflow_id = pick;
+ goto out;
+ }
+ subflow_id = min_stale_count > 1 ? backup : MPTCP_SUBFLOWS_MAX;
+
+out:
+ subflow = mptcp_subflow_ctx_by_pos(msk, subflow_id);
+ if (!subflow)
+ return -1;
+ mptcp_subflow_set_scheduled(subflow, true);
+ return 0;
+}
+
+int BPF_STRUCT_OPS(bpf_burst_get_subflow, const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ if (data->reinject)
+ return bpf_burst_get_retrans(msk, data);
+ return bpf_burst_get_send(msk, data);
+}
+
+SEC(".struct_ops")
+struct mptcp_sched_ops burst = {
+ .init = (void *)mptcp_sched_burst_init,
+ .release = (void *)mptcp_sched_burst_release,
+ .data_init = (void *)bpf_burst_data_init,
+ .get_subflow = (void *)bpf_burst_get_subflow,
+ .name = "bpf_burst",
+};
--
2.35.3
* [PATCH mptcp-next v9 20/23] selftests/bpf: Add bpf_burst test
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (18 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 19/23] selftests/bpf: Add bpf_burst scheduler Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 21/23] bpf: Add subflow bit flags write accesses Geliang Tang
` (2 subsequent siblings)
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch adds the burst BPF MPTCP scheduler test, test_burst(). It uses
sysctl to set net.mptcp.scheduler to this scheduler, adds two veth net
devices to simulate the multiple-address case, and uses the
'ip mptcp endpoint' command to add the new endpoint ADDR_2 to the netlink
PM. It then sends data and checks bytes_sent in the 'ss' output to make
sure the data has been sent on both net devices.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
.../testing/selftests/bpf/prog_tests/mptcp.c | 38 +++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index a968641cc94a..b9f6dcf995fd 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -10,6 +10,7 @@
#include "mptcp_bpf_bkup.skel.h"
#include "mptcp_bpf_rr.skel.h"
#include "mptcp_bpf_red.skel.h"
+#include "mptcp_bpf_burst.skel.h"
char NS_TEST[32];
@@ -455,6 +456,41 @@ static void test_red(void)
mptcp_bpf_red__destroy(red_skel);
}
+static void test_burst(void)
+{
+ struct mptcp_bpf_burst *burst_skel;
+ int server_fd, client_fd;
+ struct nstoken *nstoken;
+ struct bpf_link *link;
+
+ burst_skel = mptcp_bpf_burst__open_and_load();
+ if (!ASSERT_OK_PTR(burst_skel, "bpf_burst__open_and_load"))
+ return;
+
+ link = bpf_map__attach_struct_ops(burst_skel->maps.burst);
+ if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+ mptcp_bpf_burst__destroy(burst_skel);
+ return;
+ }
+
+ nstoken = sched_init("subflow", "bpf_burst");
+ if (!ASSERT_OK_PTR(nstoken, "sched_init:bpf_burst"))
+ goto fail;
+ server_fd = start_mptcp_server(AF_INET, ADDR_1, 0, 0);
+ client_fd = connect_to_fd(server_fd, 0);
+
+ send_data(server_fd, client_fd);
+ ASSERT_OK(has_bytes_sent(ADDR_1), "has_bytes_sent addr 1");
+ ASSERT_OK(has_bytes_sent(ADDR_2), "has_bytes_sent addr 2");
+
+ close(client_fd);
+ close(server_fd);
+fail:
+ cleanup_netns(nstoken);
+ bpf_link__destroy(link);
+ mptcp_bpf_burst__destroy(burst_skel);
+}
+
void test_mptcp(void)
{
if (test__start_subtest("base"))
@@ -467,4 +503,6 @@ void test_mptcp(void)
test_rr();
if (test__start_subtest("red"))
test_red();
+ if (test__start_subtest("burst"))
+ test_burst();
}
--
2.35.3
* [PATCH mptcp-next v9 21/23] bpf: Add subflow bit flags write accesses
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (19 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 20/23] selftests/bpf: Add bpf_burst test Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 22/23] selftests/bpf: Add bpf_stale scheduler Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 23/23] selftests/bpf: Add bpf_stale test Geliang Tang
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
Add write accesses for all bit flags of struct mptcp_subflow_context
between map_csum_len and data_avail in .btf_struct_access. The stale
flag will be used in the bpf_stale selftests.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
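Reader note: offsetof() cannot be applied to a bit-field, which is
presumably why the writable range here is bounded by the two
neighbouring members. The userspace sketch below illustrates that
idea on a made-up struct fake_subflow_ctx; the layout, the FLAGS_START
and FLAGS_END macros and fake_flags_write_ok() are all hypothetical,
and the sketch assumes a common ABI where the bit-fields occupy the
storage between the two plain members.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as the kernel's offsetofend(): first byte past a member. */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical, simplified layout; not the real struct mptcp_subflow_context. */
struct fake_subflow_ctx {
	unsigned int map_csum_len;
	unsigned int backup : 1,
		     stale : 1;
	unsigned int data_avail;
};

/* Bit-fields have no offsetof(), so bound them by their neighbours. */
#define FLAGS_START OFFSETOFEND(struct fake_subflow_ctx, map_csum_len)
#define FLAGS_END   offsetof(struct fake_subflow_ctx, data_avail)

/* Return 1 if a write at 'off' of 'size' bytes stays inside the flags, else 0. */
int fake_flags_write_ok(size_t off, size_t size)
{
	assert(size > 0);
	return off == FLAGS_START && off + size <= FLAGS_END;
}
```

A consequence of this approach is that every flag in the range becomes
writable at once, not only the one a given scheduler needs.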
net/mptcp/bpf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c
index 15f399317ac2..3b1ea220beae 100644
--- a/net/mptcp/bpf.c
+++ b/net/mptcp/bpf.c
@@ -51,6 +51,9 @@ static int bpf_mptcp_sched_btf_struct_access(struct bpf_verifier_log *log,
case offsetof(struct mptcp_subflow_context, scheduled):
end = offsetofend(struct mptcp_subflow_context, scheduled);
break;
+ case offsetofend(struct mptcp_subflow_context, map_csum_len):
+ end = offsetof(struct mptcp_subflow_context, data_avail);
+ break;
case offsetof(struct mptcp_subflow_context, avg_pacing_rate):
end = offsetofend(struct mptcp_subflow_context, avg_pacing_rate);
break;
--
2.35.3
* [PATCH mptcp-next v9 22/23] selftests/bpf: Add bpf_stale scheduler
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (20 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 21/23] bpf: Add subflow bit flags write accesses Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 2:21 ` [PATCH mptcp-next v9 23/23] selftests/bpf: Add bpf_stale test Geliang Tang
22 siblings, 0 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch implements a BPF MPTCP scheduler, named bpf_stale, that sets
the stale flag of a subflow. The flag is set in bpf_stale_data_init() and
checked in bpf_stale_get_subflow().
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
tools/testing/selftests/bpf/bpf_tcp_helpers.h | 3 +-
.../selftests/bpf/progs/mptcp_bpf_stale.c | 65 +++++++++++++++++++
2 files changed, 67 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcp_bpf_stale.c
diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index a0d4d05c0642..e52dd795cfdd 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -236,7 +236,8 @@ extern void tcp_cong_avoid_ai(struct tcp_sock *tp, __u32 w, __u32 acked) __ksym;
struct mptcp_subflow_context {
unsigned long avg_pacing_rate;
- __u32 backup : 1;
+ __u32 backup : 1,
+ stale : 1;
__u8 stale_count;
struct sock *tcp_sock; /* tcp sk backpointer */
} __attribute__((preserve_access_index));
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_stale.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_stale.c
new file mode 100644
index 000000000000..4f21211b3931
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_stale.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023, SUSE. */
+
+#include <linux/bpf.h>
+#include "bpf_tcp_helpers.h"
+
+char _license[] SEC("license") = "GPL";
+
+static void mptcp_subflow_set_stale(struct mptcp_subflow_context *subflow,
+ int stale)
+{
+ subflow->stale = stale;
+}
+
+SEC("struct_ops/mptcp_sched_stale_init")
+void BPF_PROG(mptcp_sched_stale_init, const struct mptcp_sock *msk)
+{
+}
+
+SEC("struct_ops/mptcp_sched_stale_release")
+void BPF_PROG(mptcp_sched_stale_release, const struct mptcp_sock *msk)
+{
+}
+
+void BPF_STRUCT_OPS(bpf_stale_data_init, const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_sched_data_set_contexts(msk, data);
+ subflow = mptcp_subflow_ctx_by_pos(msk, 1);
+ if (subflow)
+ mptcp_subflow_set_stale(subflow, 1);
+}
+
+int BPF_STRUCT_OPS(bpf_stale_get_subflow, const struct mptcp_sock *msk,
+ struct mptcp_sched_data *data)
+{
+ int nr = 0;
+
+ for (int i = 0; i < data->subflows && i < MPTCP_SUBFLOWS_MAX; i++) {
+ struct mptcp_subflow_context *subflow;
+
+ subflow = mptcp_subflow_ctx_by_pos(msk, i);
+ if (!subflow)
+ break;
+
+ if (!BPF_CORE_READ_BITFIELD_PROBED(subflow, stale))
+ break;
+
+ nr = i;
+ }
+
+ mptcp_subflow_set_scheduled(mptcp_subflow_ctx_by_pos(msk, nr), true);
+ return 0;
+}
+
+SEC(".struct_ops")
+struct mptcp_sched_ops stale = {
+ .init = (void *)mptcp_sched_stale_init,
+ .release = (void *)mptcp_sched_stale_release,
+ .data_init = (void *)bpf_stale_data_init,
+ .get_subflow = (void *)bpf_stale_get_subflow,
+ .name = "bpf_stale",
+};
--
2.35.3
* [PATCH mptcp-next v9 23/23] selftests/bpf: Add bpf_stale test
2023-06-14 2:20 [PATCH mptcp-next v9 00/23] BPF packet scheduler updates Geliang Tang
` (21 preceding siblings ...)
2023-06-14 2:21 ` [PATCH mptcp-next v9 22/23] selftests/bpf: Add bpf_stale scheduler Geliang Tang
@ 2023-06-14 2:21 ` Geliang Tang
2023-06-14 3:01 ` selftests/bpf: Add bpf_stale test: Build Failure MPTCP CI
` (2 more replies)
22 siblings, 3 replies; 28+ messages in thread
From: Geliang Tang @ 2023-06-14 2:21 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
This patch adds the bpf_stale scheduler test, test_stale(). It uses sysctl
to set net.mptcp.scheduler to this scheduler, adds two veth net devices to
simulate the multiple-address case, and uses the 'ip mptcp endpoint'
command to add the new endpoint ADDR_2 to the netlink PM. It then sends
data and checks bytes_sent in the 'ss' output to make sure the data has
been sent only on ADDR_1, since ADDR_2 is marked stale.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
.../testing/selftests/bpf/prog_tests/mptcp.c | 38 +++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index b9f6dcf995fd..94b7624bdfef 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -11,6 +11,7 @@
#include "mptcp_bpf_rr.skel.h"
#include "mptcp_bpf_red.skel.h"
#include "mptcp_bpf_burst.skel.h"
+#include "mptcp_bpf_stale.skel.h"
char NS_TEST[32];
@@ -491,6 +492,41 @@ static void test_burst(void)
mptcp_bpf_burst__destroy(burst_skel);
}
+static void test_stale(void)
+{
+ struct mptcp_bpf_stale *stale_skel;
+ int server_fd, client_fd;
+ struct nstoken *nstoken;
+ struct bpf_link *link;
+
+ stale_skel = mptcp_bpf_stale__open_and_load();
+ if (!ASSERT_OK_PTR(stale_skel, "bpf_stale__open_and_load"))
+ return;
+
+ link = bpf_map__attach_struct_ops(stale_skel->maps.stale);
+ if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops")) {
+ mptcp_bpf_stale__destroy(stale_skel);
+ return;
+ }
+
+ nstoken = sched_init("subflow", "bpf_stale");
+ if (!ASSERT_OK_PTR(nstoken, "sched_init:bpf_stale"))
+ goto fail;
+ server_fd = start_mptcp_server(AF_INET, ADDR_1, 0, 0);
+ client_fd = connect_to_fd(server_fd, 0);
+
+ send_data(server_fd, client_fd);
+ ASSERT_OK(has_bytes_sent(ADDR_1), "has_bytes_sent addr_1");
+ ASSERT_GT(has_bytes_sent(ADDR_2), 0, "has_bytes_sent addr_2");
+
+ close(client_fd);
+ close(server_fd);
+fail:
+ cleanup_netns(nstoken);
+ bpf_link__destroy(link);
+ mptcp_bpf_stale__destroy(stale_skel);
+}
+
void test_mptcp(void)
{
if (test__start_subtest("base"))
@@ -505,4 +541,6 @@ void test_mptcp(void)
test_red();
if (test__start_subtest("burst"))
test_burst();
+ if (test__start_subtest("stale"))
+ test_stale();
}
--
2.35.3
* Re: selftests/bpf: Add bpf_stale test: Build Failure
2023-06-14 2:21 ` [PATCH mptcp-next v9 23/23] selftests/bpf: Add bpf_stale test Geliang Tang
@ 2023-06-14 3:01 ` MPTCP CI
2023-06-14 4:41 ` selftests/bpf: Add bpf_stale test: Tests Results MPTCP CI
2023-06-14 6:22 ` MPTCP CI
2 siblings, 0 replies; 28+ messages in thread
From: MPTCP CI @ 2023-06-14 3:01 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
But sadly, our CI spotted some issues with it when trying to build it.
You can find more details there:
https://patchwork.kernel.org/project/mptcp/patch/bd10b96d43aef97369c5f197ceb1085c99dd4fab.1686709116.git.geliang.tang@suse.com/
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/5262461384
Status: failure
Initiator: MPTCPimporter
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/45a9e0e26856
Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
* Re: selftests/bpf: Add bpf_stale test: Tests Results
2023-06-14 2:21 ` [PATCH mptcp-next v9 23/23] selftests/bpf: Add bpf_stale test Geliang Tang
2023-06-14 3:01 ` selftests/bpf: Add bpf_stale test: Build Failure MPTCP CI
@ 2023-06-14 4:41 ` MPTCP CI
2023-06-14 6:22 ` MPTCP CI
2 siblings, 0 replies; 28+ messages in thread
From: MPTCP CI @ 2023-06-14 4:41 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- {"code":404,"message":"Can't find artifacts containing file conclusion.txt"}:
- Task: https://cirrus-ci.com/task/4937696309673984
- Summary: https://api.cirrus-ci.com/v1/artifact/task/4937696309673984/summary/summary.txt
- KVM Validation: debug (except selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/5500646263095296
- Summary: https://api.cirrus-ci.com/v1/artifact/task/5500646263095296/summary/summary.txt
- KVM Validation: debug (only selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/6626546169937920
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6626546169937920/summary/summary.txt
- KVM Validation: normal (only selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/6063596216516608
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6063596216516608/summary/summary.txt
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/45a9e0e26856
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-debug
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have already been made to have
a stable test suite when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
* Re: selftests/bpf: Add bpf_stale test: Tests Results
2023-06-14 2:21 ` [PATCH mptcp-next v9 23/23] selftests/bpf: Add bpf_stale test Geliang Tang
2023-06-14 3:01 ` selftests/bpf: Add bpf_stale test: Build Failure MPTCP CI
2023-06-14 4:41 ` selftests/bpf: Add bpf_stale test: Tests Results MPTCP CI
@ 2023-06-14 6:22 ` MPTCP CI
2 siblings, 0 replies; 28+ messages in thread
From: MPTCP CI @ 2023-06-14 6:22 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp
Hi Geliang,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/6422467376316416
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6422467376316416/summary/summary.txt
- KVM Validation: debug (except selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/5500646263095296
- Summary: https://api.cirrus-ci.com/v1/artifact/task/5500646263095296/summary/summary.txt
- KVM Validation: debug (only selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/6626546169937920
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6626546169937920/summary/summary.txt
- KVM Validation: normal (only selftest_mptcp_join):
- Success! ✅:
- Task: https://cirrus-ci.com/task/6063596216516608
- Summary: https://api.cirrus-ci.com/v1/artifact/task/6063596216516608/summary/summary.txt
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/45a9e0e26856
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)