* [PATCH mptcp-next v5 00/11] refactor push pending
@ 2022-10-06 12:17 Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 01/11] Squash to "mptcp: add get_subflow wrappers" Geliang Tang
` (12 more replies)
0 siblings, 13 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang
v5:
- address Mat's comments in v4.
v4:
- update __mptcp_subflow_push_pending as Mat suggested.
- add more patches from "BPF redundant scheduler" series.
v3:
- add a cleanup patch.
- remove msk->last_snd in mptcp_subflow_get_send().
- add the loop that calls the scheduler again in __mptcp_push_pending().
v2:
- add snd_burst check in dfrags loop as Mat suggested.
Refactor __mptcp_push_pending() and __mptcp_subflow_push_pending() to
remove duplicate code and to support the redundant scheduler more easily
in __mptcp_subflow_push_pending().
Geliang Tang (11):
Squash to "mptcp: add get_subflow wrappers"
mptcp: 'first' argument for subflow_push_pending
mptcp: refactor push_pending logic
mptcp: drop last_snd for burst scheduler
mptcp: simplify push_pending
mptcp: multi subflows push_pending
mptcp: use msk instead of mptcp_sk
mptcp: refactor subflow_push_pending logic
mptcp: simplify subflow_push_pending
mptcp: multi subflows subflow_push_pending
mptcp: multi subflows retrans support
net/mptcp/pm.c | 9 +-
net/mptcp/pm_netlink.c | 3 -
net/mptcp/protocol.c | 285 ++++++++++++++++++++++-------------------
net/mptcp/protocol.h | 5 +-
net/mptcp/sched.c | 61 +++++----
5 files changed, 184 insertions(+), 179 deletions(-)
--
2.35.3
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v5 01/11] Squash to "mptcp: add get_subflow wrappers"
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 02/11] mptcp: 'first' argument for subflow_push_pending Geliang Tang
` (11 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

Please update the commit log:

'''
This patch defines two new wrappers mptcp_sched_get_send() and
mptcp_sched_get_retrans(), invoke get_subflow() of msk->sched in them.

Set the subflow pointers array in struct mptcp_sched_data before invoking
get_subflow(), then it can be used in get_subflow() in the BPF contexts.

Check the subflow scheduled flags to test which subflow or subflows are
picked by the scheduler.

Move sock_owned_by_me() and the fallback check code from
mptcp_subflow_get_send/retrans() into the wrappers.
'''
---
 net/mptcp/protocol.c |  8 +++---
 net/mptcp/protocol.h |  4 +--
 net/mptcp/sched.c    | 59 +++++++++++++++++++++-----------------------
 3 files changed, 34 insertions(+), 37 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 8feb684408f7..d500a00fa778 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1547,7 +1547,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 			int ret = 0;
 
 			prev_ssk = ssk;
-			ssk = mptcp_sched_get_send(msk);
+			ssk = mptcp_subflow_get_send(msk);
 
 			/* First check. If the ssk has changed since
 			 * the last round, release prev_ssk
@@ -1616,7 +1616,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
 			 * check for a different subflow usage only after
 			 * spooling the first chunk of data
 			 */
-			xmit_ssk = first ? ssk : mptcp_sched_get_send(mptcp_sk(sk));
+			xmit_ssk = first ? ssk : mptcp_subflow_get_send(mptcp_sk(sk));
 			if (!xmit_ssk)
 				goto out;
 			if (xmit_ssk != ssk) {
@@ -2478,7 +2478,7 @@ static void __mptcp_retrans(struct sock *sk)
 	mptcp_clean_una_wakeup(sk);
 
 	/* first check ssk: need to kick "stale" logic */
-	ssk = mptcp_sched_get_retrans(msk);
+	ssk = mptcp_subflow_get_retrans(msk);
 	dfrag = mptcp_rtx_head(sk);
 	if (!dfrag) {
 		if (mptcp_data_fin_enabled(msk)) {
@@ -3196,7 +3196,7 @@ void __mptcp_check_push(struct sock *sk, struct sock *ssk)
 		return;
 
 	if (!sock_owned_by_user(sk)) {
-		struct sock *xmit_ssk = mptcp_sched_get_send(mptcp_sk(sk));
+		struct sock *xmit_ssk = mptcp_subflow_get_send(mptcp_sk(sk));
 
 		if (xmit_ssk == ssk)
 			__mptcp_subflow_push_pending(sk, ssk);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 93c535440a5c..e81399debff9 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -640,8 +640,8 @@ void mptcp_subflow_set_scheduled(struct mptcp_subflow_context *subflow,
 				 bool scheduled);
 struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk);
 struct sock *mptcp_subflow_get_retrans(struct mptcp_sock *msk);
-struct sock *mptcp_sched_get_send(struct mptcp_sock *msk);
-struct sock *mptcp_sched_get_retrans(struct mptcp_sock *msk);
+int mptcp_sched_get_send(struct mptcp_sock *msk);
+int mptcp_sched_get_retrans(struct mptcp_sock *msk);
 
 static inline bool __tcp_can_send(const struct sock *ssk)
 {
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index 044c5ec8bbfb..d1cad1eefb35 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -114,67 +114,64 @@ static int mptcp_sched_data_init(struct mptcp_sock *msk, bool reinject,
 	for (; i < MPTCP_SUBFLOWS_MAX; i++)
 		data->contexts[i] = NULL;
 
+	msk->snd_burst = 0;
+
 	return 0;
 }
 
-struct sock *mptcp_sched_get_send(struct mptcp_sock *msk)
+int mptcp_sched_get_send(struct mptcp_sock *msk)
 {
 	struct mptcp_sched_data data;
 	struct sock *ssk = NULL;
-	int i;
 
 	sock_owned_by_me((struct sock *)msk);
 
 	/* the following check is moved out of mptcp_subflow_get_send */
 	if (__mptcp_check_fallback(msk)) {
-		if (!msk->first)
-			return NULL;
-		return __tcp_can_send(msk->first) &&
-		       sk_stream_memory_free(msk->first) ? msk->first : NULL;
+		if (msk->first &&
+		    __tcp_can_send(msk->first) &&
+		    sk_stream_memory_free(msk->first)) {
+			mptcp_subflow_set_scheduled(mptcp_subflow_ctx(msk->first), true);
+			return 0;
+		}
+		return -EINVAL;
 	}
 
-	if (!msk->sched)
-		return mptcp_subflow_get_send(msk);
+	if (!msk->sched) {
+		ssk = mptcp_subflow_get_send(msk);
+		if (!ssk)
+			return -EINVAL;
+		mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true);
+		return 0;
+	}
 
 	mptcp_sched_data_init(msk, false, &data);
 	msk->sched->get_subflow(msk, &data);
 
-	for (i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
-		if (data.contexts[i] && READ_ONCE(data.contexts[i]->scheduled)) {
-			ssk = data.contexts[i]->tcp_sock;
-			msk->last_snd = ssk;
-			break;
-		}
-	}
-
-	return ssk;
+	return 0;
 }
 
-struct sock *mptcp_sched_get_retrans(struct mptcp_sock *msk)
+int mptcp_sched_get_retrans(struct mptcp_sock *msk)
 {
 	struct mptcp_sched_data data;
 	struct sock *ssk = NULL;
-	int i;
 
 	sock_owned_by_me((const struct sock *)msk);
 
 	/* the following check is moved out of mptcp_subflow_get_retrans */
 	if (__mptcp_check_fallback(msk))
-		return NULL;
+		return -EINVAL;
 
-	if (!msk->sched)
-		return mptcp_subflow_get_retrans(msk);
+	if (!msk->sched) {
+		ssk = mptcp_subflow_get_retrans(msk);
+		if (!ssk)
+			return -EINVAL;
+		mptcp_subflow_set_scheduled(mptcp_subflow_ctx(ssk), true);
+		return 0;
+	}
 
 	mptcp_sched_data_init(msk, true, &data);
 	msk->sched->get_subflow(msk, &data);
 
-	for (i = 0; i < MPTCP_SUBFLOWS_MAX; i++) {
-		if (data.contexts[i] && READ_ONCE(data.contexts[i]->scheduled)) {
-			ssk = data.contexts[i]->tcp_sock;
-			msk->last_snd = ssk;
-			break;
-		}
-	}
-
-	return ssk;
+	return 0;
 }
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v5 02/11] mptcp: 'first' argument for subflow_push_pending
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 01/11] Squash to "mptcp: add get_subflow wrappers" Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 03/11] mptcp: refactor push_pending logic Geliang Tang
` (10 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

The function mptcp_subflow_process_delegated() uses the input ssk first,
while __mptcp_check_push() invokes the packet scheduler first.

So this patch adds a new argument named first for the function
__mptcp_subflow_push_pending() to deal with these two cases separately.

With this change, the code that invokes the packet scheduler in the
function __mptcp_check_push() can be removed, and replaced by invoking
__mptcp_subflow_push_pending() directly.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index d500a00fa778..84d33393d24e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1593,7 +1593,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 		__mptcp_check_send_data_fin(sk);
 }
 
-static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
+static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
+					 bool first)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct mptcp_sendmsg_info info = {
@@ -1602,7 +1603,6 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
 	struct mptcp_data_frag *dfrag;
 	struct sock *xmit_ssk;
 	int len, copied = 0;
-	bool first = true;
 
 	info.flags = 0;
 	while ((dfrag = mptcp_send_head(sk))) {
@@ -1612,8 +1612,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
 		while (len > 0) {
 			int ret = 0;
 
-			/* the caller already invoked the packet scheduler,
-			 * check for a different subflow usage only after
+			/* check for a different subflow usage only after
 			 * spooling the first chunk of data
 			 */
 			xmit_ssk = first ? ssk : mptcp_subflow_get_send(mptcp_sk(sk));
@@ -3195,16 +3194,10 @@ void __mptcp_check_push(struct sock *sk, struct sock *ssk)
 	if (!mptcp_send_head(sk))
 		return;
 
-	if (!sock_owned_by_user(sk)) {
-		struct sock *xmit_ssk = mptcp_subflow_get_send(mptcp_sk(sk));
-
-		if (xmit_ssk == ssk)
-			__mptcp_subflow_push_pending(sk, ssk);
-		else if (xmit_ssk)
-			mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), MPTCP_DELEGATE_SEND);
-	} else {
+	if (!sock_owned_by_user(sk))
+		__mptcp_subflow_push_pending(sk, ssk, false);
+	else
 		__set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->cb_flags);
-	}
 }
 
 #define MPTCP_FLAGS_PROCESS_CTX_NEED (BIT(MPTCP_PUSH_PENDING) | \
@@ -3295,7 +3288,7 @@ void mptcp_subflow_process_delegated(struct sock *ssk)
 	if (test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) {
 		mptcp_data_lock(sk);
 		if (!sock_owned_by_user(sk))
-			__mptcp_subflow_push_pending(sk, ssk);
+			__mptcp_subflow_push_pending(sk, ssk, true);
 		else
 			__set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->cb_flags);
 		mptcp_data_unlock(sk);
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v5 03/11] mptcp: refactor push_pending logic
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 01/11] Squash to "mptcp: add get_subflow wrappers" Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 02/11] mptcp: 'first' argument for subflow_push_pending Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-11 0:38 ` Mat Martineau
2022-10-06 12:17 ` [PATCH mptcp-next v5 04/11] mptcp: drop last_snd for burst scheduler Geliang Tang
` (9 subsequent siblings)
12 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

To support redundant packet schedulers more easily, this patch refactors
__mptcp_push_pending() logic from:

For each dfrag:
	While sends succeed:
		Call the scheduler (selects subflow and msk->snd_burst)
		Update subflow locks (push/release/acquire as needed)
		Send the dfrag data with mptcp_sendmsg_frag()
		Update already_sent, snd_nxt, snd_burst
	Update msk->first_pending
Push/release on final subflow

to:

While the scheduler selects one subflow:
	Lock the subflow
	For each pending dfrag:
		While sends succeed:
			Send the dfrag data with mptcp_sendmsg_frag()
			Update already_sent, snd_nxt, snd_burst
		Update msk->first_pending
		Break if required by msk->snd_burst / etc
	Push and release the subflow

This patch also moves the burst check conditions out of the function
mptcp_subflow_get_send(), and checks them in __mptcp_push_pending() and
__mptcp_subflow_push_pending() in the inner "for each pending dfrag"
loop.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 86 ++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 47 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 84d33393d24e..bf77defbc546 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1417,14 +1417,6 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 	u64 linger_time;
 	long tout = 0;
 
-	/* re-use last subflow, if the burst allow that */
-	if (msk->last_snd && msk->snd_burst > 0 &&
-	    sk_stream_memory_free(msk->last_snd) &&
-	    mptcp_subflow_active(mptcp_subflow_ctx(msk->last_snd))) {
-		mptcp_set_timeout(sk);
-		return msk->last_snd;
-	}
-
 	/* pick the subflow with the lower wmem/wspace ratio */
 	for (i = 0; i < SSK_MODE_MAX; ++i) {
 		send_info[i].ssk = NULL;
@@ -1530,60 +1522,53 @@ void mptcp_check_and_set_pending(struct sock *sk)
 
 void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 {
-	struct sock *prev_ssk = NULL, *ssk = NULL;
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct mptcp_sendmsg_info info = {
 				.flags = flags,
 	};
 	bool do_check_data_fin = false;
 	struct mptcp_data_frag *dfrag;
+	struct sock *ssk;
 	int len;
 
-	while ((dfrag = mptcp_send_head(sk))) {
-		info.sent = dfrag->already_sent;
-		info.limit = dfrag->data_len;
-		len = dfrag->data_len - dfrag->already_sent;
-		while (len > 0) {
-			int ret = 0;
-
-			prev_ssk = ssk;
-			ssk = mptcp_subflow_get_send(msk);
-
-			/* First check. If the ssk has changed since
-			 * the last round, release prev_ssk
-			 */
-			if (ssk != prev_ssk && prev_ssk)
-				mptcp_push_release(prev_ssk, &info);
-			if (!ssk)
-				goto out;
+again:
+	while (mptcp_send_head(sk) && (ssk = mptcp_subflow_get_send(msk))) {
+		lock_sock(ssk);
 
-			/* Need to lock the new subflow only if different
-			 * from the previous one, otherwise we are still
-			 * helding the relevant lock
-			 */
-			if (ssk != prev_ssk)
-				lock_sock(ssk);
+		while ((dfrag = mptcp_send_head(sk))) {
+			info.sent = dfrag->already_sent;
+			info.limit = dfrag->data_len;
+			len = dfrag->data_len - dfrag->already_sent;
+			while (len > 0) {
+				int ret = 0;
+
+				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
+				if (ret <= 0) {
+					if (ret == -EAGAIN)
+						goto again;
+					mptcp_push_release(ssk, &info);
+					goto out;
+				}
+
+				do_check_data_fin = true;
+				info.sent += ret;
+				len -= ret;
+
+				mptcp_update_post_push(msk, dfrag, ret);
+			}
+			WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
 
-			ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
-			if (ret <= 0) {
-				if (ret == -EAGAIN)
-					continue;
+			if (msk->snd_burst <= 0 ||
+			    !sk_stream_memory_free(ssk) ||
+			    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
 				mptcp_push_release(ssk, &info);
-				goto out;
+				goto again;
 			}
-
-			do_check_data_fin = true;
-			info.sent += ret;
-			len -= ret;
-
-			mptcp_update_post_push(msk, dfrag, ret);
+			mptcp_set_timeout(sk);
 		}
-		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
-	}
 
-	/* at this point we held the socket lock for the last subflow we used */
-	if (ssk)
 		mptcp_push_release(ssk, &info);
+	}
 
 out:
 	/* ensure the rtx timer is running */
@@ -1636,6 +1621,13 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 			mptcp_update_post_push(msk, dfrag, ret);
 		}
 		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
+
+		if (msk->snd_burst <= 0 ||
+		    !sk_stream_memory_free(ssk) ||
+		    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
+			goto out;
+		}
+		mptcp_set_timeout(sk);
 	}
 
 out:
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v5 03/11] mptcp: refactor push_pending logic
2022-10-06 12:17 ` [PATCH mptcp-next v5 03/11] mptcp: refactor push_pending logic Geliang Tang
@ 2022-10-11 0:38 ` Mat Martineau
0 siblings, 0 replies; 19+ messages in thread
From: Mat Martineau @ 2022-10-11 0:38 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp

On Thu, 6 Oct 2022, Geliang Tang wrote:

> To support redundant package schedulers more easily, this patch refactors
> __mptcp_push_pending() logic from:
>
> For each dfrag:
> 	While sends succeed:
> 		Call the scheduler (selects subflow and msk->snd_burst)
> 		Update subflow locks (push/release/acquire as needed)
> 		Send the dfrag data with mptcp_sendmsg_frag()
> 		Update already_sent, snd_nxt, snd_burst
> 	Update msk->first_pending
> Push/release on final subflow
>
> to:
>
> While the scheduler selects one subflow:
> 	Lock the subflow
> 	For each pending dfrag:
> 		While sends succeed:
> 			Send the dfrag data with mptcp_sendmsg_frag()
> 			Update already_sent, snd_nxt, snd_burst
> 		Update msk->first_pending
> 		Break if required by msk->snd_burst / etc
> 	Push and release the subflow
>
> This patch alse moves the burst check conditions out of the function
> mptcp_subflow_get_send(), check them in __mptcp_push_pending() and
> __mptcp_subflow_push_pending() in the inner "for each pending dfrag"
> loop.
>
> Signed-off-by: Geliang Tang <geliang.tang@suse.com>
> ---
>  net/mptcp/protocol.c | 86 ++++++++++++++++++++------------------------
>  1 file changed, 39 insertions(+), 47 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 84d33393d24e..bf77defbc546 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1417,14 +1417,6 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
>  	u64 linger_time;
>  	long tout = 0;
>
> -	/* re-use last subflow, if the burst allow that */
> -	if (msk->last_snd && msk->snd_burst > 0 &&
> -	    sk_stream_memory_free(msk->last_snd) &&
> -	    mptcp_subflow_active(mptcp_subflow_ctx(msk->last_snd))) {
> -		mptcp_set_timeout(sk);
> -		return msk->last_snd;
> -	}
> -
>  	/* pick the subflow with the lower wmem/wspace ratio */
>  	for (i = 0; i < SSK_MODE_MAX; ++i) {
>  		send_info[i].ssk = NULL;
> @@ -1530,60 +1522,53 @@ void mptcp_check_and_set_pending(struct sock *sk)
>
>  void __mptcp_push_pending(struct sock *sk, unsigned int flags)
>  {
> -	struct sock *prev_ssk = NULL, *ssk = NULL;
>  	struct mptcp_sock *msk = mptcp_sk(sk);
>  	struct mptcp_sendmsg_info info = {
>  				.flags = flags,
>  	};
>  	bool do_check_data_fin = false;
>  	struct mptcp_data_frag *dfrag;
> +	struct sock *ssk;
>  	int len;
>
> -	while ((dfrag = mptcp_send_head(sk))) {
> -		info.sent = dfrag->already_sent;
> -		info.limit = dfrag->data_len;
> -		len = dfrag->data_len - dfrag->already_sent;
> -		while (len > 0) {
> -			int ret = 0;
> -
> -			prev_ssk = ssk;
> -			ssk = mptcp_subflow_get_send(msk);
> -
> -			/* First check. If the ssk has changed since
> -			 * the last round, release prev_ssk
> -			 */
> -			if (ssk != prev_ssk && prev_ssk)
> -				mptcp_push_release(prev_ssk, &info);
> -			if (!ssk)
> -				goto out;
> +again:
> +	while (mptcp_send_head(sk) && (ssk = mptcp_subflow_get_send(msk))) {
> +		lock_sock(ssk);
>
> -			/* Need to lock the new subflow only if different
> -			 * from the previous one, otherwise we are still
> -			 * helding the relevant lock
> -			 */
> -			if (ssk != prev_ssk)
> -				lock_sock(ssk);
> +		while ((dfrag = mptcp_send_head(sk))) {
> +			info.sent = dfrag->already_sent;
> +			info.limit = dfrag->data_len;
> +			len = dfrag->data_len - dfrag->already_sent;
> +			while (len > 0) {
> +				int ret = 0;
> +
> +				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
> +				if (ret <= 0) {
> +					if (ret == -EAGAIN)
> +						goto again;

ssk is still locked here, so the lock_sock(ssk) could deadlock or leave
the ssk locked when another is selected by mptcp_subflow_get_send().

- Mat

> +					mptcp_push_release(ssk, &info);
> +					goto out;
> +				}
> +
> +				do_check_data_fin = true;
> +				info.sent += ret;
> +				len -= ret;
> +
> +				mptcp_update_post_push(msk, dfrag, ret);
> +			}
> +			WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
>
> -			ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
> -			if (ret <= 0) {
> -				if (ret == -EAGAIN)
> -					continue;
> +			if (msk->snd_burst <= 0 ||
> +			    !sk_stream_memory_free(ssk) ||
> +			    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
>  				mptcp_push_release(ssk, &info);
> -				goto out;
> +				goto again;
>  			}
> -
> -			do_check_data_fin = true;
> -			info.sent += ret;
> -			len -= ret;
> -
> -			mptcp_update_post_push(msk, dfrag, ret);
> +			mptcp_set_timeout(sk);
>  		}
> -		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
> -	}
>
> -	/* at this point we held the socket lock for the last subflow we used */
> -	if (ssk)
>  		mptcp_push_release(ssk, &info);
> +	}
>
>  out:
>  	/* ensure the rtx timer is running */
> @@ -1636,6 +1621,13 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
>  			mptcp_update_post_push(msk, dfrag, ret);
>  		}
>  		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
> +
> +		if (msk->snd_burst <= 0 ||
> +		    !sk_stream_memory_free(ssk) ||
> +		    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
> +			goto out;
> +		}
> +		mptcp_set_timeout(sk);
>  	}
>
>  out:
> --
> 2.35.3
>
>

--
Mat Martineau
Intel

^ permalink raw reply [flat|nested] 19+ messages in thread
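[Editor's illustration, not part of the thread.] The locking concern raised
in the review above can be reproduced with a small user-space model. This is
only a sketch: struct toy_ssk, toy_lock(), toy_release() and toy_send() are
hypothetical stand-ins invented here, and they are not the kernel's
lock_sock(), release_sock() or mptcp_sendmsg_frag(). The first function
retries with the lock still held, the second shows one possible way to avoid
that by releasing before the retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a subflow socket; not kernel code. */
struct toy_ssk {
	int lock_depth;		/* how many times the toy lock is currently held */
	int eagain_left;	/* number of sends that will still fail */
	int max_depth;		/* highest lock_depth ever observed */
};

static void toy_lock(struct toy_ssk *ssk)
{
	ssk->lock_depth++;
	if (ssk->lock_depth > ssk->max_depth)
		ssk->max_depth = ssk->lock_depth;
}

static void toy_release(struct toy_ssk *ssk)
{
	assert(ssk->lock_depth > 0);
	ssk->lock_depth--;
}

/* Returns true on success, false to mimic an -EAGAIN style failure. */
static bool toy_send(struct toy_ssk *ssk)
{
	if (ssk->eagain_left > 0) {
		ssk->eagain_left--;
		return false;
	}
	return true;
}

/* Retry pattern flagged in the review: the jump back to the locking
 * step happens while the lock is still held.
 */
static void push_retry_without_release(struct toy_ssk *ssk)
{
again:
	toy_lock(ssk);
	if (!toy_send(ssk))
		goto again;	/* lock still held here */
	toy_release(ssk);
}

/* One possible fix: drop the lock before retrying. */
static void push_retry_with_release(struct toy_ssk *ssk)
{
again:
	toy_lock(ssk);
	if (!toy_send(ssk)) {
		toy_release(ssk);
		goto again;
	}
	toy_release(ssk);
}
```

With two failing sends, the first variant ends with the toy lock still held
twice, while the second never holds it more than once. A real socket lock
would block or leak in the first case instead of counting.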
* [PATCH mptcp-next v5 04/11] mptcp: drop last_snd for burst scheduler
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
` (2 preceding siblings ...)
2022-10-06 12:17 ` [PATCH mptcp-next v5 03/11] mptcp: refactor push_pending logic Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 05/11] mptcp: simplify push_pending Geliang Tang
` (8 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

msk->last_snd is no longer used by the burst scheduler; it is now used
only by the round-robin BPF MPTCP scheduler. This patch removes the
last_snd related code for the burst scheduler.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/pm.c         |  9 +--------
 net/mptcp/pm_netlink.c |  3 ---
 net/mptcp/protocol.c   | 11 +----------
 net/mptcp/protocol.h   |  1 -
 net/mptcp/sched.c      |  2 ++
 5 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 45e2a48397b9..cdeb7280ac76 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -282,15 +282,8 @@ void mptcp_pm_mp_prio_received(struct sock *ssk, u8 bkup)
 
 	pr_debug("subflow->backup=%d, bkup=%d\n", subflow->backup, bkup);
 	msk = mptcp_sk(sk);
-	if (subflow->backup != bkup) {
+	if (subflow->backup != bkup)
 		subflow->backup = bkup;
-		mptcp_data_lock(sk);
-		if (!sock_owned_by_user(sk))
-			msk->last_snd = NULL;
-		else
-			__set_bit(MPTCP_RESET_SCHEDULER, &msk->cb_flags);
-		mptcp_data_unlock(sk);
-	}
 
 	mptcp_event(MPTCP_EVENT_SUB_PRIORITY, msk, ssk, GFP_ATOMIC);
 }
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 9813ed0fde9b..1f2da4aedcb4 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -475,9 +475,6 @@ static void __mptcp_pm_send_ack(struct mptcp_sock *msk, struct mptcp_subflow_con
 
 	slow = lock_sock_fast(ssk);
 	if (prio) {
-		if (subflow->backup != backup)
-			msk->last_snd = NULL;
-
 		subflow->send_mp_prio = 1;
 		subflow->backup = backup;
 		subflow->request_bkup = backup;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index bf77defbc546..8708e1e4ba16 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1469,16 +1469,13 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 
 	burst = min_t(int, MPTCP_SEND_BURST_SIZE, mptcp_wnd_end(msk) - msk->snd_nxt);
 	wmem = READ_ONCE(ssk->sk_wmem_queued);
-	if (!burst) {
-		msk->last_snd = NULL;
+	if (!burst)
 		return ssk;
-	}
 
 	subflow = mptcp_subflow_ctx(ssk);
 	subflow->avg_pacing_rate = div_u64((u64)subflow->avg_pacing_rate * wmem +
 					   READ_ONCE(ssk->sk_pacing_rate) * burst,
 					   burst + wmem);
-	msk->last_snd = ssk;
 	msk->snd_burst = burst;
 	return ssk;
 }
@@ -2343,9 +2340,6 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 		msk->first = NULL;
 
 out:
-	if (ssk == msk->last_snd)
-		msk->last_snd = NULL;
-
 	if (need_push)
 		__mptcp_push_pending(sk, 0);
 }
@@ -2978,7 +2972,6 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 	 * subflow
 	 */
 	mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE);
-	msk->last_snd = NULL;
 	WRITE_ONCE(msk->flags, 0);
 	msk->cb_flags = 0;
 	msk->push_pending = 0;
@@ -3239,8 +3232,6 @@ static void mptcp_release_cb(struct sock *sk)
 			__mptcp_set_connected(sk);
 		if (__test_and_clear_bit(MPTCP_ERROR_REPORT, &msk->cb_flags))
 			__mptcp_error_report(sk);
-		if (__test_and_clear_bit(MPTCP_RESET_SCHEDULER, &msk->cb_flags))
-			msk->last_snd = NULL;
 	}
 
 	__mptcp_update_rmem(sk);
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index e81399debff9..05f4c6fd0cd8 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -124,7 +124,6 @@
 #define MPTCP_RETRANSMIT	4
 #define MPTCP_FLUSH_JOIN_LIST	5
 #define MPTCP_CONNECTED		6
-#define MPTCP_RESET_SCHEDULER	7
 
 static inline bool before64(__u64 seq1, __u64 seq2)
 {
diff --git a/net/mptcp/sched.c b/net/mptcp/sched.c
index d1cad1eefb35..3d805760ae99 100644
--- a/net/mptcp/sched.c
+++ b/net/mptcp/sched.c
@@ -67,6 +67,7 @@ int mptcp_init_sched(struct mptcp_sock *msk,
 	msk->sched = sched;
 	if (msk->sched->init)
 		msk->sched->init(msk);
+	msk->last_snd = NULL;
 
 	pr_debug("sched=%s", msk->sched->name);
 
@@ -84,6 +85,7 @@ void mptcp_release_sched(struct mptcp_sock *msk)
 	msk->sched = NULL;
 	if (sched->release)
 		sched->release(msk);
+	msk->last_snd = NULL;
 
 	bpf_module_put(sched, sched->owner);
 }
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v5 05/11] mptcp: simplify push_pending
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
` (3 preceding siblings ...)
2022-10-06 12:17 ` [PATCH mptcp-next v5 04/11] mptcp: drop last_snd for burst scheduler Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 06/11] mptcp: multi subflows push_pending Geliang Tang
` (7 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

This patch moves the duplicate code from __mptcp_push_pending() and
__mptcp_subflow_push_pending() into a new helper function, named
__subflow_push_pending(), and simplifies __mptcp_push_pending() by
invoking this helper.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 95 +++++++++++++++++++++++++-------------------
 1 file changed, 54 insertions(+), 41 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 8708e1e4ba16..29905214103e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1480,12 +1480,6 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 	return ssk;
 }
 
-static void mptcp_push_release(struct sock *ssk, struct mptcp_sendmsg_info *info)
-{
-	tcp_push(ssk, 0, info->mss_now, tcp_sk(ssk)->nonagle, info->size_goal);
-	release_sock(ssk);
-}
-
 static void mptcp_update_post_push(struct mptcp_sock *msk,
 				   struct mptcp_data_frag *dfrag,
 				   u32 sent)
@@ -1517,61 +1511,80 @@ void mptcp_check_and_set_pending(struct sock *sk)
 		mptcp_sk(sk)->push_pending |= BIT(MPTCP_PUSH_PENDING);
 }
 
+static int __subflow_push_pending(struct sock *sk, struct sock *ssk,
+				  struct mptcp_sendmsg_info *info)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_data_frag *dfrag;
+	int len, copied = 0, err = 0;
+
+	while ((dfrag = mptcp_send_head(sk))) {
+		info->sent = dfrag->already_sent;
+		info->limit = dfrag->data_len;
+		len = dfrag->data_len - dfrag->already_sent;
+		while (len > 0) {
+			int ret = 0;
+
+			ret = mptcp_sendmsg_frag(sk, ssk, dfrag, info);
+			if (ret <= 0) {
+				err = ret;
+				goto out;
+			}
+
+			info->sent += ret;
+			copied += ret;
+			len -= ret;
+
+			mptcp_update_post_push(msk, dfrag, ret);
+		}
+		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
+
+		if (msk->snd_burst <= 0 ||
+		    !sk_stream_memory_free(ssk) ||
+		    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
+			err = -EAGAIN;
+			goto out;
+		}
+		mptcp_set_timeout(sk);
+	}
+
+out:
+	if (copied) {
+		tcp_push(ssk, 0, info->mss_now, tcp_sk(ssk)->nonagle,
+			 info->size_goal);
+		err = copied;
+	}
+
+	return err;
+}
+
 void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct mptcp_sendmsg_info info = {
 				.flags = flags,
 	};
-	bool do_check_data_fin = false;
-	struct mptcp_data_frag *dfrag;
 	struct sock *ssk;
-	int len;
+	int ret = 0;
 
 again:
 	while (mptcp_send_head(sk) && (ssk = mptcp_subflow_get_send(msk))) {
 		lock_sock(ssk);
+		ret = __subflow_push_pending(sk, ssk, &info);
+		release_sock(ssk);
 
-		while ((dfrag = mptcp_send_head(sk))) {
-			info.sent = dfrag->already_sent;
-			info.limit = dfrag->data_len;
-			len = dfrag->data_len - dfrag->already_sent;
-			while (len > 0) {
-				int ret = 0;
-
-				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
-				if (ret <= 0) {
-					if (ret == -EAGAIN)
-						goto again;
-					mptcp_push_release(ssk, &info);
-					goto out;
-				}
-
-				do_check_data_fin = true;
-				info.sent += ret;
-				len -= ret;
-
-				mptcp_update_post_push(msk, dfrag, ret);
-			}
-			WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
-
-			if (msk->snd_burst <= 0 ||
-			    !sk_stream_memory_free(ssk) ||
-			    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
-				mptcp_push_release(ssk, &info);
+		if (ret <= 0) {
+			if (ret == -EAGAIN)
 				goto again;
-			}
-			mptcp_set_timeout(sk);
+			goto out;
 		}
-
-		mptcp_push_release(ssk, &info);
 	}
 
 out:
 	/* ensure the rtx timer is running */
 	if (!mptcp_timer_pending(sk))
 		mptcp_reset_timer(sk);
-	if (do_check_data_fin)
+	if (ret > 0)
 		__mptcp_check_send_data_fin(sk);
 }
 
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
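[Editor's illustration, not part of the thread.] The return convention of
the helper introduced in patch 05 (bytes copied if anything was sent,
otherwise the error, with -EAGAIN signalling an exhausted burst) can be
sketched in user space. Everything below is a simplified model under stated
assumptions: struct toy_frag, toy_sendmsg_frag() and toy_subflow_push() are
hypothetical names, the burst counter is decremented inline rather than in a
separate post-push step, and the memory and subflow-active checks are left
out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending data fragment. */
struct toy_frag {
	int data_len;
	int already_sent;
};

/* Hypothetical stand-in for the per-fragment send: at most mss bytes. */
static int toy_sendmsg_frag(struct toy_frag *frag, int mss)
{
	int left = frag->data_len - frag->already_sent;
	int n = left < mss ? left : mss;

	frag->already_sent += n;
	return n;
}

/* Push pending fragments starting at *head on one subflow.
 * Returns the number of bytes copied if any were sent, otherwise the
 * last error (0 when nothing was pending, -EAGAIN when the burst ran out).
 */
static int toy_subflow_push(struct toy_frag *frags, size_t nfrags,
			    size_t *head, int *burst, int mss)
{
	int copied = 0, err = 0;

	while (*head < nfrags) {
		struct toy_frag *frag = &frags[*head];
		int len = frag->data_len - frag->already_sent;

		while (len > 0) {
			int ret = toy_sendmsg_frag(frag, mss);

			if (ret <= 0) {
				err = ret;
				goto out;
			}
			copied += ret;
			len -= ret;
			*burst -= ret;
		}
		(*head)++;	/* fragment done, move to the next pending one */

		/* burst is only checked between fragments, as in the patch */
		if (*burst <= 0) {
			err = -EAGAIN;
			goto out;
		}
	}

out:
	return copied ? copied : err;
}
```

A caller in the style of __mptcp_push_pending() would treat a positive
return as progress, -EAGAIN as "ask the scheduler again", and any other
non-positive value as a reason to stop.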
* [PATCH mptcp-next v5 06/11] mptcp: multi subflows push_pending
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
` (4 preceding siblings ...)
2022-10-06 12:17 ` [PATCH mptcp-next v5 05/11] mptcp: simplify push_pending Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 07/11] mptcp: use msk instead of mptcp_sk Geliang Tang
` (6 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

This patch adds multiple subflows support to __mptcp_push_pending(). It
uses the mptcp_sched_get_send() wrapper instead of
mptcp_subflow_get_send().

It then checks the subflow scheduled flags to test which subflow or
subflows were picked by the scheduler, and uses them to send data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 29905214103e..fdb879e09a32 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1561,22 +1561,31 @@ static int __subflow_push_pending(struct sock *sk, struct sock *ssk,
 void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
-	struct mptcp_sendmsg_info info = {
-				.flags = flags,
-	};
-	struct sock *ssk;
 	int ret = 0;
 
 again:
-	while (mptcp_send_head(sk) && (ssk = mptcp_subflow_get_send(msk))) {
-		lock_sock(ssk);
-		ret = __subflow_push_pending(sk, ssk, &info);
-		release_sock(ssk);
+	while (mptcp_send_head(sk) && !mptcp_sched_get_send(msk)) {
+		struct mptcp_subflow_context *subflow;
+		struct mptcp_sendmsg_info info = {
+			.flags = flags,
+		};
 
-		if (ret <= 0) {
-			if (ret == -EAGAIN)
-				goto again;
-			goto out;
+		mptcp_for_each_subflow(msk, subflow) {
+			if (READ_ONCE(subflow->scheduled)) {
+				struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+				lock_sock(ssk);
+				ret = __subflow_push_pending(sk, ssk, &info);
+				release_sock(ssk);
+
+				if (ret <= 0) {
+					if (ret == -EAGAIN)
+						goto again;
+					goto out;
+				}
+				msk->last_snd = ssk;
+				mptcp_subflow_set_scheduled(subflow, false);
+			}
 		}
 	}
 
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
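[Editor's illustration, not part of the thread.] The "scheduler sets flags,
caller walks the flagged subflows" pattern described in patch 06 can be
modelled in a few lines of user-space C. This is a sketch under
assumptions: struct toy_msk, its sched_mask field, toy_sched_get_send() and
toy_push_pending() are hypothetical, the scheduler policy is a fixed
bitmask, and locking and error handling are omitted entirely.

```c
#include <stdbool.h>

#define TOY_SUBFLOWS_MAX 4

/* Hypothetical stand-in for a subflow context. */
struct toy_subflow {
	bool scheduled;		/* set by the scheduler, cleared by the sender */
	int bytes_sent;
};

/* Hypothetical stand-in for the MPTCP-level socket. */
struct toy_msk {
	struct toy_subflow subflows[TOY_SUBFLOWS_MAX];
	int pending;		/* bytes still waiting to be pushed */
	unsigned int sched_mask;	/* toy policy: which subflows to pick */
};

/* Toy scheduler wrapper: marks the chosen subflows and returns 0,
 * or -1 when it cannot pick any subflow.
 */
static int toy_sched_get_send(struct toy_msk *msk)
{
	int i;

	if (!msk->sched_mask)
		return -1;
	for (i = 0; i < TOY_SUBFLOWS_MAX; i++)
		if (msk->sched_mask & (1u << i))
			msk->subflows[i].scheduled = true;
	return 0;
}

/* Push pending data on every scheduled subflow, chunk bytes at a time.
 * Returns the total number of bytes pushed.
 */
static int toy_push_pending(struct toy_msk *msk, int chunk)
{
	int total = 0, i;

	if (chunk <= 0)
		return 0;

	while (msk->pending > 0 && !toy_sched_get_send(msk)) {
		for (i = 0; i < TOY_SUBFLOWS_MAX; i++) {
			struct toy_subflow *sf = &msk->subflows[i];
			int n;

			if (!sf->scheduled)
				continue;
			n = msk->pending < chunk ? msk->pending : chunk;
			sf->bytes_sent += n;
			msk->pending -= n;
			total += n;
			sf->scheduled = false;	/* consumed this decision */
		}
	}
	return total;
}
```

Clearing the flag after each send mirrors the
mptcp_subflow_set_scheduled(subflow, false) call in the patch, so every
scheduler decision is consumed exactly once before the scheduler is asked
again.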
* [PATCH mptcp-next v5 07/11] mptcp: use msk instead of mptcp_sk
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
` (5 preceding siblings ...)
2022-10-06 12:17 ` [PATCH mptcp-next v5 06/11] mptcp: multi subflows push_pending Geliang Tang
@ 2022-10-06 12:17 ` Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 08/11] mptcp: refactor subflow_push_pending logic Geliang Tang
` (5 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: Geliang Tang @ 2022-10-06 12:17 UTC (permalink / raw)
To: mptcp; +Cc: Geliang Tang

Use msk instead of mptcp_sk(sk) in the functions where the variable
"msk = mptcp_sk(sk)" has been defined.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index fdb879e09a32..b15b97e8cbf7 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1619,7 +1619,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 			/* check for a different subflow usage only after
 			 * spooling the first chunk of data
 			 */
-			xmit_ssk = first ? ssk : mptcp_subflow_get_send(mptcp_sk(sk));
+			xmit_ssk = first ? ssk : mptcp_subflow_get_send(msk);
 			if (!xmit_ssk)
 				goto out;
 			if (xmit_ssk != ssk) {
@@ -2249,7 +2249,7 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
 	struct mptcp_data_frag *cur, *rtx_head;
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	if (__mptcp_check_fallback(mptcp_sk(sk)))
+	if (__mptcp_check_fallback(msk))
 		return false;
 
 	if (tcp_rtx_and_write_queues_empty(sk))
@@ -2928,7 +2928,7 @@ bool __mptcp_close(struct sock *sk, long timeout)
 
 	sock_hold(sk);
 	pr_debug("msk=%p state=%d", sk, sk->sk_state);
-	if (mptcp_sk(sk)->token)
+	if (msk->token)
 		mptcp_event(MPTCP_EVENT_CLOSED, msk, NULL, GFP_KERNEL);
 
 	if (sk->sk_state == TCP_CLOSE) {
@@ -2987,8 +2987,8 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 	mptcp_stop_timer(sk);
 	sk_stop_timer(sk, &sk->sk_timer);
 
-	if (mptcp_sk(sk)->token)
-		mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL);
+	if (msk->token)
+		mptcp_event(MPTCP_EVENT_CLOSED, msk, NULL, GFP_KERNEL);
 
 	/* msk->subflow is still intact, the following will not free the first
 	 * subflow
--
2.35.3

^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH mptcp-next v5 08/11] mptcp: refactor subflow_push_pending logic
From: Geliang Tang @ 2022-10-06 12:17 UTC
To: mptcp; +Cc: Geliang Tang

This patch refactors __mptcp_subflow_push_pending logic from:

For each dfrag:
	While sends succeed:
		Call the scheduler (selects subflow and msk->snd_burst)
		Send the dfrag data with mptcp_subflow_delegate(), break
		Send the dfrag data with mptcp_sendmsg_frag()
		Update dfrag->already_sent, msk->snd_nxt, msk->snd_burst
	Update msk->first_pending

to:

While first_pending isn't empty:
	Call the scheduler (selects subflow and msk->snd_burst)
	Send the dfrag data with mptcp_subflow_delegate(), break
	Send the dfrag data with mptcp_sendmsg_frag()
	For each pending dfrag:
		While sends succeed:
			Send the dfrag data with mptcp_sendmsg_frag()
			Update already_sent, snd_nxt, snd_burst
		Update msk->first_pending
		Break if required by msk->snd_burst / etc

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 66 +++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 32 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b15b97e8cbf7..e2f47e8ca63e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1609,44 +1609,46 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 	int len, copied = 0;

 	info.flags = 0;
-	while ((dfrag = mptcp_send_head(sk))) {
-		info.sent = dfrag->already_sent;
-		info.limit = dfrag->data_len;
-		len = dfrag->data_len - dfrag->already_sent;
-		while (len > 0) {
-			int ret = 0;
+	while (mptcp_send_head(sk)) {
+		/* check for a different subflow usage only after
+		 * spooling the first chunk of data
+		 */
+		xmit_ssk = first ? ssk : mptcp_subflow_get_send(msk);
+		if (!xmit_ssk)
+			goto out;
+		if (xmit_ssk != ssk) {
+			mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk),
+					       MPTCP_DELEGATE_SEND);
+			goto out;
+		}

-			/* check for a different subflow usage only after
-			 * spooling the first chunk of data
-			 */
-			xmit_ssk = first ? ssk : mptcp_subflow_get_send(msk);
-			if (!xmit_ssk)
-				goto out;
-			if (xmit_ssk != ssk) {
-				mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk),
-						       MPTCP_DELEGATE_SEND);
-				goto out;
-			}
+		while ((dfrag = mptcp_send_head(sk))) {
+			info.sent = dfrag->already_sent;
+			info.limit = dfrag->data_len;
+			len = dfrag->data_len - dfrag->already_sent;
+			while (len > 0) {
+				int ret = 0;

-			ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
-			if (ret <= 0)
-				goto out;
+				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
+				if (ret <= 0)
+					goto out;

-			info.sent += ret;
-			copied += ret;
-			len -= ret;
-			first = false;
+				info.sent += ret;
+				copied += ret;
+				len -= ret;
+				first = false;

-			mptcp_update_post_push(msk, dfrag, ret);
-		}
-		WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
+				mptcp_update_post_push(msk, dfrag, ret);
+			}
+			WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));

-		if (msk->snd_burst <= 0 ||
-		    !sk_stream_memory_free(ssk) ||
-		    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
-			goto out;
+			if (msk->snd_burst <= 0 ||
+			    !sk_stream_memory_free(ssk) ||
+			    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
+				goto out;
+			}
+			mptcp_set_timeout(sk);
 		}
-		mptcp_set_timeout(sk);
 	}

 out:
-- 
2.35.3
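
One way to read the 08/11 commit message is to count how often the scheduler is consulted in each loop shape. The C sketch below is a userspace model of the two pseudo-code outlines only, not of the kernel functions: sched_calls_old() and sched_calls_new() are hypothetical names, every send moves at most mss bytes, the scheduler is assumed to grant a fixed burst budget each time it is called, and the "first" shortcut, delegation and error paths are ignored.

```c
#include <assert.h>

/* Old shape: for each dfrag, the scheduler is consulted before every
 * chunk that is sent.
 */
static int sched_calls_old(const int *dfrag_len, int ndfrags, int mss)
{
	int calls = 0;

	assert(mss > 0);
	for (int i = 0; i < ndfrags; i++) {
		int len = dfrag_len[i];

		while (len > 0) {
			calls++;			/* call the scheduler */
			len -= len < mss ? len : mss;	/* send one chunk */
		}
	}
	return calls;
}

/* New shape: the scheduler is consulted once, then pending dfrags are
 * drained until the burst budget it granted is used up. The budget is
 * checked after a whole dfrag has been sent, as in the outline.
 */
static int sched_calls_new(const int *dfrag_len, int ndfrags, int mss,
			   int burst)
{
	int calls = 0;
	int i = 0;

	assert(mss > 0 && burst > 0);
	while (i < ndfrags) {
		int budget = burst;	/* scheduler picks subflow and burst */

		calls++;
		while (i < ndfrags) {
			int len = dfrag_len[i];

			while (len > 0) {
				int n = len < mss ? len : mss;

				len -= n;
				budget -= n;
			}
			i++;		/* advance first_pending */
			if (budget <= 0)
				break;	/* back to the scheduler */
		}
	}
	return calls;
}
```

For three dfrags of 3000 bytes, an mss of 1000 and a burst of 4000, the old shape consults the scheduler 9 times and the new shape twice in this model.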
* [PATCH mptcp-next v5 09/11] mptcp: simplify subflow_push_pending
From: Geliang Tang @ 2022-10-06 12:17 UTC
To: mptcp; +Cc: Geliang Tang

This patch simplifies __mptcp_subflow_push_pending() by invoking
__subflow_push_pending() helper.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 39 ++++++++-------------------------------
 1 file changed, 8 insertions(+), 31 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e2f47e8ca63e..66436885b749 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1604,10 +1604,10 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 	struct mptcp_sendmsg_info info = {
 		.data_lock_held = true,
 	};
-	struct mptcp_data_frag *dfrag;
 	struct sock *xmit_ssk;
-	int len, copied = 0;
+	int ret = 0;

+again:
 	info.flags = 0;
 	while (mptcp_send_head(sk)) {
 		/* check for a different subflow usage only after
@@ -1622,32 +1622,11 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 			goto out;
 		}

-		while ((dfrag = mptcp_send_head(sk))) {
-			info.sent = dfrag->already_sent;
-			info.limit = dfrag->data_len;
-			len = dfrag->data_len - dfrag->already_sent;
-			while (len > 0) {
-				int ret = 0;
-
-				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
-				if (ret <= 0)
-					goto out;
-
-				info.sent += ret;
-				copied += ret;
-				len -= ret;
-				first = false;
-
-				mptcp_update_post_push(msk, dfrag, ret);
-			}
-			WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
-
-			if (msk->snd_burst <= 0 ||
-			    !sk_stream_memory_free(ssk) ||
-			    !mptcp_subflow_active(mptcp_subflow_ctx(ssk))) {
-				goto out;
-			}
-			mptcp_set_timeout(sk);
+		ret = __subflow_push_pending(sk, ssk, &info);
+		if (ret <= 0) {
+			if (ret == -EAGAIN)
+				goto again;
+			break;
 		}
 	}

@@ -1655,9 +1634,7 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 	/* __mptcp_alloc_tx_skb could have released some wmem and we are
 	 * not going to flush it via release_sock()
 	 */
-	if (copied) {
-		tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
-			 info.size_goal);
+	if (ret > 0) {
 		if (!mptcp_timer_pending(sk))
 			mptcp_reset_timer(sk);

-- 
2.35.3
* [PATCH mptcp-next v5 10/11] mptcp: multi subflows subflow_push_pending
From: Geliang Tang @ 2022-10-06 12:17 UTC
To: mptcp; +Cc: Geliang Tang

This patch adds multiple-subflow support to
__mptcp_subflow_push_pending(). It uses the mptcp_sched_get_send()
wrapper instead of mptcp_subflow_get_send(), then checks each subflow's
scheduled flag to find which subflow or subflows the scheduler picked
and uses them to send data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 50 ++++++++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 66436885b749..ecea2a400e6b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1601,10 +1601,10 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 					 bool first)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_subflow_context *subflow;
 	struct mptcp_sendmsg_info info = {
 		.data_lock_held = true,
 	};
-	struct sock *xmit_ssk;
 	int ret = 0;

 again:
@@ -1613,20 +1613,44 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk,
 		/* check for a different subflow usage only after
 		 * spooling the first chunk of data
 		 */
-		xmit_ssk = first ? ssk : mptcp_subflow_get_send(msk);
-		if (!xmit_ssk)
-			goto out;
-		if (xmit_ssk != ssk) {
-			mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk),
-					       MPTCP_DELEGATE_SEND);
-			goto out;
+		if (first) {
+			ret = __subflow_push_pending(sk, ssk, &info);
+			if (ret <= 0) {
+				if (ret == -EAGAIN)
+					goto again;
+				break;
+			}
+			first = false;
+			msk->last_snd = ssk;
+			continue;
 		}

-		ret = __subflow_push_pending(sk, ssk, &info);
-		if (ret <= 0) {
-			if (ret == -EAGAIN)
-				goto again;
-			break;
+		if (mptcp_sched_get_send(msk))
+			goto out;
+
+		mptcp_for_each_subflow(msk, subflow) {
+			if (READ_ONCE(subflow->scheduled)) {
+				struct sock *xmit_ssk = mptcp_subflow_tcp_sock(subflow);
+
+				if (xmit_ssk != ssk) {
+					if (mptcp_subflow_has_delegated_action(subflow))
+						goto out;
+					mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk),
+							       MPTCP_DELEGATE_SEND);
+					msk->last_snd = xmit_ssk;
+					mptcp_subflow_set_scheduled(subflow, false);
+					continue;
+				}
+
+				ret = __subflow_push_pending(sk, ssk, &info);
+				if (ret <= 0) {
+					if (ret == -EAGAIN)
+						goto again;
+					goto out;
+				}
+				msk->last_snd = ssk;
+				mptcp_subflow_set_scheduled(subflow, false);
+			}
 		}
 	}

-- 
2.35.3
* [PATCH mptcp-next v5 11/11] mptcp: multi subflows retrans support
From: Geliang Tang @ 2022-10-06 12:17 UTC
To: mptcp; +Cc: Geliang Tang

This patch adds multiple-subflow support to __mptcp_retrans(). It uses
the mptcp_sched_get_retrans() wrapper instead of
mptcp_subflow_get_retrans(), then iterates over each subflow of the msk
and checks the scheduled flag to test whether the scheduler picked it.
If so, the subflow is used to retransmit data.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
---
 net/mptcp/protocol.c | 62 ++++++++++++++++++++++++++++----------------
 1 file changed, 39 insertions(+), 23 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index ecea2a400e6b..b422d5de435b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2479,16 +2479,17 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
 static void __mptcp_retrans(struct sock *sk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_subflow_context *subflow;
 	struct mptcp_sendmsg_info info = {};
 	struct mptcp_data_frag *dfrag;
-	size_t copied = 0;
 	struct sock *ssk;
-	int ret;
+	int ret, err;
+	u16 len = 0;

 	mptcp_clean_una_wakeup(sk);

 	/* first check ssk: need to kick "stale" logic */
-	ssk = mptcp_subflow_get_retrans(msk);
+	err = mptcp_sched_get_retrans(msk);
 	dfrag = mptcp_rtx_head(sk);
 	if (!dfrag) {
 		if (mptcp_data_fin_enabled(msk)) {
@@ -2507,31 +2508,46 @@ static void __mptcp_retrans(struct sock *sk)
 		goto reset_timer;
 	}

-	if (!ssk)
+	if (err)
 		goto reset_timer;

-	lock_sock(ssk);
+	mptcp_for_each_subflow(msk, subflow) {
+		if (READ_ONCE(subflow->scheduled)) {
+			u16 copied = 0;
+
+			ssk = mptcp_subflow_tcp_sock(subflow);
+			if (!ssk)
+				goto reset_timer;
+
+			lock_sock(ssk);
+
+			/* limit retransmission to the bytes already sent on some subflows */
+			info.sent = 0;
+			info.limit = READ_ONCE(msk->csum_enabled) ? dfrag->data_len :
+								    dfrag->already_sent;
+			while (info.sent < info.limit) {
+				ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
+				if (ret <= 0)
+					break;
+
+				MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RETRANSSEGS);
+				copied += ret;
+				info.sent += ret;
+			}
+			if (copied) {
+				len = max(copied, len);
+				tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
+					 info.size_goal);
+				WRITE_ONCE(msk->allow_infinite_fallback, false);
+			}

-	/* limit retransmission to the bytes already sent on some subflows */
-	info.sent = 0;
-	info.limit = READ_ONCE(msk->csum_enabled) ? dfrag->data_len : dfrag->already_sent;
-	while (info.sent < info.limit) {
-		ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info);
-		if (ret <= 0)
-			break;
+			release_sock(ssk);

-		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RETRANSSEGS);
-		copied += ret;
-		info.sent += ret;
-	}
-	if (copied) {
-		dfrag->already_sent = max(dfrag->already_sent, info.sent);
-		tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle,
-			 info.size_goal);
-		WRITE_ONCE(msk->allow_infinite_fallback, false);
+			msk->last_snd = ssk;
+			mptcp_subflow_set_scheduled(subflow, false);
+		}
 	}
-
-	release_sock(ssk);
+	dfrag->already_sent = max(dfrag->already_sent, len);

 reset_timer:
 	mptcp_check_and_set_pending(sk);
-- 
2.35.3
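
The accounting in the 11/11 hunk (each scheduled subflow retransmits from offset 0 up to a shared limit, the largest per-subflow byte count is kept in len, and already_sent is raised to that value once after the loop) can be sketched in userspace C as follows. This is a model under stated assumptions, not the kernel implementation: struct model_rtx_subflow, its can_send field and model_retrans() are hypothetical, can_send stands in for wherever mptcp_sendmsg_frag() would stop accepting data, and plain int is used where the patch uses u16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a subflow during retransmission. */
struct model_rtx_subflow {
	bool scheduled;	/* picked by the scheduler for this round */
	int can_send;	/* bytes the subflow accepts before sends stop */
};

/* Returns the new value of already_sent for the dfrag. */
static int model_retrans(struct model_rtx_subflow *sf, size_t n,
			 int already_sent, int data_len, bool csum_enabled)
{
	/* Same limit rule as the patch: the whole dfrag when checksums
	 * are enabled, otherwise only the bytes already sent.
	 */
	int limit = csum_enabled ? data_len : already_sent;
	int len = 0;

	assert(already_sent >= 0 && already_sent <= data_len);
	for (size_t i = 0; i < n; i++) {
		int copied;

		if (!sf[i].scheduled)
			continue;
		/* every subflow starts again from offset 0 */
		copied = sf[i].can_send < limit ? sf[i].can_send : limit;
		if (copied > len)
			len = copied;	/* len = max(copied, len) */
		sf[i].scheduled = false;
	}
	return already_sent > len ? already_sent : len;
}
```

In this model, with already_sent at 1000 of 4000 bytes and checksums disabled, two scheduled subflows that accept 600 and 5000 bytes leave already_sent at 1000; with checksums enabled the same pair raises it to 4000.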
* Re: mptcp: multi subflows retrans support: Tests Results
From: MPTCP CI @ 2022-10-06 17:53 UTC
To: Geliang Tang; +Cc: mptcp

Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal:
  - Unstable: 2 failed test(s): packetdrill_add_addr selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/5053437011296256
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5053437011296256/summary/summary.txt

- KVM Validation: debug:
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6179336918138880
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6179336918138880/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/acfdc20f458a

If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
* Re: [PATCH mptcp-next v5 00/11] refactor push pending
From: Mat Martineau @ 2022-10-07 0:16 UTC
To: Geliang Tang; +Cc: mptcp

On Thu, 6 Oct 2022, Geliang Tang wrote:

> v5:
> - address Mat's comments in v4.

Hi Geliang -

Thanks for the v5. I haven't finished looking over all the patches in
detail yet, but two things I do want to reply to right now:

* Thanks for explaining in patch 4 that last_snd is still useful for
round robin. I had forgotten about that, and it looked like a
"write-only" variable in the kernel code.

* In the meeting today Paolo suggested that a good test for the new
scheduler loop would be to modify simult_flows.sh to use much larger
files, then see if the modified code slowed down any of the simult_flows
tests.

He suggested making the test file 10x larger in simult_flows.sh:

-	size=$((2 * 2048 * 4096))
+	size=$((2 * 2048 * 4096 * 10))

Can you compare the test times between the export branch and this
series, with both of them using the larger file size?

Thanks,

Mat

>
> v4:
> - update __mptcp_subflow_push_pending as Mat suggested.
> - add more patches from "BPF redundant scheduler" series.
>
> v3:
> - add a cleanup patch.
> - remove msk->last_snd in mptcp_subflow_get_send().
> - add the loop that calls the scheduler again in __mptcp_push_pending().
>
> v2:
> - add snd_burst check in dfrags loop as Mat suggested.
>
> Refactor __mptcp_push_pending() and __mptcp_subflow_push_pending() to
> remove duplicate code and support redundant scheduler more easily in
> __mptcp_subflow_push_pending().
>
> Geliang Tang (11):
>  Squash to "mptcp: add get_subflow wrappers"
>  mptcp: 'first' argument for subflow_push_pending
>  mptcp: refactor push_pending logic
>  mptcp: drop last_snd for burst scheduler
>  mptcp: simplify push_pending
>  mptcp: multi subflows push_pending
>  mptcp: use msk instead of mptcp_sk
>  mptcp: refactor subflow_push_pending logic
>  mptcp: simplify subflow_push_pending
>  mptcp: multi subflows subflow_push_pending
>  mptcp: multi subflows retrans support
>
> net/mptcp/pm.c         |   9 +-
> net/mptcp/pm_netlink.c |   3 -
> net/mptcp/protocol.c   | 285 ++++++++++++++++++++++-------------------
> net/mptcp/protocol.h   |   5 +-
> net/mptcp/sched.c      |  61 +++++----
> 5 files changed, 184 insertions(+), 179 deletions(-)
>
> --
> 2.35.3
>
>
>

--
Mat Martineau
Intel
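
For scale, the size expression from simult_flows.sh quoted in this message can be worked out directly. The C sketch below only restates that arithmetic: test_file_size() and dd_block_count() are hypothetical helper names, and the 4096-byte block size is the one the script passes to dd when it creates the large test file.

```c
#include <assert.h>

/* size=$((2 * 2048 * 4096 * factor)), with factor 1 for the stock test */
static long test_file_size(long factor)
{
	assert(factor > 0);
	return 2L * 2048 * 4096 * factor;
}

/* count=$((size / 4096)), the number of 4096-byte blocks written by dd */
static long dd_block_count(long size)
{
	return size / 4096;
}
```

The stock expression gives 16777216 bytes (16 MiB, 4096 blocks); the suggested 10x variant gives 167772160 bytes (160 MiB, 40960 blocks).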
* Re: [PATCH mptcp-next v5 00/11] refactor push pending
From: Geliang Tang @ 2022-10-07 9:00 UTC
To: Mat Martineau; +Cc: mptcp

[-- Attachment #1: Type: text/plain, Size: 3689 bytes --]

On Thu, Oct 06, 2022 at 05:16:55PM -0700, Mat Martineau wrote:
> On Thu, 6 Oct 2022, Geliang Tang wrote:
>
> > v5:
> > - address Mat's comments in v4.
>
> Hi Geliang -
>
> Thanks for the v5. I haven't finished looking over all the patches in detail
> yet, but two things I do want to reply to right now:
>
> * Thanks for explaining in patch 4 that last_snd is still useful for round
> robin. I had forgotten about that, and it looked like a "write-only"
> variable in the kernel code.
>
> * In the meeting today Paolo suggested that a good test for the new
> scheduler loop would be to modify simult_flows.sh to use much larger files,
> then see if the modified code slowed down any of the simult_flows tests.
>
> He suggested making the test file 10x larger in simult_flows.sh:
>
> -	size=$((2 * 2048 * 4096))
> +	size=$((2 * 2048 * 4096 * 10))
>
> Can you compare the test times between the export branch and this series,
> with both of them using the larger file size?

10x larger failed in my tests with timeout. So I changed it to 5x larger.
Here's the patch:

@@ -52,7 +52,7 @@ setup()
 	sout=$(mktemp)
 	cout=$(mktemp)
 	capout=$(mktemp)
-	size=$((2 * 2048 * 4096))
+	size=$((2 * 2048 * 4096 * 5))

 	dd if=/dev/zero of=$small bs=4096 count=20 >/dev/null 2>&1
 	dd if=/dev/zero of=$large bs=4096 count=$((size / 4096)) >/dev/null 2>&1
@@ -210,13 +210,13 @@ do_transfer()
 	fi

 	echo " [ fail ]"
-	echo "client exit code $retc, server $rets" 1>&2
-	echo -e "\nnetns ${ns3} socket stat for $port:" 1>&2
-	ip netns exec ${ns3} ss -nita 1>&2 -o "sport = :$port"
-	echo -e "\nnetns ${ns1} socket stat for $port:" 1>&2
-	ip netns exec ${ns1} ss -nita 1>&2 -o "dport = :$port"
-	ls -l $sin $cout
-	ls -l $cin $sout
+	#echo "client exit code $retc, server $rets" 1>&2
+	#echo -e "\nnetns ${ns3} socket stat for $port:" 1>&2
+	#ip netns exec ${ns3} ss -nita 1>&2 -o "sport = :$port"
+	#echo -e "\nnetns ${ns1} socket stat for $port:" 1>&2
+	#ip netns exec ${ns1} ss -nita 1>&2 -o "dport = :$port"
+	#ls -l $sin $cout
+	#ls -l $cin $sout

 	cat "$capout"
 	return 1

All logs are attached. "5x_10times_export.log" is for the export branch
and "5x_10times_refactor_v5.log" is for this series.

Thanks,
-Geliang

>
>
> Thanks,
>
> Mat
>
> >
> > v4:
> > - update __mptcp_subflow_push_pending as Mat suggested.
> > - add more patches from "BPF redundant scheduler" series.
> >
> > v3:
> > - add a cleanup patch.
> > - remove msk->last_snd in mptcp_subflow_get_send().
> > - add the loop that calls the scheduler again in __mptcp_push_pending().
> >
> > v2:
> > - add snd_burst check in dfrags loop as Mat suggested.
> >
> > Refactor __mptcp_push_pending() and __mptcp_subflow_push_pending() to
> > remove duplicate code and support redundant scheduler more easily in
> > __mptcp_subflow_push_pending().
> > > > Geliang Tang (11): > > Squash to "mptcp: add get_subflow wrappers" > > mptcp: 'first' argument for subflow_push_pending > > mptcp: refactor push_pending logic > > mptcp: drop last_snd for burst scheduler > > mptcp: simplify push_pending > > mptcp: multi subflows push_pending > > mptcp: use msk instead of mptcp_sk > > mptcp: refactor subflow_push_pending logic > > mptcp: simplify subflow_push_pending > > mptcp: multi subflows subflow_push_pending > > mptcp: multi subflows retrans support > > > > net/mptcp/pm.c | 9 +- > > net/mptcp/pm_netlink.c | 3 - > > net/mptcp/protocol.c | 285 ++++++++++++++++++++++------------------- > > net/mptcp/protocol.h | 5 +- > > net/mptcp/sched.c | 61 +++++---- > > 5 files changed, 184 insertions(+), 179 deletions(-) > > > > -- > > 2.35.3 > > > > > > > > -- > Mat Martineau > Intel [-- Attachment #2: 5x_10times_export.log --] [-- Type: text/plain, Size: 13848 bytes --] 1 balanced bwidth transfer slower than expected! runtime 36150 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36135 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36087 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36103 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18173 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18284 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18160 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction 18169 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18349 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse direction18170 max 18227 [ OK ] 2 balanced bwidth transfer slower than expected! 
runtime 36085 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36095 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36104 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36106 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18364 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18272 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18246 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction 18150 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay 18216 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay - reverse direction18199 max 18227 [ OK ] 3 balanced bwidth transfer slower than expected! runtime 36114 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36082 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36125 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36093 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18198 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18431 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18312 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction 18197 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! 
runtime 18284 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18450 ms, expected 18227 ms max 18227 [ fail ] 4 balanced bwidth transfer slower than expected! runtime 36111 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36112 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36126 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36106 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18217 max 18227 [ OK ] unbalanced bwidth - reverse direction 18214 max 18227 [ OK ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18366 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18347 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay 18218 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18459 ms, expected 18227 ms max 18227 [ fail ] 5 balanced bwidth transfer slower than expected! runtime 36128 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36106 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36125 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36108 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18172 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! 
runtime 18288 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18312 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18265 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18344 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse direction18133 max 18227 [ OK ] 6 balanced bwidth transfer slower than expected! runtime 36114 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36085 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36156 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36109 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18345 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18409 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18167 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18317 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay 18178 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay - reverse direction18163 max 18227 [ OK ] 7 balanced bwidth transfer slower than expected! runtime 36150 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36087 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! 
runtime 36108 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36082 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18292 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18442 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18283 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction 18203 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay 18218 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18383 ms, expected 18227 ms max 18227 [ fail ] 8 balanced bwidth transfer slower than expected! runtime 36090 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36089 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36106 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36101 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18271 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18267 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18330 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18323 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! 
runtime 18332 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18357 ms, expected 18227 ms max 18227 [ fail ] 9 balanced bwidth transfer slower than expected! runtime 36134 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36108 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36110 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36077 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18256 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18232 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18265 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction 18211 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18389 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse direction18165 max 18227 [ OK ] 10 balanced bwidth transfer slower than expected! runtime 36113 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36089 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36092 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36103 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! 
runtime 18413 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction 18162 max 18227 [ OK ] unbalanced bwidth with unbalanced delay 18139 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18349 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18256 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18540 ms, expected 18227 ms max 18227 [ fail ] [-- Attachment #3: 5x_10times_refactor_v5.log --] [-- Type: text/plain, Size: 14553 bytes --] 1 balanced bwidth transfer slower than expected! runtime 36095 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36168 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36112 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36120 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18235 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18445 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18310 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18299 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18388 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! 
runtime 18493 ms, expected 18227 ms max 18227 [ fail ] 2 balanced bwidth transfer slower than expected! runtime 36134 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36202 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36174 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36145 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18144 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18329 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18281 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18512 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18297 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18361 ms, expected 18227 ms max 18227 [ fail ] 3 balanced bwidth transfer slower than expected! runtime 36186 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36166 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36157 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36185 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18207 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! 
runtime 18270 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18148 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18247 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18405 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18495 ms, expected 18227 ms max 18227 [ fail ] 4 balanced bwidth transfer slower than expected! runtime 36128 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36145 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36176 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36109 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18355 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18521 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18203 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18282 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18473 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18415 ms, expected 18227 ms max 18227 [ fail ] 5 balanced bwidth transfer slower than expected! runtime 36131 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! 
runtime 36167 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36173 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36143 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18182 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18377 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18184 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18274 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay 18151 max 18227 [ OK ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18450 ms, expected 18227 ms max 18227 [ fail ] 6 balanced bwidth transfer slower than expected! runtime 36129 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36199 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36133 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36158 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18133 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18488 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18334 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18457 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! 
runtime 18379 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18271 ms, expected 18227 ms max 18227 [ fail ] 7 balanced bwidth transfer slower than expected! runtime 36135 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36119 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36131 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36112 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18344 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18378 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18145 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18434 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18612 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18535 ms, expected 18227 ms max 18227 [ fail ] 8 balanced bwidth transfer slower than expected! runtime 36158 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36180 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36108 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! 
runtime 36136 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth 18183 max 18227 [ OK ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18352 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18151 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18423 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18333 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18373 ms, expected 18227 ms max 18227 [ fail ] 9 balanced bwidth transfer slower than expected! runtime 36123 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36152 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36112 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36128 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18267 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18311 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay 18167 max 18227 [ OK ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18294 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18279 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18392 ms, expected 18227 ms max 18227 [ fail ] 10 balanced bwidth transfer slower than expected! 
runtime 36152 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth - reverse direction transfer slower than expected! runtime 36164 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay transfer slower than expected! runtime 36172 ms, expected 36005 ms max 36005 [ fail ] balanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 36170 ms, expected 36005 ms max 36005 [ fail ] unbalanced bwidth transfer slower than expected! runtime 18345 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth - reverse direction transfer slower than expected! runtime 18317 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay transfer slower than expected! runtime 18367 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with unbalanced delay - reverse direction transfer slower than expected! runtime 18407 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay transfer slower than expected! runtime 18444 ms, expected 18227 ms max 18227 [ fail ] unbalanced bwidth with opposed, unbalanced delay - reverse directiontransfer slower than expected! runtime 18489 ms, expected 18227 ms max 18227 [ fail ] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH mptcp-next v5 00/11] refactor push pending
2022-10-07 9:00 ` Geliang Tang
@ 2022-10-10 15:05 ` Geliang Tang
2022-10-11 0:12 ` Mat Martineau
0 siblings, 1 reply; 19+ messages in thread
From: Geliang Tang @ 2022-10-10 15:05 UTC (permalink / raw)
To: Geliang Tang; +Cc: Mat Martineau, mptcp

Geliang Tang <geliang.tang@suse.com> wrote on Fri, Oct 7, 2022 at 16:59:
>
> On Thu, Oct 06, 2022 at 05:16:55PM -0700, Mat Martineau wrote:
> > On Thu, 6 Oct 2022, Geliang Tang wrote:
> >
> > > v5:
> > > - address Mat's comments in v4.
> >
> > Hi Geliang -
> >
> > Thanks for the v5. I haven't finished looking over all the patches in detail
> > yet, but two things I do want to reply to right now:
> >
> > * Thanks for explaining in patch 4 that last_snd is still useful for round
> > robin. I had forgotten about that, and it looked like a "write-only"
> > variable in the kernel code.
> >
> > * In the meeting today Paolo suggested that a good test for the new
> > scheduler loop would be to modify simult_flows.sh to use much larger files,
> > then see if the modified code slowed down any of the simult_flows tests.
> >
> > He suggested making the test file 10x larger in simult_flows.sh:
> >
> > - size=$((2 * 2048 * 4096))
> > + size=$((2 * 2048 * 4096 * 10))
> >
> > Can you compare the test times between the export branch and this series,
> > with both of them using the larger file size?
>
> 10x larger failed in my tests with timeout. So I changed it to 5x
> larger.
>
> Here's the patch:
>
> @@ -52,7 +52,7 @@ setup()
>  	sout=$(mktemp)
>  	cout=$(mktemp)
>  	capout=$(mktemp)
> -	size=$((2 * 2048 * 4096))
> +	size=$((2 * 2048 * 4096 * 5))
>
>  	dd if=/dev/zero of=$small bs=4096 count=20 >/dev/null 2>&1
>  	dd if=/dev/zero of=$large bs=4096 count=$((size / 4096)) >/dev/null 2>&1
> @@ -210,13 +210,13 @@ do_transfer()
>  	fi
>
>  	echo " [ fail ]"
> -	echo "client exit code $retc, server $rets" 1>&2
> -	echo -e "\nnetns ${ns3} socket stat for $port:" 1>&2
> -	ip netns exec ${ns3} ss -nita 1>&2 -o "sport = :$port"
> -	echo -e "\nnetns ${ns1} socket stat for $port:" 1>&2
> -	ip netns exec ${ns1} ss -nita 1>&2 -o "dport = :$port"
> -	ls -l $sin $cout
> -	ls -l $cin $sout
> +	#echo "client exit code $retc, server $rets" 1>&2
> +	#echo -e "\nnetns ${ns3} socket stat for $port:" 1>&2
> +	#ip netns exec ${ns3} ss -nita 1>&2 -o "sport = :$port"
> +	#echo -e "\nnetns ${ns1} socket stat for $port:" 1>&2
> +	#ip netns exec ${ns1} ss -nita 1>&2 -o "dport = :$port"
> +	#ls -l $sin $cout
> +	#ls -l $cin $sout
>
>  	cat "$capout"
>  	return 1
>
> All logs are attached. "5x_10times_export.log" is for the export branch
> and "5x_10times_refactor_v5.log" is for this series.

I compared the data in the two logs. The test time of this series is
indeed longer than that of export, but the difference is very small.

I removed the useless information in the two logs and only retained
the running time and expected time. It looks like this:

> cat 5x_export.log | head
36150 max 36005
36135 max 36005
36087 max 36005
36103 max 36005
18284 max 18227
18349 max 18227

Sum all running times and expected times:

> awk '{ sum += $1 }; END { print sum }' 5x_export.log
2540920
> awk '{ sum += $3 }; END { print sum }' 5x_export.log
2533820

The ratio of the total running time to the total expected time is
1.00280209 (2540920/2533820).

Use the same method to calculate data in 5x_10times_refactor_v5.log:

> cat 5x_refactor_v5.log | head
36095 max 36005
36168 max 36005
36112 max 36005
36120 max 36005
18235 max 18227
18445 max 18227

> awk '{ sum += $1 }; END { print sum }' 5x_refactor_v5.log
2546024
> awk '{ sum += $3 }; END { print sum }' 5x_refactor_v5.log
2533820

The ratio is 1.00481644. It is about 0.002 higher than the export
ratio. I think this difference is acceptable.

Thanks,
-Geliang

>
> Thanks,
> -Geliang
>
> >
> > Thanks,
> >
> > Mat
> >
> > >
> > > v4:
> > > - update __mptcp_subflow_push_pending as Mat suggested.
> > > - add more patches from "BPF redundant scheduler" series.
> > >
> > > v3:
> > > - add a cleanup patch.
> > > - remove msk->last_snd in mptcp_subflow_get_send().
> > > - add the loop that calls the scheduler again in __mptcp_push_pending().
> > >
> > > v2:
> > > - add snd_burst check in dfrags loop as Mat suggested.
> > >
> > > Refactor __mptcp_push_pending() and __mptcp_subflow_push_pending() to
> > > remove duplicate code and support redundant scheduler more easily in
> > > __mptcp_subflow_push_pending().
> > >
> > > Geliang Tang (11):
> > >   Squash to "mptcp: add get_subflow wrappers"
> > >   mptcp: 'first' argument for subflow_push_pending
> > >   mptcp: refactor push_pending logic
> > >   mptcp: drop last_snd for burst scheduler
> > >   mptcp: simplify push_pending
> > >   mptcp: multi subflows push_pending
> > >   mptcp: use msk instead of mptcp_sk
> > >   mptcp: refactor subflow_push_pending logic
> > >   mptcp: simplify subflow_push_pending
> > >   mptcp: multi subflows subflow_push_pending
> > >   mptcp: multi subflows retrans support
> > >
> > >  net/mptcp/pm.c         |   9 +-
> > >  net/mptcp/pm_netlink.c |   3 -
> > >  net/mptcp/protocol.c   | 285 ++++++++++++++++++++++-------------------
> > >  net/mptcp/protocol.h   |   5 +-
> > >  net/mptcp/sched.c      |  61 +++++----
> > >  5 files changed, 184 insertions(+), 179 deletions(-)
> > >
> > > --
> > > 2.35.3
> > >
> >
> > --
> > Mat Martineau
> > Intel
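The reduction from the raw logs to "runtime max expected" columns is described in the mail above but the commands used for it are not shown. The following is a minimal sketch of one way to do the same extraction and ratio directly from a raw log, assuming each result has one of the two shapes visible in the attached logs ("runtime N ms, expected M ms max M [ fail ]" or "N max M [ OK ]"). The names parse_entries and runtime_ratio, and the spacing inside SAMPLE, are illustrative assumptions and not part of the selftests.

```python
import re

# One match per test result. Two shapes are assumed, as seen in the logs:
#   "... runtime 18413 ms, expected 18227 ms max 18227 [ fail ]"
#   "... 18162 max 18227 [ OK ]"
ENTRY = re.compile(
    r"(?:runtime (\d+) ms, expected \d+ ms|(\d+)) max (\d+)\s*\[\s*(?:OK|fail)\s*\]"
)


def parse_entries(log_text):
    """Return a list of (runtime_ms, max_ms) pairs found in log_text."""
    entries = []
    for m in ENTRY.finditer(log_text):
        # group(1) is set for failed tests, group(2) for passing ones
        runtime = int(m.group(1) or m.group(2))
        entries.append((runtime, int(m.group(3))))
    return entries


def runtime_ratio(entries):
    """Total measured runtime divided by total expected maximum."""
    total_runtime = sum(runtime for runtime, _ in entries)
    total_max = sum(limit for _, limit in entries)
    return total_runtime / total_max


# Two results taken from run 10 of the export log (whitespace is a guess).
SAMPLE = (
    "unbalanced bwidth  transfer slower than expected! "
    "runtime 18413 ms, expected 18227 ms max 18227 [ fail ]\n"
    "unbalanced bwidth - reverse direction  18162 max 18227 [ OK ]\n"
)

entries = parse_entries(SAMPLE)
print(len(entries), "entries, ratio", round(runtime_ratio(entries), 8))
```

Feeding the full text of each attached log to parse_entries() and comparing the two runtime_ratio() results should correspond to the awk sums above, provided the log format matches the assumption.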
* Re: [PATCH mptcp-next v5 00/11] refactor push pending
2022-10-10 15:05 ` Geliang Tang
@ 2022-10-11 0:12 ` Mat Martineau
0 siblings, 0 replies; 19+ messages in thread
From: Mat Martineau @ 2022-10-11 0:12 UTC (permalink / raw)
To: Geliang Tang; +Cc: Geliang Tang, mptcp, Paolo Abeni

[-- Attachment #1: Type: text/plain, Size: 4064 bytes --]

On Mon, 10 Oct 2022, Geliang Tang wrote:

> Geliang Tang <geliang.tang@suse.com> wrote on Fri, Oct 7, 2022 at 16:59:
>>
>> On Thu, Oct 06, 2022 at 05:16:55PM -0700, Mat Martineau wrote:
>>> On Thu, 6 Oct 2022, Geliang Tang wrote:
>>>
>>>> v5:
>>>> - address Mat's comments in v4.
>>>
>>> Hi Geliang -
>>>
>>> Thanks for the v5. I haven't finished looking over all the patches in detail
>>> yet, but two things I do want to reply to right now:
>>>
>>> * Thanks for explaining in patch 4 that last_snd is still useful for round
>>> robin. I had forgotten about that, and it looked like a "write-only"
>>> variable in the kernel code.
>>>
>>> * In the meeting today Paolo suggested that a good test for the new
>>> scheduler loop would be to modify simult_flows.sh to use much larger files,
>>> then see if the modified code slowed down any of the simult_flows tests.
>>>
>>> He suggested making the test file 10x larger in simult_flows.sh:
>>>
>>> - size=$((2 * 2048 * 4096))
>>> + size=$((2 * 2048 * 4096 * 10))
>>>
>>> Can you compare the test times between the export branch and this series,
>>> with both of them using the larger file size?
>>
>> 10x larger failed in my tests with timeout. So I changed it to 5x
>> larger.
>>
>> Here's the patch:
>>
>> @@ -52,7 +52,7 @@ setup()
>>  	sout=$(mktemp)
>>  	cout=$(mktemp)
>>  	capout=$(mktemp)
>> -	size=$((2 * 2048 * 4096))
>> +	size=$((2 * 2048 * 4096 * 5))
>>
>>  	dd if=/dev/zero of=$small bs=4096 count=20 >/dev/null 2>&1
>>  	dd if=/dev/zero of=$large bs=4096 count=$((size / 4096)) >/dev/null 2>&1
>> @@ -210,13 +210,13 @@ do_transfer()
>>  	fi
>>
>>  	echo " [ fail ]"
>> -	echo "client exit code $retc, server $rets" 1>&2
>> -	echo -e "\nnetns ${ns3} socket stat for $port:" 1>&2
>> -	ip netns exec ${ns3} ss -nita 1>&2 -o "sport = :$port"
>> -	echo -e "\nnetns ${ns1} socket stat for $port:" 1>&2
>> -	ip netns exec ${ns1} ss -nita 1>&2 -o "dport = :$port"
>> -	ls -l $sin $cout
>> -	ls -l $cin $sout
>> +	#echo "client exit code $retc, server $rets" 1>&2
>> +	#echo -e "\nnetns ${ns3} socket stat for $port:" 1>&2
>> +	#ip netns exec ${ns3} ss -nita 1>&2 -o "sport = :$port"
>> +	#echo -e "\nnetns ${ns1} socket stat for $port:" 1>&2
>> +	#ip netns exec ${ns1} ss -nita 1>&2 -o "dport = :$port"
>> +	#ls -l $sin $cout
>> +	#ls -l $cin $sout
>>
>>  	cat "$capout"
>>  	return 1
>>
>> All logs are attached. "5x_10times_export.log" is for the export branch
>> and "5x_10times_refactor_v5.log" is for this series.
>
> I compared the data in the two logs. The test time of this series is
> indeed longer than that of export, but the difference is very small.
>
> I removed the useless information in the two logs and only retained
> the running time and expected time. It looks like this:
>
>> cat 5x_export.log | head
> 36150 max 36005
> 36135 max 36005
> 36087 max 36005
> 36103 max 36005
> 18284 max 18227
> 18349 max 18227
>
> Sum all running times and expected times:
>
>> awk '{ sum += $1 }; END { print sum }' 5x_export.log
> 2540920
>> awk '{ sum += $3 }; END { print sum }' 5x_export.log
> 2533820
>
> The ratio of the total running time to the total expected time is
> 1.00280209 (2540920/2533820).
>
> Use the same method to calculate data in 5x_10times_refactor_v5.log:
>
>> cat 5x_refactor_v5.log | head
> 36095 max 36005
> 36168 max 36005
> 36112 max 36005
> 36120 max 36005
> 18235 max 18227
> 18445 max 18227
>
>> awk '{ sum += $1 }; END { print sum }' 5x_refactor_v5.log
> 2546024
>> awk '{ sum += $3 }; END { print sum }' 5x_refactor_v5.log
> 2533820
>
> The ratio is 1.00481644. It is about 0.002 higher than the export
> ratio. I think this difference is acceptable.
>

Thanks for taking a look at the data, Geliang. To me it seems like the
0.2% difference is well within the margin of error, so there doesn't
seem to be a significant performance regression in your data.

Paolo, how does this performance look to you?

--
Mat Martineau
Intel
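The "margin of error" remark above can be made a little more concrete by looking at how much a single test case varies from run to run. The sketch below is only an illustration under that assumption: it takes the ten "balanced bwidth" runtimes from 5x_10times_refactor_v5.log (runs 1 to 10) and reports their mean, sample standard deviation and min-max range. The function name spread_summary is hypothetical, and one test case is not a full comparison of the two branches.

```python
from statistics import mean, stdev

# "balanced bwidth" runtimes in ms, runs 1-10 of 5x_10times_refactor_v5.log
RUNTIMES_MS = [36095, 36134, 36186, 36128, 36131,
               36129, 36135, 36158, 36123, 36152]
# Expected maximum reported by the test for this case
EXPECTED_MAX_MS = 36005


def spread_summary(runtimes, expected_max):
    """Summarise run-to-run variation of one test case.

    Returns the mean, the sample standard deviation, the min-max range,
    and that range relative to the expected maximum.
    """
    span = max(runtimes) - min(runtimes)
    return {
        "mean_ms": mean(runtimes),
        "stdev_ms": stdev(runtimes),
        "range_ms": span,
        "range_rel": span / expected_max,
    }


summary = spread_summary(RUNTIMES_MS, EXPECTED_MAX_MS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

For these ten values the min-max range is 91 ms, roughly 0.25% of the 36005 ms limit, which is the same order of magnitude as the 0.2% difference between the two totals.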
* Re: [PATCH mptcp-next v5 00/11] refactor push pending
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
` (11 preceding siblings ...)
2022-10-07 0:16 ` [PATCH mptcp-next v5 00/11] refactor push pending Mat Martineau
@ 2022-10-11 1:01 ` Mat Martineau
12 siblings, 0 replies; 19+ messages in thread
From: Mat Martineau @ 2022-10-11 1:01 UTC (permalink / raw)
To: Geliang Tang; +Cc: mptcp

On Thu, 6 Oct 2022, Geliang Tang wrote:

> v5:
> - address Mat's comments in v4.
>

Hi Geliang -

I do think these changes are getting closer to being good to merge.
There's a remaining bisectability issue that I mention in patch 3, and
the feedback on the squash-to patches to address.

- Mat

> v4:
> - update __mptcp_subflow_push_pending as Mat suggested.
> - add more patches from "BPF redundant scheduler" series.
>
> v3:
> - add a cleanup patch.
> - remove msk->last_snd in mptcp_subflow_get_send().
> - add the loop that calls the scheduler again in __mptcp_push_pending().
>
> v2:
> - add snd_burst check in dfrags loop as Mat suggested.
>
> Refactor __mptcp_push_pending() and __mptcp_subflow_push_pending() to
> remove duplicate code and support redundant scheduler more easily in
> __mptcp_subflow_push_pending().
>
> Geliang Tang (11):
>   Squash to "mptcp: add get_subflow wrappers"
>   mptcp: 'first' argument for subflow_push_pending
>   mptcp: refactor push_pending logic
>   mptcp: drop last_snd for burst scheduler
>   mptcp: simplify push_pending
>   mptcp: multi subflows push_pending
>   mptcp: use msk instead of mptcp_sk
>   mptcp: refactor subflow_push_pending logic
>   mptcp: simplify subflow_push_pending
>   mptcp: multi subflows subflow_push_pending
>   mptcp: multi subflows retrans support
>
>  net/mptcp/pm.c         |   9 +-
>  net/mptcp/pm_netlink.c |   3 -
>  net/mptcp/protocol.c   | 285 ++++++++++++++++++++++-------------------
>  net/mptcp/protocol.h   |   5 +-
>  net/mptcp/sched.c      |  61 +++++----
>  5 files changed, 184 insertions(+), 179 deletions(-)
>
> --
> 2.35.3
>

--
Mat Martineau
Intel
end of thread, other threads:[~2022-10-11 1:01 UTC | newest]

Thread overview: 19+ messages
2022-10-06 12:17 [PATCH mptcp-next v5 00/11] refactor push pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 01/11] Squash to "mptcp: add get_subflow wrappers" Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 02/11] mptcp: 'first' argument for subflow_push_pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 03/11] mptcp: refactor push_pending logic Geliang Tang
2022-10-11 0:38 ` Mat Martineau
2022-10-06 12:17 ` [PATCH mptcp-next v5 04/11] mptcp: drop last_snd for burst scheduler Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 05/11] mptcp: simplify push_pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 06/11] mptcp: multi subflows push_pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 07/11] mptcp: use msk instead of mptcp_sk Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 08/11] mptcp: refactor subflow_push_pending logic Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 09/11] mptcp: simplify subflow_push_pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 10/11] mptcp: multi subflows subflow_push_pending Geliang Tang
2022-10-06 12:17 ` [PATCH mptcp-next v5 11/11] mptcp: multi subflows retrans support Geliang Tang
2022-10-06 17:53 ` mptcp: multi subflows retrans support: Tests Results MPTCP CI
2022-10-07 0:16 ` [PATCH mptcp-next v5 00/11] refactor push pending Mat Martineau
2022-10-07 9:00 ` Geliang Tang
2022-10-10 15:05 ` Geliang Tang
2022-10-11 0:12 ` Mat Martineau
2022-10-11 1:01 ` Mat Martineau