* [MPTCP] Re: [PATCH v4 7/7] mptcp: use mptcp backlog.
@ 2020-11-18 23:11 Florian Westphal
0 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2020-11-18 23:11 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 766 bytes --]
Paolo Abeni <pabeni(a)redhat.com> wrote:
> @@ -2812,6 +2859,17 @@ static void mptcp_release_cb(struct sock *sk)
> {
> unsigned long flags, nflags;
>
> + /* push_pending may touch wmem_reserved, do it before the later
> + * cleanup
> + */
Can you expand here a bit on why the lock break is needed & safe?
We still own the socket, so I think its safe, but I am not sure why its
needed. Only due to GFP_KERNEL alloc in mptcp_push_pending, or is there
more to it?
> + set_bit(MPTCP_NOSPACE, &msk->flags);
> + smp_mb__after_atomic(); /* msk->flags is changed by write_space cb */
I'm not sure why the barrier is needed. What load/store needs the ordering?
Maybe I'm blind -- nothing catches my eye either here or in
subflow_write_space.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [MPTCP] Re: [PATCH v4 7/7] mptcp: use mptcp backlog.
@ 2020-11-19 8:36 Paolo Abeni
0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2020-11-19 8:36 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 1951 bytes --]
Hello,
On Thu, 2020-11-19 at 00:11 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni(a)redhat.com> wrote:
> > @@ -2812,6 +2859,17 @@ static void mptcp_release_cb(struct sock *sk)
> > {
> > unsigned long flags, nflags;
> >
> > + /* push_pending may touch wmem_reserved, do it before the later
> > + * cleanup
> > + */
>
> Can you expand here a bit on why the lock break is needed & safe?
>
> We still own the socket, so I think its safe, but I am not sure why its
> needed. Only due to GFP_KERNEL alloc in mptcp_push_pending, or is there
> more to it?
mptcp_push_pending() acquires the subflow socket lock:
1) can't be invoked in atomic scope
2) abba deadlock with msk socket spinlock, since the rx datapath
acquires the msk socket spinlock while helding the subflow socket lock.
>
> > + set_bit(MPTCP_NOSPACE, &msk->flags);
> > + smp_mb__after_atomic(); /* msk->flags is changed by write_space cb */
>
> I'm not sure why the barrier is needed. What load/store needs the ordering?
Maybe cargo-cult c&p on my side. Now MPTCP write poll code mirrors very
closely TCP's one, with s/MPTCP_NOSPACE/SOCK_NOSPACE/, and retains the
same barriers.
My understanding is that is needed in the following scenario:
BH - CPU0 User-space CPU1
<rx path> <poll>
<free space> <test write space>
<mb> set_bit(...)
if test_bit(...) <mb>
sk_stream_write_space(sk) <test write space>
without the memory barrier pairs the 2nd <test write space> could
actually miss the update done by <frees space> and test_bit can miss
the set_bit() perfomed on the other side due to reordering.
Looks like we currently lack the mb() on the rx side ;)
Additionally it would be better move the sk_stream_write_space() from
subflow_write_space af the end of mptcp_clean_una, where wspace is
actually freed.
> Maybe I'm blind -- nothing catches my eye either here or in
> subflow_write_space.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [MPTCP] Re: [PATCH v4 7/7] mptcp: use mptcp backlog.
@ 2020-11-19 9:41 Florian Westphal
0 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2020-11-19 9:41 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 1968 bytes --]
Paolo Abeni <pabeni(a)redhat.com> wrote:
> > We still own the socket, so I think its safe, but I am not sure why its
> > needed. Only due to GFP_KERNEL alloc in mptcp_push_pending, or is there
> > more to it?
>
> mptcp_push_pending() acquires the subflow socket lock:
>
> 1) can't be invoked in atomic scope
> 2) abba deadlock with msk socket spinlock, since the rx datapath
> acquires the msk socket spinlock while helding the subflow socket lock.
Can you add that to the comment? Thanks!
> > > + set_bit(MPTCP_NOSPACE, &msk->flags);
> > > + smp_mb__after_atomic(); /* msk->flags is changed by write_space cb */
> >
> > I'm not sure why the barrier is needed. What load/store needs the ordering?
>
> Maybe cargo-cult c&p on my side. Now MPTCP write poll code mirrors very
> closely TCP's one, with s/MPTCP_NOSPACE/SOCK_NOSPACE/, and retains the
> same barriers.
>
> My understanding is that is needed in the following scenario:
>
> BH - CPU0 User-space CPU1
> <rx path> <poll>
>
> <free space> <test write space>
> <mb> set_bit(...)
> if test_bit(...) <mb>
> sk_stream_write_space(sk) <test write space>
You mean:
set_bit(MPTCP_NOSPACE, &msk->flags);
smp_mb__after_atomic();
/* write_space might have changed right now,
* so we need to re-test.
*/
if (sk_stream_is_writeable(sk))
return EPOLLOUT | EPOLLWRNORM;
> without the memory barrier pairs the 2nd <test write space> could
> actually miss the update done by <frees space> and test_bit can miss
> the set_bit() perfomed on the other side due to reordering.
>
> Looks like we currently lack the mb() on the rx side ;)
Yes...
> Additionally it would be better move the sk_stream_write_space() from
> subflow_write_space af the end of mptcp_clean_una, where wspace is
> actually freed.
Right, there is no guarantee that we can push more data just because
a subflow has new space.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [MPTCP] Re: [PATCH v4 7/7] mptcp: use mptcp backlog.
@ 2020-11-19 11:09 Paolo Abeni
0 siblings, 0 replies; 4+ messages in thread
From: Paolo Abeni @ 2020-11-19 11:09 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 2600 bytes --]
On Thu, 2020-11-19 at 10:41 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni(a)redhat.com> wrote:
> > > We still own the socket, so I think its safe, but I am not sure why its
> > > needed. Only due to GFP_KERNEL alloc in mptcp_push_pending, or is there
> > > more to it?
> >
> > mptcp_push_pending() acquires the subflow socket lock:
> >
> > 1) can't be invoked in atomic scope
> > 2) abba deadlock with msk socket spinlock, since the rx datapath
> > acquires the msk socket spinlock while helding the subflow socket lock.
>
> Can you add that to the comment? Thanks!
Ok, will do.
> > > > + set_bit(MPTCP_NOSPACE, &msk->flags);
> > > > + smp_mb__after_atomic(); /* msk->flags is changed by write_space cb */
> > >
> > > I'm not sure why the barrier is needed. What load/store needs the ordering?
> >
> > Maybe cargo-cult c&p on my side. Now MPTCP write poll code mirrors very
> > closely TCP's one, with s/MPTCP_NOSPACE/SOCK_NOSPACE/, and retains the
> > same barriers.
> >
> > My understanding is that is needed in the following scenario:
> >
> > BH - CPU0 User-space CPU1
> > <rx path> <poll>
> >
> > <free space> <test write space>
> > <mb> set_bit(...)
> > if test_bit(...) <mb>
> > sk_stream_write_space(sk) <test write space>
>
> You mean:
>
> set_bit(MPTCP_NOSPACE, &msk->flags);
> smp_mb__after_atomic();
>
> /* write_space might have changed right now,
> * so we need to re-test.
> */
> if (sk_stream_is_writeable(sk))
> return EPOLLOUT | EPOLLWRNORM;
Yep, I tried to be agnostic, as the same pattern works for TCP, too.
> > without the memory barrier pairs the 2nd <test write space> could
> > actually miss the update done by <frees space> and test_bit can miss
> > the set_bit() perfomed on the other side due to reordering.
> >
> > Looks like we currently lack the mb() on the rx side ;)
>
> Yes...
I'll add it.
> > Additionally it would be better move the sk_stream_write_space() from
> > subflow_write_space af the end of mptcp_clean_una, where wspace is
> > actually freed.
>
> Right, there is no guarantee that we can push more data just because
> a subflow has new space.
Yep, but the current code (subflow_write_space()) checks both ssk and
msk free_space, so it should be correct.
A potential problem is that __mptcp_clean_una() can happen _after_
subflow_write_space() is invoked - when delayed to release_cb(), so we
would need a later packet to actually wake up the sender.
I'll cook v5 with all the above.
/P
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-11-19 11:09 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-11-19 9:41 [MPTCP] Re: [PATCH v4 7/7] mptcp: use mptcp backlog Florian Westphal
-- strict thread matches above, loose matches on Subject: below --
2020-11-19 11:09 Paolo Abeni
2020-11-19 8:36 Paolo Abeni
2020-11-18 23:11 Florian Westphal
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.