All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-08 10:57 [PATCH mptcp-next 0/3] support for setsockopt(TCP_INQ) Florian Westphal
@ 2021-11-08 10:57 ` Florian Westphal
  2021-11-08 17:08   ` Matthieu Baerts
  2021-11-10  8:48   ` Paolo Abeni
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Westphal @ 2021-11-08 10:57 UTC (permalink / raw)
  To: mptcp; +Cc: Florian Westphal

Allows to query in-sequence data ready for read(), total bytes in
write queue and total bytes in write queue that have not yet been sent.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/mptcp/protocol.c | 52 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b4263bf821ac..29b6f57b917e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -22,6 +22,7 @@
 #endif
 #include <net/mptcp.h>
 #include <net/xfrm.h>
+#include <asm/ioctls.h>
 #include "protocol.h"
 #include "mib.h"
 
@@ -3260,6 +3261,56 @@ static int mptcp_forward_alloc_get(const struct sock *sk)
 	return sk->sk_forward_alloc + mptcp_sk(sk)->rmem_fwd_alloc;
 }
 
+static int mptcp_ioctl_outq(const struct mptcp_sock *msk, u64 v)
+{
+	const struct sock *sk = (void *)msk;
+	u64 delta;
+
+	if (sk->sk_state == TCP_LISTEN)
+		return -EINVAL;
+
+	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
+		return 0;
+
+	delta = READ_ONCE(msk->write_seq) - v;
+	if (delta > INT_MAX)
+		delta = INT_MAX;
+
+	return (int)delta;
+}
+
+static int mptcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	int answ;
+	bool slow;
+
+	switch (cmd) {
+	case SIOCINQ:
+		if (sk->sk_state == TCP_LISTEN)
+			return -EINVAL;
+
+		slow = lock_sock_fast(sk);
+		answ = mptcp_inq_hint(sk);
+		unlock_sock_fast(sk, slow);
+		break;
+	case SIOCOUTQ:
+		slow = lock_sock_fast(sk);
+		answ = mptcp_ioctl_outq(msk, READ_ONCE(msk->snd_una));
+		unlock_sock_fast(sk, slow);
+		break;
+	case SIOCOUTQNSD:
+		slow = lock_sock_fast(sk);
+		answ = mptcp_ioctl_outq(msk, READ_ONCE(msk->snd_nxt));
+		unlock_sock_fast(sk, slow);
+		break;
+	default:
+		return -ENOIOCTLCMD;
+	}
+
+	return put_user(answ, (int __user *)arg);
+}
+
 static struct proto mptcp_prot = {
 	.name		= "MPTCP",
 	.owner		= THIS_MODULE,
@@ -3272,6 +3323,7 @@ static struct proto mptcp_prot = {
 	.shutdown	= mptcp_shutdown,
 	.destroy	= mptcp_destroy,
 	.sendmsg	= mptcp_sendmsg,
+	.ioctl		= mptcp_ioctl,
 	.recvmsg	= mptcp_recvmsg,
 	.release_cb	= mptcp_release_cb,
 	.hash		= mptcp_hash,
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-08 10:57 ` [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls Florian Westphal
@ 2021-11-08 17:08   ` Matthieu Baerts
  2021-11-10  8:48   ` Paolo Abeni
  1 sibling, 0 replies; 8+ messages in thread
From: Matthieu Baerts @ 2021-11-08 17:08 UTC (permalink / raw)
  To: Florian Westphal; +Cc: mptcp

Hi Florian,

On 08/11/2021 11:57, Florian Westphal wrote:
> Allows to query in-sequence data ready for read(), total bytes in
> write queue and total bytes in write queue that have not yet been sent.

Thank you for looking at that!

(...)

> +static int mptcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
> +{
> +	struct mptcp_sock *msk = mptcp_sk(sk);
> +	int answ;
> +	bool slow;

Very very small detail: we should swap the two lines.
I can do that when applying the patches if no v2 is needed.

Cheers,
Matt

-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-08 10:57 ` [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls Florian Westphal
  2021-11-08 17:08   ` Matthieu Baerts
@ 2021-11-10  8:48   ` Paolo Abeni
  2021-11-10  8:53     ` Florian Westphal
  1 sibling, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2021-11-10  8:48 UTC (permalink / raw)
  To: Florian Westphal, mptcp

Hello,

On Mon, 2021-11-08 at 11:57 +0100, Florian Westphal wrote:
> Allows to query in-sequence data ready for read(), total bytes in
> write queue and total bytes in write queue that have not yet been sent.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/mptcp/protocol.c | 52 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index b4263bf821ac..29b6f57b917e 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -22,6 +22,7 @@
>  #endif
>  #include <net/mptcp.h>
>  #include <net/xfrm.h>
> +#include <asm/ioctls.h>
>  #include "protocol.h"
>  #include "mib.h"
>  
> @@ -3260,6 +3261,56 @@ static int mptcp_forward_alloc_get(const struct sock *sk)
>  	return sk->sk_forward_alloc + mptcp_sk(sk)->rmem_fwd_alloc;
>  }
>  
> +static int mptcp_ioctl_outq(const struct mptcp_sock *msk, u64 v)
> +{
> +	const struct sock *sk = (void *)msk;
> +	u64 delta;
> +
> +	if (sk->sk_state == TCP_LISTEN)
> +		return -EINVAL;
> +
> +	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
> +		return 0;
> +
> +	delta = READ_ONCE(msk->write_seq) - v;

This is under the msk socket lock and write_seq is protected by the
full/plain msk socket lock so READ_ONCE should not be necessary, I
think. The same for 'snd_nxt' below. 

'snd_una' instead is updated under the msk data lock, so the read once
there should be required.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-10  8:48   ` Paolo Abeni
@ 2021-11-10  8:53     ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2021-11-10  8:53 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Florian Westphal, mptcp

Paolo Abeni <pabeni@redhat.com> wrote:
> > +	if (sk->sk_state == TCP_LISTEN)
> > +		return -EINVAL;
> > +
> > +	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
> > +		return 0;
> > +
> > +	delta = READ_ONCE(msk->write_seq) - v;
> 
> This is under the msk socket lock and write_seq is protected by the
> full/plain msk socket lock so READ_ONCE should not be necessary, I
> think. The same for 'snd_nxt' below. 

Then why is write_seq updated with WRITE_ONCE in lots of places?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
       [not found] <7bfe33f96550ffd6efaa3266f6027a8d7c500b70.camel@redhat.com>
@ 2021-11-10 11:02 ` Paolo Abeni
  2021-11-10 11:37   ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2021-11-10 11:02 UTC (permalink / raw)
  To: mptcp

Whoops, I omitted the ML in my reply, sorry...

@Florian: sorry for the dups! please reply to this one, if needed, to
re-include properly the ML into the thread.

On Wed, 2021-11-10 at 09:53 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> > > +	if (sk->sk_state == TCP_LISTEN)
> > > +		return -EINVAL;
> > > +
> > > +	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
> > > +		return 0;
> > > +
> > > +	delta = READ_ONCE(msk->write_seq) - v;
> > 
> > This is under the msk socket lock and write_seq is protected by the
> > full/plain msk socket lock so READ_ONCE should not be necessary, I
> > think. The same for 'snd_nxt' below. 
> 
> Then why is write_seq updated with WRITE_ONCE in lots of places?

'write_seq' is read outside the msk socket lock scope in a few places. 

My understanding is that to avoid reces and ubsan splat we need to mark
each write with WRITE_ONCE, and read outside the socket lock with
READ_ONCE.

If all the write happens under the socket lock, read under the socket
lock don't need the _ONCE annotation. My reference is:

https://marc.info/?l=linux-netdev&m=161198643714502&w=2

Feel free to correct me if I'm wrong! 

BTW this always reports '1' on fallback sockets after shutdown, as
after shutdown write_seq is incremented by 1, but snd_una/snd_nxt are
not incremented when receiving the TCP fin. I guess the fix not
strictly related to this patch.

Cheers,

Paolo



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-10 11:02 ` [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls Paolo Abeni
@ 2021-11-10 11:37   ` Florian Westphal
  2021-11-10 11:45     ` Paolo Abeni
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2021-11-10 11:37 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

Paolo Abeni <pabeni@redhat.com> wrote:
> Whoops, I omitted the ML in my reply, sorry...
> 
> @Florian: sorry for the dups! please reply to this one, if needed, to
> re-include properly the ML into the thread.
> 
> On Wed, 2021-11-10 at 09:53 +0100, Florian Westphal wrote:
> > Paolo Abeni <pabeni@redhat.com> wrote:
> > > > +	if (sk->sk_state == TCP_LISTEN)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
> > > > +		return 0;
> > > > +
> > > > +	delta = READ_ONCE(msk->write_seq) - v;
> > > 
> > > This is under the msk socket lock and write_seq is protected by the
> > > full/plain msk socket lock so READ_ONCE should not be necessary, I
> > > think. The same for 'snd_nxt' below. 
> > 
> > Then why is write_seq updated with WRITE_ONCE in lots of places?
> 
> 'write_seq' is read outside the msk socket lock scope in a few places. 

Bah.  You are right. TBH I do not understand READ|WRITE_ONCE usage in
mptcp anymore.  Maybe its correct, who knows.

The different locking schemes are not easy to follow for me.

> BTW this always reports '1' on fallback sockets after shutdown, as
> after shutdown write_seq is incremented by 1, but snd_una/snd_nxt are
> not incremented when receiving the TCP fin. I guess the fix not
> strictly related to this patch.

That should be ok, we should return nonzero inq hint when next read()
would indicate EOF.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-10 11:37   ` Florian Westphal
@ 2021-11-10 11:45     ` Paolo Abeni
  2021-11-10 11:49       ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2021-11-10 11:45 UTC (permalink / raw)
  To: Florian Westphal; +Cc: mptcp

On Wed, 2021-11-10 at 12:37 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> > Whoops, I omitted the ML in my reply, sorry...
> > 
> > @Florian: sorry for the dups! please reply to this one, if needed, to
> > re-include properly the ML into the thread.
> > 
> > On Wed, 2021-11-10 at 09:53 +0100, Florian Westphal wrote:
> > > Paolo Abeni <pabeni@redhat.com> wrote:
> > > > > +	if (sk->sk_state == TCP_LISTEN)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV))
> > > > > +		return 0;
> > > > > +
> > > > > +	delta = READ_ONCE(msk->write_seq) - v;
> > > > 
> > > > This is under the msk socket lock and write_seq is protected by the
> > > > full/plain msk socket lock so READ_ONCE should not be necessary, I
> > > > think. The same for 'snd_nxt' below. 
> > > 
> > > Then why is write_seq updated with WRITE_ONCE in lots of places?
> > 
> > 'write_seq' is read outside the msk socket lock scope in a few places. 
> 
> Bah.  You are right. TBH I do not understand READ|WRITE_ONCE usage in
> mptcp anymore.  Maybe its correct, who knows.
> 
> The different locking schemes are not easy to follow for me.

Indeed I agreed!

I *think* (mostly guess) some _ONCE() annotation are now obsoleted/made
unnecessary by the several re-factors.

For sure we need to document clarely the locking schema (where? how?
would comments on the protectect fields suffice?) and possibly/likely
review the current annotations.

> > BTW this always reports '1' on fallback sockets after shutdown, as
> > after shutdown write_seq is incremented by 1, but snd_una/snd_nxt are
> > not incremented when receiving the TCP fin. I guess the fix not
> > strictly related to this patch.
> 
> That should be ok, we should return nonzero inq hint when next read()
> would indicate EOF.

This is for the outq ?!? I think it should be 0 after all outstanding
data is acked ?!?

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls
  2021-11-10 11:45     ` Paolo Abeni
@ 2021-11-10 11:49       ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2021-11-10 11:49 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Florian Westphal, mptcp

Paolo Abeni <pabeni@redhat.com> wrote:
> > Bah.  You are right. TBH I do not understand READ|WRITE_ONCE usage in
> > mptcp anymore.  Maybe its correct, who knows.
> > 
> > The different locking schemes are not easy to follow for me.
> 
> Indeed I agreed!
> 
> I *think* (mostly guess) some _ONCE() annotation are now obsoleted/made
> unnecessary by the several re-factors.
> 
> For sure we need to document clarely the locking schema (where? how?
> would comments on the protectect fields suffice?) and possibly/likely
> review the current annotations.

I think adding comments in the mptcp_sock struct would be a great start.

> > > BTW this always reports '1' on fallback sockets after shutdown, as
> > > after shutdown write_seq is incremented by 1, but snd_una/snd_nxt are
> > > not incremented when receiving the TCP fin. I guess the fix not
> > > strictly related to this patch.
> > 
> > That should be ok, we should return nonzero inq hint when next read()
> > would indicate EOF.
> 
> This is for the outq ?!? I think it should be 0 after all outstanding
> data is acked ?!?

Oh, I misread it as inq.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-11-10 11:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <7bfe33f96550ffd6efaa3266f6027a8d7c500b70.camel@redhat.com>
2021-11-10 11:02 ` [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls Paolo Abeni
2021-11-10 11:37   ` Florian Westphal
2021-11-10 11:45     ` Paolo Abeni
2021-11-10 11:49       ` Florian Westphal
2021-11-08 10:57 [PATCH mptcp-next 0/3] support for setsockopt(TCP_INQ) Florian Westphal
2021-11-08 10:57 ` [PATCH mptcp-next 3/3] mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls Florian Westphal
2021-11-08 17:08   ` Matthieu Baerts
2021-11-10  8:48   ` Paolo Abeni
2021-11-10  8:53     ` Florian Westphal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.