All of lore.kernel.org
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: David Miller <davem@davemloft.net>
Cc: mathew.j.martineau@linux.intel.com, netdev@vger.kernel.org,
	mptcp@lists.01.org, ast@kernel.org, daniel@iogearbox.net,
	bpf@vger.kernel.org
Subject: Re: [MPTCP] Re: [PATCH net-next v7 02/11] sock: Make sk_protocol a 16-bit value
Date: Thu, 9 Jan 2020 22:25:28 +0100	[thread overview]
Message-ID: <20200109212528.GF795@breakpoint.cc> (raw)
In-Reply-To: <20200109.110514.747612850299504416.davem@davemloft.net>

David Miller <davem@davemloft.net> wrote:
> From: Mat Martineau <mathew.j.martineau@linux.intel.com>
> Date: Thu,  9 Jan 2020 07:59:15 -0800
> 
> > Match the 16-bit width of skbuff->protocol. Fills an 8-bit hole so
> > sizeof(struct sock) does not change.
> > 
> > Also take care of BPF field access for sk_type/sk_protocol. Both of them
> > are now outside the bitfield, so we can use load instructions without
> > further shifting/masking.
> > 
> > v5 -> v6:
> >  - update eBPF accessors, too (Intel's kbuild test robot)
> > v2 -> v3:
> >  - keep 'sk_type' 2 bytes aligned (Eric)
> > v1 -> v2:
> >  - preserve sk_pacing_shift as bit field (Eric)
> > 
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: bpf@vger.kernel.org
> > Co-developed-by: Paolo Abeni <pabeni@redhat.com>
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> > Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> > Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> 
> This is worrisome for me.
> 
> We have lots of places that now are going to be assigning  sk->sk_protocol
> into a u8 somewhere else.  A lot of them are ok because limits are enforced
> in various places, but for example:
> 
> net/ipv6/udp.c:	fl6.flowi6_proto = sk->sk_protocol;
> net/l2tp/l2tp_ip6.c:	fl6.flowi6_proto = sk->sk_protocol;
> 
> net/ipv6/inet6_connection_sock.c:	fl6->flowi6_proto = sk->sk_protocol;
> 
> net/ipv6/af_inet6.c:		fl6.flowi6_proto = sk->sk_protocol;
> net/ipv6/datagram.c:	fl6->flowi6_proto = sk->sk_protocol;
> 
> This is one just one small example situation, where flowi6_proto is a u8.

There are parts in the stack (e.g. in setsockopt code paths) that test
sk->sk_protocol vs. IPPROTO_TCP, then call tcp specific code under the sane
assumption that sk is a tcp_sock struct.

With 8bit sk_protocol, mptcp_sock structs (which is what kernel gets via
file descriptor number) would be treated as a tcp socket, because
"IPPROTO_MPTCP & 0xff" yields IPPROTO_TCP.

Changing IPPROTO_MPTCP to a value <= 255 could lead to conflicts with
real inet protocols in the future, so we can't redefine it to a 8bit
value.

If we keep sk_protocol as 8bit field, we will need to make sure that all
places testing sk_protocol == IPPROTO_TCP gain an additional sanity check
to tell tcp and mptcp sockets apart.  Moreover, any further changes to
kernel code would need same extra test, so this is a non-starter to me.

Alternatively we could change the first member of mptcp_sk struct from
inet_connection_sock to a full tcp_sock struct.  Thats roughly 1k increase
of mptcp_sock struct to ~ 3744 bytes, but then we would not have to
worry about mptcp sockets ending up in tcp code paths.

If you think such a size increase is ok I could give that solution a shot
and see what other problems with 8bit sk_protocol might remain.

Mat reported /sys/kernel/debug/tracing/trace lists mptcp sockets as
IPPROTO_TCP in the '8 bit sk_protocol' case, but if thats the only issue
this might have a smaller/acceptable "avoidance fix".

WARNING: multiple messages have this Message-ID (diff)
From: Florian Westphal <fw at strlen.de>
To: mptcp at lists.01.org
Subject: [MPTCP] Re: [PATCH net-next v7 02/11] sock: Make sk_protocol a 16-bit value
Date: Thu, 09 Jan 2020 22:25:28 +0100	[thread overview]
Message-ID: <20200109212528.GF795@breakpoint.cc> (raw)
In-Reply-To: 20200109.110514.747612850299504416.davem@davemloft.net

[-- Attachment #1: Type: text/plain, Size: 3205 bytes --]

David Miller <davem(a)davemloft.net> wrote:
> From: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
> Date: Thu,  9 Jan 2020 07:59:15 -0800
> 
> > Match the 16-bit width of skbuff->protocol. Fills an 8-bit hole so
> > sizeof(struct sock) does not change.
> > 
> > Also take care of BPF field access for sk_type/sk_protocol. Both of them
> > are now outside the bitfield, so we can use load instructions without
> > further shifting/masking.
> > 
> > v5 -> v6:
> >  - update eBPF accessors, too (Intel's kbuild test robot)
> > v2 -> v3:
> >  - keep 'sk_type' 2 bytes aligned (Eric)
> > v1 -> v2:
> >  - preserve sk_pacing_shift as bit field (Eric)
> > 
> > Cc: Alexei Starovoitov <ast(a)kernel.org>
> > Cc: Daniel Borkmann <daniel(a)iogearbox.net>
> > Cc: bpf(a)vger.kernel.org
> > Co-developed-by: Paolo Abeni <pabeni(a)redhat.com>
> > Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
> > Co-developed-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
> > Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
> > Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
> 
> This is worrisome for me.
> 
> We have lots of places that now are going to be assigning  sk->sk_protocol
> into a u8 somewhere else.  A lot of them are ok because limits are enforced
> in various places, but for example:
> 
> net/ipv6/udp.c:	fl6.flowi6_proto = sk->sk_protocol;
> net/l2tp/l2tp_ip6.c:	fl6.flowi6_proto = sk->sk_protocol;
> 
> net/ipv6/inet6_connection_sock.c:	fl6->flowi6_proto = sk->sk_protocol;
> 
> net/ipv6/af_inet6.c:		fl6.flowi6_proto = sk->sk_protocol;
> net/ipv6/datagram.c:	fl6->flowi6_proto = sk->sk_protocol;
> 
> This is one just one small example situation, where flowi6_proto is a u8.

There are parts in the stack (e.g. in setsockopt code paths) that test
sk->sk_protocol vs. IPPROTO_TCP, then call tcp specific code under the sane
assumption that sk is a tcp_sock struct.

With 8bit sk_protocol, mptcp_sock structs (which is what kernel gets via
file descriptor number) would be treated as a tcp socket, because
"IPPROTO_MPTCP & 0xff" yields IPPROTO_TCP.

Changing IPPROTO_MPTCP to a value <= 255 could lead to conflicts with
real inet protocols in the future, so we can't redefine it to a 8bit
value.

If we keep sk_protocol as 8bit field, we will need to make sure that all
places testing sk_protocol == IPPROTO_TCP gain an additional sanity check
to tell tcp and mptcp sockets apart.  Moreover, any further changes to
kernel code would need same extra test, so this is a non-starter to me.

Alternatively we could change the first member of mptcp_sk struct from
inet_connection_sock to a full tcp_sock struct.  Thats roughly 1k increase
of mptcp_sock struct to ~ 3744 bytes, but then we would not have to
worry about mptcp sockets ending up in tcp code paths.

If you think such a size increase is ok I could give that solution a shot
and see what other problems with 8bit sk_protocol might remain.

Mat reported /sys/kernel/debug/tracing/trace lists mptcp sockets as
IPPROTO_TCP in the '8 bit sk_protocol' case, but if thats the only issue
this might have a smaller/acceptable "avoidance fix".

  reply	other threads:[~2020-01-09 21:25 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 00/11] Multipath TCP: Prerequisites Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 ` [PATCH net-next v7 02/11] sock: Make sk_protocol a 16-bit value Mat Martineau
2020-01-09 15:59   ` [MPTCP] " Mat Martineau
2020-01-09 19:05   ` David Miller
2020-01-09 19:05     ` [MPTCP] " David Miller
2020-01-09 21:25     ` Florian Westphal [this message]
2020-01-09 21:25       ` Florian Westphal
2020-01-10  2:43       ` David Miller
2020-01-10  2:43         ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 01/11] net: Make sock protocol value checks more specific Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 03/11] tcp: Define IPPROTO_MPTCP Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 04/11] tcp: Add MPTCP option number Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 05/11] tcp, ulp: Add clone operation to tcp_ulp_ops Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 06/11] mptcp: Add MPTCP to skb extensions Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 07/11] tcp: coalesce/collapse must respect MPTCP extensions Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 08/11] tcp: Export TCP functions and ops struct Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 09/11] tcp: Check for filled TCP option space before SACK Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 10/11] tcp: clean ext on tx recycle Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-09 15:59 [MPTCP] [PATCH net-next v7 11/11] skb: add helpers to allocate ext independently from sk_buff Mat Martineau
2020-01-09 15:59 ` Mat Martineau
2020-01-10  2:44 [MPTCP] Re: [PATCH net-next v7 00/11] Multipath TCP: Prerequisites David Miller
2020-01-10  2:44 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200109212528.GF795@breakpoint.cc \
    --to=fw@strlen.de \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=mathew.j.martineau@linux.intel.com \
    --cc=mptcp@lists.01.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.