From: Florian Westphal <fw@strlen.de>
To: David Miller <davem@davemloft.net>
Cc: mathew.j.martineau@linux.intel.com, netdev@vger.kernel.org,
mptcp@lists.01.org, ast@kernel.org, daniel@iogearbox.net,
bpf@vger.kernel.org
Subject: Re: [MPTCP] Re: [PATCH net-next v7 02/11] sock: Make sk_protocol a 16-bit value
Date: Thu, 9 Jan 2020 22:25:28 +0100 [thread overview]
Message-ID: <20200109212528.GF795@breakpoint.cc> (raw)
In-Reply-To: <20200109.110514.747612850299504416.davem@davemloft.net>
David Miller <davem@davemloft.net> wrote:
> From: Mat Martineau <mathew.j.martineau@linux.intel.com>
> Date: Thu, 9 Jan 2020 07:59:15 -0800
>
> > Match the 16-bit width of skbuff->protocol. Fills an 8-bit hole so
> > sizeof(struct sock) does not change.
> >
> > Also take care of BPF field access for sk_type/sk_protocol. Both of them
> > are now outside the bitfield, so we can use load instructions without
> > further shifting/masking.
> >
> > v5 -> v6:
> > - update eBPF accessors, too (Intel's kbuild test robot)
> > v2 -> v3:
> > - keep 'sk_type' 2 bytes aligned (Eric)
> > v1 -> v2:
> > - preserve sk_pacing_shift as bit field (Eric)
> >
> > Cc: Alexei Starovoitov <ast@kernel.org>
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: bpf@vger.kernel.org
> > Co-developed-by: Paolo Abeni <pabeni@redhat.com>
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> > Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> > Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>
> This is worrisome for me.
>
> We have lots of places that now are going to be assigning sk->sk_protocol
> into a u8 somewhere else. A lot of them are ok because limits are enforced
> in various places, but for example:
>
> net/ipv6/udp.c: fl6.flowi6_proto = sk->sk_protocol;
> net/l2tp/l2tp_ip6.c: fl6.flowi6_proto = sk->sk_protocol;
>
> net/ipv6/inet6_connection_sock.c: fl6->flowi6_proto = sk->sk_protocol;
>
> net/ipv6/af_inet6.c: fl6.flowi6_proto = sk->sk_protocol;
> net/ipv6/datagram.c: fl6->flowi6_proto = sk->sk_protocol;
>
> This is one just one small example situation, where flowi6_proto is a u8.
There are parts in the stack (e.g. in setsockopt code paths) that test
sk->sk_protocol vs. IPPROTO_TCP, then call tcp specific code under the sane
assumption that sk is a tcp_sock struct.
With 8bit sk_protocol, mptcp_sock structs (which is what kernel gets via
file descriptor number) would be treated as a tcp socket, because
"IPPROTO_MPTCP & 0xff" yields IPPROTO_TCP.
Changing IPPROTO_MPTCP to a value <= 255 could lead to conflicts with
real inet protocols in the future, so we can't redefine it to a 8bit
value.
If we keep sk_protocol as 8bit field, we will need to make sure that all
places testing sk_protocol == IPPROTO_TCP gain an additional sanity check
to tell tcp and mptcp sockets apart. Moreover, any further changes to
kernel code would need same extra test, so this is a non-starter to me.
Alternatively we could change the first member of mptcp_sk struct from
inet_connection_sock to a full tcp_sock struct. Thats roughly 1k increase
of mptcp_sock struct to ~ 3744 bytes, but then we would not have to
worry about mptcp sockets ending up in tcp code paths.
If you think such a size increase is ok I could give that solution a shot
and see what other problems with 8bit sk_protocol might remain.
Mat reported /sys/kernel/debug/tracing/trace lists mptcp sockets as
IPPROTO_TCP in the '8 bit sk_protocol' case, but if thats the only issue
this might have a smaller/acceptable "avoidance fix".
next prev parent reply other threads:[~2020-01-09 21:25 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20200109155924.30122-1-mathew.j.martineau@linux.intel.com>
2020-01-09 15:59 ` [PATCH net-next v7 02/11] sock: Make sk_protocol a 16-bit value Mat Martineau
2020-01-09 19:05 ` David Miller
2020-01-09 21:25 ` Florian Westphal [this message]
2020-01-10 2:43 ` [MPTCP] " David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200109212528.GF795@breakpoint.cc \
--to=fw@strlen.de \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=mathew.j.martineau@linux.intel.com \
--cc=mptcp@lists.01.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox