* Re: [PATCH net] ethtool: do not print warning for applications using legacy API
From: David Decotigny @ 2017-12-31 3:56 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, netdev, linux-kernel
In-Reply-To: <20171229180252.6981-1-sthemmin@microsoft.com>
Signed-off-by: David Decotigny <decot@googlers.com>
On Fri, Dec 29, 2017 at 10:02 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> From: Stephen Hemminger <stephen@networkplumber.org>
>
> In kernel log this message appears on every boot:
> "warning: `NetworkChangeNo' uses legacy ethtool link settings API,
> link modes are only partially reported"
>
> When ethtool link settings API changed, it started complaining about
> usages of old API. Ironically, the original patch was from google but
> the application using the legacy API is chrome.
>
> Linux ABI is fixed as much as possible. The kernel must not break it
> and should not complain about applications using legacy API's.
> This patch just removes the warning since using legacy API's
> in Linux is perfectly acceptable.
>
> Fixes: 3f1ac7a700d0 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> net/core/ethtool.c | 15 ++-------------
> 1 file changed, 2 insertions(+), 13 deletions(-)
>
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index f8fcf450a36e..8225416911ae 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -770,15 +770,6 @@ static int ethtool_set_link_ksettings(struct net_device *dev,
> return dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
> }
>
> -static void
> -warn_incomplete_ethtool_legacy_settings_conversion(const char *details)
> -{
> - char name[sizeof(current->comm)];
> -
> - pr_info_once("warning: `%s' uses legacy ethtool link settings API, %s\n",
> - get_task_comm(name, current), details);
> -}
> -
> /* Query device for its ethtool_cmd settings.
> *
> * Backward compatibility note: for compatibility with legacy ethtool,
> @@ -805,10 +796,8 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
> &link_ksettings);
> if (err < 0)
> return err;
> - if (!convert_link_ksettings_to_legacy_settings(&cmd,
> - &link_ksettings))
> - warn_incomplete_ethtool_legacy_settings_conversion(
> - "link modes are only partially reported");
> + convert_link_ksettings_to_legacy_settings(&cmd,
> + &link_ksettings);
>
> /* send a sensible cmd tag back to user */
> cmd.cmd = ETHTOOL_GSET;
> --
> 2.11.0
>
* Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint
From: Yafang Shao @ 2017-12-31 3:06 UTC (permalink / raw)
To: Brendan Gregg
Cc: Song Liu, David S. Miller, Marcelo Ricardo Leitner,
Steven Rostedt, Brendan Gregg, netdev, LKML
In-Reply-To: <CAE40pdd6jzcumwNv04fnS5ixAzAcef1X17KiDocry=VqVf_dTg@mail.gmail.com>
On Sun, Dec 31, 2017 at 6:33 AM, Brendan Gregg
<brendan.d.gregg@gmail.com> wrote:
> On Tue, Dec 19, 2017 at 7:12 PM, Yafang Shao <laoar.shao@gmail.com> wrote:
>> As sk_state is a common field for struct sock, so the state
>> transition tracepoint should not be a TCP specific feature.
>> Currently it traces all AF_INET state transition, so I rename this
>> tracepoint to inet_sock_set_state tracepoint with some minor changes and move it
>> into trace/events/sock.h.
>
> The tcp:tcp_set_state probe is tcp_set_state(), so it's only going to
> fire for TCP sessions. It's not broken, and we could add a
> sctp:sctp_set_state as well. Replacing tcp:tcp_set_state with
> inet_sk_set_state is feeling like we might be baking too much
> implementation detail into the tracepoint API.
>
> If we must have inet_sk_set_state, then must we also delete tcp:tcp_set_state?
>
Hi Brendan,
The reason for this change is explained in this mail thread:
https://patchwork.kernel.org/patch/10099243/ .
The original tcp:tcp_set_state probe doesn't trace all TCP state transitions.
Some state transitions in inet_connection_sock.c and
inet_hashtables.c are missed.
So we have to place this probe in these two files to fix the issue.
But inet_connection_sock.c and inet_hashtables.c are common files
for all IPv4 protocols, not only for TCP, so it is not proper to place
a tcp_ function in them.
That's why we decided to rename the tcp:tcp_set_state probe to
sock:inet_sock_set_state.
Thanks
Yafang
* Re: general protection fault in skb_segment
From: Marcelo Ricardo Leitner @ 2017-12-31 2:25 UTC (permalink / raw)
To: Willem de Bruijn
Cc: syzbot, David Miller, LKML, linux-sctp, Network Development,
nhorman, syzkaller-bugs, vyasevich
In-Reply-To: <20171231005220.GD22042@localhost.localdomain>
On Sat, Dec 30, 2017 at 10:52:20PM -0200, Marcelo Ricardo Leitner wrote:
> On Sat, Dec 30, 2017 at 08:42:41AM +0100, Willem de Bruijn wrote:
[...]
> > Somewhat tangential, but any PF_PACKET socket can set this
> > magic gso_size value in its virtio_net_hdr, so if it is assumed to
> > be an SCTP GSO specific option, setting it for a TCP GSO packet
> > may also cause unexpected results.
>
> It seems virtio_net could use more sanity checks. When PACKET_VNET_HDR
> is used, it will end up calling:
> tpacket_rcv() {
> ...
> if (do_vnet) {
> if (virtio_net_hdr_from_skb(skb, h.raw + macoff -
> sizeof(struct virtio_net_hdr),
> vio_le(), true)) {
> spin_lock(&sk->sk_receive_queue.lock);
> goto drop_n_account;
> }
> }
>
> and virtio_net_hdr_from_skb does:
> if (skb_is_gso(skb)) {
> ...
> if (sinfo->gso_type & SKB_GSO_TCPV4)
> hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> else if (sinfo->gso_type & SKB_GSO_TCPV6)
> hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> else
> return -EINVAL;
>
> Meaning that any gso_type other than TCP would be rejected, but this
> SCTP one got through. Seems the header contains a sctp header, but the
> gso_type set was actually pointing to TCP (otherwise it would have
> been rejected). AFAICT if this packet had an ESP header, for example,
> it could have hit esp4_gso_segment. Can you please confirm this?
I added:
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -44,6 +44,18 @@ static struct sk_buff *sctp_gso_segment(struct sk_buff *skb,
{
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct sctphdr *sh;
+ int fail = 0;
+
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP)) {
+ printk("Bogus gso_type: %x\n", skb_shinfo(skb)->gso_type);
+ fail = 1;
+ }
+ if (skb_shinfo(skb)->gso_size != GSO_BY_FRAGS) {
+ printk("Bogus gso_size: %u\n", skb_shinfo(skb)->gso_size);
+ fail = 1;
+ }
+ if (fail)
+ goto out;
sh = sctp_hdr(skb);
if (!pskb_may_pull(skb, sizeof(*sh)))
and with the reproducer, got:
[ 54.255469] Bogus gso_type: 7
[ 54.258801] Bogus gso_size: 63464
[ 54.262532] ------------[ cut here ]------------
[ 54.267703] syz0: caps=(0x00000800000058c1, 0x0000000000000000) len=32 data_len=0 gso_size=63464 gso_type=7 ip_summed=0
[ 54.279777] WARNING: CPU: 1 PID: 13005 at /root/linux/net/core/dev.c:2600 skb_warn_bad_offload+0xd6/0xec
gso_type 7 = SKB_GSO_TCPV4 | SKB_GSO_DODGY | SKB_GSO_TCP_ECN
as the warn indicated too.
Once this gets to sctp_gso_segment, it's too late to avoid the
warning. Would be nice if we could somehow filter this earlier in the
process.
Marcelo
* Re: general protection fault in skb_segment
From: Marcelo Ricardo Leitner @ 2017-12-31 0:52 UTC (permalink / raw)
To: Willem de Bruijn
Cc: syzbot, David Miller, LKML, linux-sctp, Network Development,
nhorman, syzkaller-bugs, vyasevich
In-Reply-To: <CAF=yD-+Ynn+t0dPJXUrXKQ2XMdtiKz8t6PKJUMNqeydYv=yG9A@mail.gmail.com>
On Sat, Dec 30, 2017 at 08:42:41AM +0100, Willem de Bruijn wrote:
> > syzkaller hit the following crash on
> > 37759fa6d0fa9e4d6036d19ac12f555bfc0aeafd
> > git://git.cmpxchg.org/linux-mmots.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached
> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
>
> Reproduced with the C reproducer on v4.15-rc1 and mainline
> going back at least to v4.8, but not v4.7. SCTP GSO was
> introduced in v4.8-rc1, so a patch in this set is likely the starting
> point. Indeed crashes at 90017accff61 ("sctp: Add GSO support"),
> but not at 90017accff61~4.
>
> The reproducer with its sandbox removed shows this invocation in strace -f
>
> # strace -f ./repro2
> [... skipped ...]
> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
> open("/dev/net/tun", O_RDONLY) = 4
> fcntl(4, F_DUPFD, 3) = 5
> socket(PF_PACKET, SOCK_RAW|SOCK_CLOEXEC, 8) = 6
> ioctl(4, TUNSETIFF, 0x20e63000) = 0
> ioctl(3, SIOCSIFFLAGS, {ifr_name="syz0",
> ifr_flags=IFF_UP|IFF_PROMISC|IFF_ALLMULTI}) = 0
> setsockopt(6, SOL_PACKET, 0xf /* PACKET_??? */, [4096], 4) = 0
> ioctl(6, SIOCGIFINDEX, {ifr_name="syz0", ifr_index=24}) = 0
> bind(6, {sa_family=AF_PACKET, proto=0000, if24, pkttype=PACKET_HOST,
> addr(6)={1, aaaaaaaaaa00}, 20) = 0
> dup2(6, 5) = 5
> write(5, "\0\201\1\0\350\367\0\0\3\0E\364\0 \0d\0\0\7\2042\342\0\0\0
> \177\0\0\1\0\t"..., 42
>
> where 0xf in setsockopt is PACKET_VNET_HDR
>
> So this is a packet socket writing something that apparently looks
> like an SCTP packet, is only 42 bytes long, but has GSO set in its
> virtio_net_hdr struct.
>
> It crashes in skb_segment seemingly on a NULL list_skb.
>
> (gdb) list *(skb_segment+0x2a4)
> 0xffffffff8167cc24 is in skb_segment (net/core/skbuff.c:3566).
> 3561 if (hsize < 0)
> 3562 hsize = 0;
> 3563 if (hsize > len || !sg)
> 3564 hsize = len;
> 3565
> 3566 if (!hsize && i >= nfrags && skb_headlen(list_skb) &&
> 3567 (skb_headlen(list_skb) == len || sg)) {
> 3568 BUG_ON(skb_headlen(list_skb) > len);
> 3569
> 3570 i = 0;
>
> Likely there is a hidden assumption about SCTP GSO packets that does
> not hold for such packets generated by PF_PACKET.
>
> SCTP GSO introduced the GSO_BY_FRAGS mss value, so the code
> takes a different path for SCTP packets generated by the SCTP stack.
>
> PF_PACKET does not necessarily set gso_size to GSO_BY_FRAGS, so
> does not take the branch that requires list_skb to be non-zero here:
>
> if (unlikely(mss == GSO_BY_FRAGS)) {
> len = list_skb->len;
> } else {
> len = head_skb->len - offset;
> if (len > mss)
> len = mss;
> }
>
> hsize = skb_headlen(head_skb) - offset;
> if (hsize < 0)
> hsize = 0;
> if (hsize > len || !sg)
> hsize = len;
>
> if (!hsize && i >= nfrags && skb_headlen(list_skb) &&
> (skb_headlen(list_skb) == len || sg)) {
>
> Somewhat tangential, but any PF_PACKET socket can set this
> magic gso_size value in its virtio_net_hdr, so if it is assumed to
> be an SCTP GSO specific option, setting it for a TCP GSO packet
> may also cause unexpected results.
It seems virtio_net could use more sanity checks. When PACKET_VNET_HDR
is used, it will end up calling:
tpacket_rcv() {
...
if (do_vnet) {
if (virtio_net_hdr_from_skb(skb, h.raw + macoff -
sizeof(struct virtio_net_hdr),
vio_le(), true)) {
spin_lock(&sk->sk_receive_queue.lock);
goto drop_n_account;
}
}
and virtio_net_hdr_from_skb does:
if (skb_is_gso(skb)) {
...
if (sinfo->gso_type & SKB_GSO_TCPV4)
hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (sinfo->gso_type & SKB_GSO_TCPV6)
hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
else
return -EINVAL;
Meaning that any gso_type other than TCP would be rejected, but this
SCTP one got through. Seems the header contains a sctp header, but the
gso_type set was actually pointing to TCP (otherwise it would have
been rejected). AFAICT if this packet had an ESP header, for example,
it could have hit esp4_gso_segment. Can you please confirm this?
I don't know of anywhere in the stack validating if the gso_type
matches the header that actually is in there.
The fix you mentioned is a good start, we want that one way or
another, but I'm afraid this bug is bigger than sctp.
Marcelo
* Re: [PATCH V4 4/4] selinux: Add SCTP support
From: Marcelo Ricardo Leitner @ 2017-12-30 23:16 UTC (permalink / raw)
To: Richard Haines
Cc: selinux, netdev, linux-sctp, linux-security-module, paul,
vyasevich, nhorman, sds, eparis, casey
In-Reply-To: <20171230172035.15837-1-richard_c_haines@btinternet.com>
On Sat, Dec 30, 2017 at 05:20:35PM +0000, Richard Haines wrote:
> The SELinux SCTP implementation is explained in:
> Documentation/security/SELinux-sctp.rst
>
> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Thanks Richard.
> ---
> Documentation/security/SELinux-sctp.rst | 157 ++++++++++++++++++
> security/selinux/hooks.c | 280 +++++++++++++++++++++++++++++---
> security/selinux/include/classmap.h | 2 +-
> security/selinux/include/netlabel.h | 21 ++-
> security/selinux/include/objsec.h | 4 +
> security/selinux/netlabel.c | 138 ++++++++++++++--
> 6 files changed, 570 insertions(+), 32 deletions(-)
> create mode 100644 Documentation/security/SELinux-sctp.rst
>
> diff --git a/Documentation/security/SELinux-sctp.rst b/Documentation/security/SELinux-sctp.rst
> new file mode 100644
> index 0000000..2f66bf3
> --- /dev/null
> +++ b/Documentation/security/SELinux-sctp.rst
> @@ -0,0 +1,157 @@
> +SCTP SELinux Support
> +=====================
> +
> +Security Hooks
> +===============
> +
> +``Documentation/security/LSM-sctp.rst`` describes the following SCTP security
> +hooks with the SELinux specifics expanded below::
> +
> + security_sctp_assoc_request()
> + security_sctp_bind_connect()
> + security_sctp_sk_clone()
> + security_inet_conn_established()
> +
> +
> +security_sctp_assoc_request()
> +-----------------------------
> +Passes the ``@ep`` and ``@chunk->skb`` of the association INIT packet to the
> +security module. Returns 0 on success, error on failure.
> +::
> +
> + @ep - pointer to sctp endpoint structure.
> + @skb - pointer to skbuff of association packet.
> +
> +The security module performs the following operations:
> + IF this is the first association on ``@ep->base.sk``, then set the peer
> + sid to that in ``@skb``. This will ensure there is only one peer sid
> + assigned to ``@ep->base.sk`` that may support multiple associations.
> +
> + ELSE validate the ``@ep->base.sk peer_sid`` against the ``@skb peer sid``
> + to determine whether the association should be allowed or denied.
> +
> + Set the sctp ``@ep sid`` to socket's sid (from ``ep->base.sk``) with
> + MLS portion taken from ``@skb peer sid``. This will be used by SCTP
> + TCP style sockets and peeled off connections as they cause a new socket
> + to be generated.
> +
> + If IP security options are configured (CIPSO/CALIPSO), then the ip
> + options are set on the socket.
> +
> +
> +security_sctp_bind_connect()
> +-----------------------------
> +Checks permissions required for ipv4/ipv6 addresses based on the ``@optname``
> +as follows::
> +
> + ------------------------------------------------------------------
> + | BIND Permission Checks |
> + | @optname | @address contains |
> + |----------------------------|-----------------------------------|
> + | SCTP_SOCKOPT_BINDX_ADD | One or more ipv4 / ipv6 addresses |
> + | SCTP_PRIMARY_ADDR | Single ipv4 or ipv6 address |
> + | SCTP_SET_PEER_PRIMARY_ADDR | Single ipv4 or ipv6 address |
> + ------------------------------------------------------------------
> +
> + ------------------------------------------------------------------
> + | CONNECT Permission Checks |
> + | @optname | @address contains |
> + |----------------------------|-----------------------------------|
> + | SCTP_SOCKOPT_CONNECTX | One or more ipv4 / ipv6 addresses |
> + | SCTP_PARAM_ADD_IP | One or more ipv4 / ipv6 addresses |
> + | SCTP_SENDMSG_CONNECT | Single ipv4 or ipv6 address |
> + | SCTP_PARAM_SET_PRIMARY | Single ipv4 or ipv6 address |
> + ------------------------------------------------------------------
> +
> +
> +``Documentation/security/LSM-sctp.rst`` gives a summary of the ``@optname``
> +entries and also describes ASCONF chunk processing when Dynamic Address
> +Reconfiguration is enabled.
> +
> +
> +security_sctp_sk_clone()
> +-------------------------
> +Called whenever a new socket is created by **accept**\(2) (i.e. a TCP style
> +socket) or when a socket is 'peeled off' e.g. userspace calls
> +**sctp_peeloff**\(3). ``security_sctp_sk_clone()`` will set the new
> +socket's sid and peer sid to that contained in the ``@ep sid`` and
> +``@ep peer sid`` respectively.
> +::
> +
> + @ep - pointer to current sctp endpoint structure.
> + @sk - pointer to current sock structure.
> + @newsk - pointer to new sock structure.
> +
> +
> +security_inet_conn_established()
> +---------------------------------
> +Called when a COOKIE ACK is received where it sets the connection's peer sid
> +to that in ``@skb``::
> +
> + @sk - pointer to sock structure.
> + @skb - pointer to skbuff of the COOKIE ACK packet.
> +
> +
> +Policy Statements
> +==================
> +The following class and permissions to support SCTP are available within the
> +kernel::
> +
> + class sctp_socket inherits socket { node_bind }
> +
> +whenever the following policy capability is enabled::
> +
> + policycap extended_socket_class;
> +
> +SELinux SCTP support adds the ``name_connect`` permission for connecting
> +to a specific port type and the ``association`` permission that is explained
> +in the section below.
> +
> +If userspace tools have been updated, SCTP will support the ``portcon``
> +statement as shown in the following example::
> +
> + portcon sctp 1024-1036 system_u:object_r:sctp_ports_t:s0
> +
> +
> +SCTP Peer Labeling
> +===================
> +An SCTP socket will only have one peer label assigned to it. This will be
> +assigned during the establishment of the first association. Once the peer
> +label has been assigned, any new associations will have the ``association``
> +permission validated by checking the socket peer sid against the received
> +packet's peer sid to determine whether the association should be allowed or
> +denied.
> +
> +NOTES:
> + 1) If peer labeling is not enabled, then the peer context will always be
> + ``SECINITSID_UNLABELED`` (``unlabeled_t`` in Reference Policy).
> +
> + 2) As SCTP can support more than one transport address per endpoint
> + (multi-homing) on a single socket, it is possible to configure policy
> + and NetLabel to provide different peer labels for each of these. As the
> + socket peer label is determined by the first association's transport
> + address, it is recommended that all peer labels are consistent.
> +
> + 3) **getpeercon**\(3) may be used by userspace to retrieve the socket's peer
> + context.
> +
> + 4) While not SCTP specific, be aware when using NetLabel that if a label
> + is assigned to a specific interface, and that interface 'goes down',
> + then the NetLabel service will remove the entry. Therefore ensure that
> + the network startup scripts call **netlabelctl**\(8) to set the required
> + label (see **netlabel-config**\(8) helper script for details).
> +
> + 5) The NetLabel SCTP peer labeling rules apply as discussed in the following
> + set of posts tagged "netlabel" at: http://www.paul-moore.com/blog/t.
> +
> + 6) CIPSO is only supported for IPv4 addressing: ``socket(AF_INET, ...)``
> + CALIPSO is only supported for IPv6 addressing: ``socket(AF_INET6, ...)``
> +
> + Note the following when testing CIPSO/CALIPSO:
> + a) CIPSO will send an ICMP packet if an SCTP packet cannot be
> + delivered because of an invalid label.
> + b) CALIPSO does not send an ICMP packet, just silently discards it.
> +
> + 7) IPSEC is not supported as RFC 3554 - sctp/ipsec support has not been
> + implemented in userspace (**racoon**\(8) or **ipsec_pluto**\(8)),
> + although the kernel supports SCTP/IPSEC.
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index f5d3047..24d6f39 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -67,6 +67,8 @@
> #include <linux/tcp.h>
> #include <linux/udp.h>
> #include <linux/dccp.h>
> +#include <linux/sctp.h>
> +#include <net/sctp/structs.h>
> #include <linux/quota.h>
> #include <linux/un.h> /* for Unix socket types */
> #include <net/af_unix.h> /* for Unix socket types */
> @@ -4126,6 +4128,23 @@ static int selinux_parse_skb_ipv4(struct sk_buff *skb,
> break;
> }
>
> +#if IS_ENABLED(CONFIG_IP_SCTP)
> + case IPPROTO_SCTP: {
> + struct sctphdr _sctph, *sh;
> +
> + if (ntohs(ih->frag_off) & IP_OFFSET)
> + break;
> +
> + offset += ihlen;
> + sh = skb_header_pointer(skb, offset, sizeof(_sctph), &_sctph);
> + if (sh == NULL)
> + break;
> +
> + ad->u.net->sport = sh->source;
> + ad->u.net->dport = sh->dest;
> + break;
> + }
> +#endif
> default:
> break;
> }
> @@ -4199,6 +4218,19 @@ static int selinux_parse_skb_ipv6(struct sk_buff *skb,
> break;
> }
>
> +#if IS_ENABLED(CONFIG_IP_SCTP)
> + case IPPROTO_SCTP: {
> + struct sctphdr _sctph, *sh;
> +
> + sh = skb_header_pointer(skb, offset, sizeof(_sctph), &_sctph);
> + if (sh == NULL)
> + break;
> +
> + ad->u.net->sport = sh->source;
> + ad->u.net->dport = sh->dest;
> + break;
> + }
> +#endif
> /* includes fragments */
> default:
> break;
> @@ -4388,6 +4420,10 @@ static int selinux_socket_post_create(struct socket *sock, int family,
> sksec = sock->sk->sk_security;
> sksec->sclass = sclass;
> sksec->sid = sid;
> + /* Allows detection of the first association on this socket */
> + if (sksec->sclass == SECCLASS_SCTP_SOCKET)
> + sksec->sctp_assoc_state = SCTP_ASSOC_UNSET;
> +
> err = selinux_netlbl_socket_post_create(sock->sk, family);
> }
>
> @@ -4408,11 +4444,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
> if (err)
> goto out;
>
> - /*
> - * If PF_INET or PF_INET6, check name_bind permission for the port.
> - * Multiple address binding for SCTP is not supported yet: we just
> - * check the first address now.
> - */
> + /* If PF_INET or PF_INET6, check name_bind permission for the port. */
> family = sk->sk_family;
> if (family == PF_INET || family == PF_INET6) {
> char *addrp;
> @@ -4424,7 +4456,13 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
> unsigned short snum;
> u32 sid, node_perm;
>
> - if (family == PF_INET) {
> + /*
> + * sctp_bindx(3) calls via selinux_sctp_bind_connect()
> + * that validates multiple binding addresses. Because of this
> + * need to check address->sa_family as it is possible to have
> + * sk->sk_family = PF_INET6 with addr->sa_family = AF_INET.
> + */
> + if (address->sa_family == AF_INET) {
> if (addrlen < sizeof(struct sockaddr_in)) {
> err = -EINVAL;
> goto out;
> @@ -4478,6 +4516,10 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
> node_perm = DCCP_SOCKET__NODE_BIND;
> break;
>
> + case SECCLASS_SCTP_SOCKET:
> + node_perm = SCTP_SOCKET__NODE_BIND;
> + break;
> +
> default:
> node_perm = RAWIP_SOCKET__NODE_BIND;
> break;
> @@ -4492,7 +4534,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
> ad.u.net->sport = htons(snum);
> ad.u.net->family = family;
>
> - if (family == PF_INET)
> + if (address->sa_family == AF_INET)
> ad.u.net->v4info.saddr = addr4->sin_addr.s_addr;
> else
> ad.u.net->v6info.saddr = addr6->sin6_addr;
> @@ -4506,7 +4548,11 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
> return err;
> }
>
> -static int selinux_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen)
> +/* This supports connect(2) and SCTP connect services such as sctp_connectx(3)
> + * and sctp_sendmsg(3) as described in Documentation/security/LSM-sctp.rst
> + */
> +static int selinux_socket_connect_helper(struct socket *sock,
> + struct sockaddr *address, int addrlen)
> {
> struct sock *sk = sock->sk;
> struct sk_security_struct *sksec = sk->sk_security;
> @@ -4517,10 +4563,12 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
> return err;
>
> /*
> - * If a TCP or DCCP socket, check name_connect permission for the port.
> + * If a TCP, DCCP or SCTP socket, check name_connect permission
> + * for the port.
> */
> if (sksec->sclass == SECCLASS_TCP_SOCKET ||
> - sksec->sclass == SECCLASS_DCCP_SOCKET) {
> + sksec->sclass == SECCLASS_DCCP_SOCKET ||
> + sksec->sclass == SECCLASS_SCTP_SOCKET) {
> struct common_audit_data ad;
> struct lsm_network_audit net = {0,};
> struct sockaddr_in *addr4 = NULL;
> @@ -4528,7 +4576,12 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
> unsigned short snum;
> u32 sid, perm;
>
> - if (sk->sk_family == PF_INET) {
> + /* sctp_connectx(3) calls via selinux_sctp_bind_connect()
> + * that validates multiple connect addresses. Because of this
> + * need to check address->sa_family as it is possible to have
> + * sk->sk_family = PF_INET6 with addr->sa_family = AF_INET.
> + */
> + if (address->sa_family == AF_INET) {
> addr4 = (struct sockaddr_in *)address;
> if (addrlen < sizeof(struct sockaddr_in))
> return -EINVAL;
> @@ -4542,10 +4595,19 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
>
> err = sel_netport_sid(sk->sk_protocol, snum, &sid);
> if (err)
> - goto out;
> + return err;
>
> - perm = (sksec->sclass == SECCLASS_TCP_SOCKET) ?
> - TCP_SOCKET__NAME_CONNECT : DCCP_SOCKET__NAME_CONNECT;
> + switch (sksec->sclass) {
> + case SECCLASS_TCP_SOCKET:
> + perm = TCP_SOCKET__NAME_CONNECT;
> + break;
> + case SECCLASS_DCCP_SOCKET:
> + perm = DCCP_SOCKET__NAME_CONNECT;
> + break;
> + case SECCLASS_SCTP_SOCKET:
> + perm = SCTP_SOCKET__NAME_CONNECT;
> + break;
> + }
>
> ad.type = LSM_AUDIT_DATA_NET;
> ad.u.net = &net;
> @@ -4553,13 +4615,24 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
> ad.u.net->family = sk->sk_family;
> err = avc_has_perm(sksec->sid, sid, sksec->sclass, perm, &ad);
> if (err)
> - goto out;
> + return err;
> }
>
> - err = selinux_netlbl_socket_connect(sk, address);
> + return 0;
> +}
>
> -out:
> - return err;
> +/* Supports connect(2), see comments in selinux_socket_connect_helper() */
> +static int selinux_socket_connect(struct socket *sock,
> + struct sockaddr *address, int addrlen)
> +{
> + int err;
> + struct sock *sk = sock->sk;
> +
> + err = selinux_socket_connect_helper(sock, address, addrlen);
> + if (err)
> + return err;
> +
> + return selinux_netlbl_socket_connect(sk, address);
> }
>
> static int selinux_socket_listen(struct socket *sock, int backlog)
> @@ -4822,7 +4895,8 @@ static int selinux_socket_getpeersec_stream(struct socket *sock, char __user *op
> u32 peer_sid = SECSID_NULL;
>
> if (sksec->sclass == SECCLASS_UNIX_STREAM_SOCKET ||
> - sksec->sclass == SECCLASS_TCP_SOCKET)
> + sksec->sclass == SECCLASS_TCP_SOCKET ||
> + sksec->sclass == SECCLASS_SCTP_SOCKET)
> peer_sid = sksec->peer_sid;
> if (peer_sid == SECSID_NULL)
> return -ENOPROTOOPT;
> @@ -4935,6 +5009,171 @@ static void selinux_sock_graft(struct sock *sk, struct socket *parent)
> sksec->sclass = isec->sclass;
> }
>
> +/* Called whenever SCTP receives an INIT chunk. This happens on an incoming
> + * connect(2), sctp_connectx(3) or sctp_sendmsg(3) (with no association
> + * already present).
> + */
> +static int selinux_sctp_assoc_request(struct sctp_endpoint *ep,
> + struct sk_buff *skb)
> +{
> + struct sk_security_struct *sksec = ep->base.sk->sk_security;
> + struct common_audit_data ad;
> + struct lsm_network_audit net = {0,};
> + u8 peerlbl_active;
> + u32 peer_sid = SECINITSID_UNLABELED;
> + u32 conn_sid;
> + int err = 0;
> +
> + if (!selinux_policycap_extsockclass)
> + return 0;
> +
> + peerlbl_active = selinux_peerlbl_enabled();
> +
> + if (peerlbl_active) {
> + /* This will return peer_sid = SECSID_NULL if there are
> + * no peer labels, see security_net_peersid_resolve().
> + */
> + err = selinux_skb_peerlbl_sid(skb, ep->base.sk->sk_family,
> + &peer_sid);
> + if (err)
> + return err;
> +
> + if (peer_sid == SECSID_NULL)
> + peer_sid = SECINITSID_UNLABELED;
> + }
> +
> + if (sksec->sctp_assoc_state == SCTP_ASSOC_UNSET) {
> + sksec->sctp_assoc_state = SCTP_ASSOC_SET;
> +
> + /* Here as first association on socket. As the peer SID
> + * was allowed by peer recv (and the netif/node checks),
> + * then it is approved by policy and used as the primary
> + * peer SID for getpeercon(3).
> + */
> + sksec->peer_sid = peer_sid;
> + } else if (sksec->peer_sid != peer_sid) {
> + /* Other association peer SIDs are checked to enforce
> + * consistency among the peer SIDs.
> + */
> + ad.type = LSM_AUDIT_DATA_NET;
> + ad.u.net = &net;
> + ad.u.net->sk = ep->base.sk;
> + err = avc_has_perm(sksec->peer_sid, peer_sid, sksec->sclass,
> + SCTP_SOCKET__ASSOCIATION, &ad);
> + if (err)
> + return err;
> + }
> +
> + /* Compute the MLS component for the connection and store
> + * the information in ep. This will be used by SCTP TCP type
> + * sockets and peeled off connections as they cause a new
> + * socket to be generated. selinux_sctp_sk_clone() will then
> + * plug this into the new socket.
> + */
> + err = selinux_conn_sid(sksec->sid, peer_sid, &conn_sid);
> + if (err)
> + return err;
> +
> + ep->secid = conn_sid;
> + ep->peer_secid = peer_sid;
> +
> + /* Set any NetLabel labels including CIPSO/CALIPSO options. */
> + return selinux_netlbl_sctp_assoc_request(ep, skb);
> +}
> +
> +/* Check if sctp IPv4/IPv6 addresses are valid for binding or connecting
> + * based on their @optname.
> + */
> +static int selinux_sctp_bind_connect(struct sock *sk, int optname,
> + struct sockaddr *address,
> + int addrlen)
> +{
> + int len, err = 0, walk_size = 0;
> + void *addr_buf;
> + struct sockaddr *addr;
> + struct socket *sock;
> +
> + if (!selinux_policycap_extsockclass)
> + return 0;
> +
> + /* Process one or more addresses that may be IPv4 or IPv6 */
> + sock = sk->sk_socket;
> + addr_buf = address;
> +
> + while (walk_size < addrlen) {
> + addr = addr_buf;
> + switch (addr->sa_family) {
> + case AF_INET:
> + len = sizeof(struct sockaddr_in);
> + break;
> + case AF_INET6:
> + len = sizeof(struct sockaddr_in6);
> + break;
> + default:
> + return -EAFNOSUPPORT;
> + }
> +
> + err = -EINVAL;
> + switch (optname) {
> + /* Bind checks */
> + case SCTP_PRIMARY_ADDR:
> + case SCTP_SET_PEER_PRIMARY_ADDR:
> + case SCTP_SOCKOPT_BINDX_ADD:
> + err = selinux_socket_bind(sock, addr, len);
> + break;
> + /* Connect checks */
> + case SCTP_SOCKOPT_CONNECTX:
> + case SCTP_PARAM_SET_PRIMARY:
> + case SCTP_PARAM_ADD_IP:
> + case SCTP_SENDMSG_CONNECT:
> + err = selinux_socket_connect_helper(sock, addr, len);
> + if (err)
> + return err;
> +
> + /* As selinux_sctp_bind_connect() is called by the
> + * SCTP protocol layer, the socket is already locked,
> + * therefore selinux_netlbl_socket_connect_locked() is
> + * called here. The situations handled are:
> + * sctp_connectx(3), sctp_sendmsg(3), sendmsg(2),
> + * whenever a new IP address is added or when a new
> + * primary address is selected.
> + * Note that an SCTP connect(2) call happens before
> + * the SCTP protocol layer and is handled via
> + * selinux_socket_connect().
> + */
> + err = selinux_netlbl_socket_connect_locked(sk, addr);
> + break;
> + }
> +
> + if (err)
> + return err;
> +
> + addr_buf += len;
> + walk_size += len;
> + }
> +
> + return 0;
> +}
> +
> +/* Called whenever a new socket is created by accept(2) or sctp_peeloff(3). */
> +static void selinux_sctp_sk_clone(struct sctp_endpoint *ep, struct sock *sk,
> + struct sock *newsk)
> +{
> + struct sk_security_struct *sksec = sk->sk_security;
> + struct sk_security_struct *newsksec = newsk->sk_security;
> +
> + /* If policy does not support SECCLASS_SCTP_SOCKET then call
> + * the non-sctp clone version.
> + */
> + if (!selinux_policycap_extsockclass)
> + return selinux_sk_clone_security(sk, newsk);
> +
> + newsksec->sid = ep->secid;
> + newsksec->peer_sid = ep->peer_secid;
> + newsksec->sclass = sksec->sclass;
> + selinux_netlbl_sctp_sk_clone(sk, newsk);
> +}
> +
> static int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb,
> struct request_sock *req)
> {
> @@ -6422,6 +6661,9 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
> LSM_HOOK_INIT(sk_clone_security, selinux_sk_clone_security),
> LSM_HOOK_INIT(sk_getsecid, selinux_sk_getsecid),
> LSM_HOOK_INIT(sock_graft, selinux_sock_graft),
> + LSM_HOOK_INIT(sctp_assoc_request, selinux_sctp_assoc_request),
> + LSM_HOOK_INIT(sctp_sk_clone, selinux_sctp_sk_clone),
> + LSM_HOOK_INIT(sctp_bind_connect, selinux_sctp_bind_connect),
> LSM_HOOK_INIT(inet_conn_request, selinux_inet_conn_request),
> LSM_HOOK_INIT(inet_csk_clone, selinux_inet_csk_clone),
> LSM_HOOK_INIT(inet_conn_established, selinux_inet_conn_established),
> diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
> index cc35695..167c20a 100644
> --- a/security/selinux/include/classmap.h
> +++ b/security/selinux/include/classmap.h
> @@ -176,7 +176,7 @@ struct security_class_mapping secclass_map[] = {
> { COMMON_CAP2_PERMS, NULL } },
> { "sctp_socket",
> { COMMON_SOCK_PERMS,
> - "node_bind", NULL } },
> + "node_bind", "name_connect", "association", NULL } },
> { "icmp_socket",
> { COMMON_SOCK_PERMS,
> "node_bind", NULL } },
> diff --git a/security/selinux/include/netlabel.h b/security/selinux/include/netlabel.h
> index 75686d5..0fae720 100644
> --- a/security/selinux/include/netlabel.h
> +++ b/security/selinux/include/netlabel.h
> @@ -33,6 +33,7 @@
> #include <linux/skbuff.h>
> #include <net/sock.h>
> #include <net/request_sock.h>
> +#include <net/sctp/structs.h>
>
> #include "avc.h"
> #include "objsec.h"
> @@ -53,9 +54,11 @@ int selinux_netlbl_skbuff_getsid(struct sk_buff *skb,
> int selinux_netlbl_skbuff_setsid(struct sk_buff *skb,
> u16 family,
> u32 sid);
> -
> +int selinux_netlbl_sctp_assoc_request(struct sctp_endpoint *ep,
> + struct sk_buff *skb);
> int selinux_netlbl_inet_conn_request(struct request_sock *req, u16 family);
> void selinux_netlbl_inet_csk_clone(struct sock *sk, u16 family);
> +void selinux_netlbl_sctp_sk_clone(struct sock *sk, struct sock *newsk);
> int selinux_netlbl_socket_post_create(struct sock *sk, u16 family);
> int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec,
> struct sk_buff *skb,
> @@ -65,6 +68,8 @@ int selinux_netlbl_socket_setsockopt(struct socket *sock,
> int level,
> int optname);
> int selinux_netlbl_socket_connect(struct sock *sk, struct sockaddr *addr);
> +int selinux_netlbl_socket_connect_locked(struct sock *sk,
> + struct sockaddr *addr);
>
> #else
> static inline void selinux_netlbl_cache_invalidate(void)
> @@ -114,6 +119,11 @@ static inline int selinux_netlbl_conn_setsid(struct sock *sk,
> return 0;
> }
>
> +static inline int selinux_netlbl_sctp_assoc_request(struct sctp_endpoint *ep,
> + struct sk_buff *skb)
> +{
> + return 0;
> +}
> static inline int selinux_netlbl_inet_conn_request(struct request_sock *req,
> u16 family)
> {
> @@ -123,6 +133,10 @@ static inline void selinux_netlbl_inet_csk_clone(struct sock *sk, u16 family)
> {
> return;
> }
> +static inline void selinux_netlbl_sctp_sk_clone(struct sock *sk, struct sock *newsk)
> +{
> + return;
> +}
> static inline int selinux_netlbl_socket_post_create(struct sock *sk,
> u16 family)
> {
> @@ -146,6 +160,11 @@ static inline int selinux_netlbl_socket_connect(struct sock *sk,
> {
> return 0;
> }
> +static inline int selinux_netlbl_socket_connect_locked(struct sock *sk,
> + struct sockaddr *addr)
> +{
> + return 0;
> +}
> #endif /* CONFIG_NETLABEL */
>
> #endif
> diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
> index 1649cd1..be145cf 100644
> --- a/security/selinux/include/objsec.h
> +++ b/security/selinux/include/objsec.h
> @@ -130,6 +130,10 @@ struct sk_security_struct {
> u32 sid; /* SID of this object */
> u32 peer_sid; /* SID of peer */
> u16 sclass; /* sock security class */
> + enum { /* SCTP association state */
> + SCTP_ASSOC_UNSET = 0,
> + SCTP_ASSOC_SET,
> + } sctp_assoc_state;
> };
>
> struct tun_security_struct {
> diff --git a/security/selinux/netlabel.c b/security/selinux/netlabel.c
> index aaba667..0a566e3 100644
> --- a/security/selinux/netlabel.c
> +++ b/security/selinux/netlabel.c
> @@ -250,6 +250,7 @@ int selinux_netlbl_skbuff_setsid(struct sk_buff *skb,
> sk = skb_to_full_sk(skb);
> if (sk != NULL) {
> struct sk_security_struct *sksec = sk->sk_security;
> +
> if (sksec->nlbl_state != NLBL_REQSKB)
> return 0;
> secattr = selinux_netlbl_sock_getattr(sk, sid);
> @@ -270,6 +271,61 @@ int selinux_netlbl_skbuff_setsid(struct sk_buff *skb,
> return rc;
> }
>
> +/**
> + * selinux_netlbl_sctp_assoc_request - Label an incoming sctp association.
> + * @ep: incoming association endpoint.
> + * @skb: the packet.
> + *
> + * Description:
> + * A new incoming connection is represented by @ep, ......
> + * Returns zero on success, negative values on failure.
> + *
> + */
> +int selinux_netlbl_sctp_assoc_request(struct sctp_endpoint *ep,
> + struct sk_buff *skb)
> +{
> + int rc;
> + struct netlbl_lsm_secattr secattr;
> + struct sk_security_struct *sksec = ep->base.sk->sk_security;
> + struct sockaddr *addr;
> + struct sockaddr_in addr4;
> +#if IS_ENABLED(CONFIG_IPV6)
> + struct sockaddr_in6 addr6;
> +#endif
> +
> + if (ep->base.sk->sk_family != PF_INET &&
> + ep->base.sk->sk_family != PF_INET6)
> + return 0;
> +
> + netlbl_secattr_init(&secattr);
> + rc = security_netlbl_sid_to_secattr(ep->secid, &secattr);
> + if (rc != 0)
> + goto assoc_request_return;
> +
> + /* Move skb hdr address info to a struct sockaddr and then call
> + * netlbl_conn_setattr().
> + */
> + if (ip_hdr(skb)->version == 4) {
> + addr4.sin_family = AF_INET;
> + addr4.sin_addr.s_addr = ip_hdr(skb)->saddr;
> + addr = (struct sockaddr *)&addr4;
> +#if IS_ENABLED(CONFIG_IPV6)
> + } else {
> + addr6.sin6_family = AF_INET6;
> + addr6.sin6_addr = ipv6_hdr(skb)->saddr;
> + addr = (struct sockaddr *)&addr6;
> +#endif
> + }
> +
> + rc = netlbl_conn_setattr(ep->base.sk, addr, &secattr);
> + if (rc == 0)
> + sksec->nlbl_state = NLBL_LABELED;
> +
> +assoc_request_return:
> + netlbl_secattr_destroy(&secattr);
> + return rc;
> +}
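
The address handling in the function above can be tried outside the kernel. The following is a rough userspace sketch, not the patch's code: demo_saddr_from_iphdr() is a hypothetical name, and it reads the source address straight from raw header bytes (IPv4 source at offset 12, IPv6 source at offset 8) instead of going through ip_hdr()/ipv6_hdr().

```c
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill @out with the source address found in the
 * raw IP header at @pkt. Returns the sockaddr length that was filled
 * in, or 0 if the header is too short or the version is not 4 or 6.
 */
static socklen_t demo_saddr_from_iphdr(const uint8_t *pkt, size_t pktlen,
				       struct sockaddr_storage *out)
{
	memset(out, 0, sizeof(*out));
	if (pktlen < 1)
		return 0;

	/* The IP version is the high nibble of the first byte. */
	switch (pkt[0] >> 4) {
	case 4: {
		struct sockaddr_in *a4 = (struct sockaddr_in *)out;

		if (pktlen < 20)	/* minimum IPv4 header */
			return 0;
		a4->sin_family = AF_INET;
		memcpy(&a4->sin_addr, pkt + 12, 4);
		return sizeof(*a4);
	}
	case 6: {
		struct sockaddr_in6 *a6 = (struct sockaddr_in6 *)out;

		if (pktlen < 40)	/* fixed IPv6 header */
			return 0;
		a6->sin6_family = AF_INET6;
		memcpy(&a6->sin6_addr, pkt + 8, 16);
		return sizeof(*a6);
	}
	default:
		return 0;
	}
}
```

Unlike the hunk above, the sketch rejects any version other than 4 or 6 instead of treating every non-IPv4 packet as IPv6.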
> +
> /**
> * selinux_netlbl_inet_conn_request - Label an incoming stream connection
> * @req: incoming connection request socket
> @@ -319,6 +375,22 @@ void selinux_netlbl_inet_csk_clone(struct sock *sk, u16 family)
> sksec->nlbl_state = NLBL_UNSET;
> }
>
> +/**
> + * selinux_netlbl_sctp_sk_clone - Copy state to the newly created sock
> + * @sk: current sock
> + * @newsk: the new sock
> + *
> + * Description:
> + * Called whenever a new socket is created by accept(2) or sctp_peeloff(3).
> + */
> +void selinux_netlbl_sctp_sk_clone(struct sock *sk, struct sock *newsk)
> +{
> + struct sk_security_struct *sksec = sk->sk_security;
> + struct sk_security_struct *newsksec = newsk->sk_security;
> +
> + newsksec->nlbl_state = sksec->nlbl_state;
> +}
> +
> /**
> * selinux_netlbl_socket_post_create - Label a socket using NetLabel
> * @sock: the socket to label
> @@ -470,7 +542,8 @@ int selinux_netlbl_socket_setsockopt(struct socket *sock,
> }
>
> /**
> - * selinux_netlbl_socket_connect - Label a client-side socket on connect
> + * selinux_netlbl_socket_connect_helper - Help label a client-side socket on
> + * connect
> * @sk: the socket to label
> * @addr: the destination address
> *
> @@ -479,18 +552,13 @@ int selinux_netlbl_socket_setsockopt(struct socket *sock,
> * Returns zero values on success, negative values on failure.
> *
> */
> -int selinux_netlbl_socket_connect(struct sock *sk, struct sockaddr *addr)
> +static int selinux_netlbl_socket_connect_helper(struct sock *sk,
> + struct sockaddr *addr)
> {
> int rc;
> struct sk_security_struct *sksec = sk->sk_security;
> struct netlbl_lsm_secattr *secattr;
>
> - if (sksec->nlbl_state != NLBL_REQSKB &&
> - sksec->nlbl_state != NLBL_CONNLABELED)
> - return 0;
> -
> - lock_sock(sk);
> -
> /* connected sockets are allowed to disconnect when the address family
> * is set to AF_UNSPEC, if that is what is happening we want to reset
> * the socket */
> @@ -498,18 +566,66 @@ int selinux_netlbl_socket_connect(struct sock *sk, struct sockaddr *addr)
> netlbl_sock_delattr(sk);
> sksec->nlbl_state = NLBL_REQSKB;
> rc = 0;
> - goto socket_connect_return;
> + return rc;
> }
> secattr = selinux_netlbl_sock_genattr(sk);
> if (secattr == NULL) {
> rc = -ENOMEM;
> - goto socket_connect_return;
> + return rc;
> }
> rc = netlbl_conn_setattr(sk, addr, secattr);
> if (rc == 0)
> sksec->nlbl_state = NLBL_CONNLABELED;
>
> -socket_connect_return:
> + return rc;
> +}
> +
> +/**
> + * selinux_netlbl_socket_connect - Label a client-side socket on connect
> + * @sk: the socket to label
> + * @addr: the destination address
> + *
> + * Description:
> + * Attempt to label a connected socket with NetLabel using the given address.
> + * Returns zero values on success, negative values on failure.
> + *
> + */
> +int selinux_netlbl_socket_connect(struct sock *sk, struct sockaddr *addr)
> +{
> + int rc;
> + struct sk_security_struct *sksec = sk->sk_security;
> +
> + if (sksec->nlbl_state != NLBL_REQSKB &&
> + sksec->nlbl_state != NLBL_CONNLABELED)
> + return 0;
> +
> + lock_sock(sk);
> + rc = selinux_netlbl_socket_connect_helper(sk, addr);
> release_sock(sk);
> +
> return rc;
> }
> +
> +/**
> + * selinux_netlbl_socket_connect_locked - Label a client-side socket on
> + * connect
> + * @sk: the socket to label
> + * @addr: the destination address
> + *
> + * Description:
> + * Attempt to label a connected socket that already has the socket locked
> + * with NetLabel using the given address.
> + * Returns zero values on success, negative values on failure.
> + *
> + */
> +int selinux_netlbl_socket_connect_locked(struct sock *sk,
> + struct sockaddr *addr)
> +{
> + struct sk_security_struct *sksec = sk->sk_security;
> +
> + if (sksec->nlbl_state != NLBL_REQSKB &&
> + sksec->nlbl_state != NLBL_CONNLABELED)
> + return 0;
> +
> + return selinux_netlbl_socket_connect_helper(sk, addr);
> +}
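
The split above (one helper that assumes the lock is held, one wrapper that takes the lock, and one "_locked" entry point for callers that already hold it) is a common pattern. A toy userspace sketch of the same shape, with entirely hypothetical demo_* names and a flag standing in for lock_sock()/release_sock():

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical socket stand-in with a toy, non-recursive lock. */
struct demo_sock {
	bool locked;
	int labeled;
};

static void demo_lock(struct demo_sock *sk)
{
	assert(!sk->locked);	/* taking the lock twice would deadlock */
	sk->locked = true;
}

static void demo_release(struct demo_sock *sk)
{
	assert(sk->locked);
	sk->locked = false;
}

/* Does the real work; the caller must hold the lock. */
static int demo_connect_helper(struct demo_sock *sk)
{
	assert(sk->locked);
	sk->labeled = 1;
	return 0;
}

/* Entry point for callers that do not hold the lock. */
static int demo_connect(struct demo_sock *sk)
{
	int rc;

	demo_lock(sk);
	rc = demo_connect_helper(sk);
	demo_release(sk);
	return rc;
}

/* Entry point for callers that already hold the lock. */
static int demo_connect_locked(struct demo_sock *sk)
{
	return demo_connect_helper(sk);
}
```

Calling demo_connect() from a context that already holds the lock trips the assertion in demo_lock(), which is the toy equivalent of the problem the "_locked" variant avoids when the SCTP protocol layer invokes the hook.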
> --
> 2.14.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: [PATCH V4 3/4] sctp: Add LSM hooks
From: Marcelo Ricardo Leitner @ 2017-12-30 23:15 UTC (permalink / raw)
To: Richard Haines
Cc: selinux, netdev, linux-sctp, linux-security-module, paul,
vyasevich, nhorman, sds, eparis, casey
In-Reply-To: <20171230172013.15788-1-richard_c_haines@btinternet.com>
On Sat, Dec 30, 2017 at 05:20:13PM +0000, Richard Haines wrote:
> Add security hooks to allow security modules to exercise access control
> over SCTP.
>
> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
> include/net/sctp/structs.h | 10 ++++++++
> include/uapi/linux/sctp.h | 1 +
> net/sctp/sm_make_chunk.c | 12 +++++++++
> net/sctp/sm_statefuns.c | 18 ++++++++++++++
> net/sctp/socket.c | 61 +++++++++++++++++++++++++++++++++++++++++++++-
> 5 files changed, 101 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 9942ed5..2ca0a3f 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -1271,6 +1271,16 @@ struct sctp_endpoint {
> reconf_enable:1;
>
> __u8 strreset_enable;
> +
> + /* Security identifiers from incoming (INIT). These are set by
> + * security_sctp_assoc_request(). These will only be used by
> + * SCTP TCP type sockets and peeled off connections as they
> + * cause a new socket to be generated. security_sctp_sk_clone()
> + * will then plug these into the new socket.
> + */
> +
> + u32 secid;
> + u32 peer_secid;
> };
>
> /* Recover the outter endpoint structure. */
> diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
> index cfe9712..cafac36 100644
> --- a/include/uapi/linux/sctp.h
> +++ b/include/uapi/linux/sctp.h
> @@ -123,6 +123,7 @@ typedef __s32 sctp_assoc_t;
> #define SCTP_RESET_ASSOC 120
> #define SCTP_ADD_STREAMS 121
> #define SCTP_SOCKOPT_PEELOFF_FLAGS 122
> +#define SCTP_SENDMSG_CONNECT 123
>
> /* PR-SCTP policies */
> #define SCTP_PR_SCTP_NONE 0x0000
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index 514465b..269fd3d 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -3054,6 +3054,12 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc,
> if (af->is_any(&addr))
> memcpy(&addr, &asconf->source, sizeof(addr));
>
> + if (security_sctp_bind_connect(asoc->ep->base.sk,
> + SCTP_PARAM_ADD_IP,
> + (struct sockaddr *)&addr,
> + af->sockaddr_len))
> + return SCTP_ERROR_REQ_REFUSED;
> +
> /* ADDIP 4.3 D9) If an endpoint receives an ADD IP address
> * request and does not have the local resources to add this
> * new address to the association, it MUST return an Error
> @@ -3120,6 +3126,12 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc,
> if (af->is_any(&addr))
> memcpy(&addr.v4, sctp_source(asconf), sizeof(addr));
>
> + if (security_sctp_bind_connect(asoc->ep->base.sk,
> + SCTP_PARAM_SET_PRIMARY,
> + (struct sockaddr *)&addr,
> + af->sockaddr_len))
> + return SCTP_ERROR_REQ_REFUSED;
> +
> peer = sctp_assoc_lookup_paddr(asoc, &addr);
> if (!peer)
> return SCTP_ERROR_DNS_FAILED;
> diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> index 8f8ccde..a2dfc5a 100644
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -318,6 +318,11 @@ enum sctp_disposition sctp_sf_do_5_1B_init(struct net *net,
> struct sctp_packet *packet;
> int len;
>
> + /* Update socket peer label if first association. */
> + if (security_sctp_assoc_request((struct sctp_endpoint *)ep,
> + chunk->skb))
> + return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
> +
> /* 6.10 Bundling
> * An endpoint MUST NOT bundle INIT, INIT ACK or
> * SHUTDOWN COMPLETE with any other chunks.
> @@ -905,6 +910,9 @@ enum sctp_disposition sctp_sf_do_5_1E_ca(struct net *net,
> */
> sctp_add_cmd_sf(commands, SCTP_CMD_INIT_COUNTER_RESET, SCTP_NULL());
>
> + /* Set peer label for connection. */
> + security_inet_conn_established(ep->base.sk, chunk->skb);
> +
> /* RFC 2960 5.1 Normal Establishment of an Association
> *
> * E) Upon reception of the COOKIE ACK, endpoint "A" will move
> @@ -1433,6 +1441,11 @@ static enum sctp_disposition sctp_sf_do_unexpected_init(
> struct sctp_packet *packet;
> int len;
>
> + /* Update socket peer label if first association. */
> + if (security_sctp_assoc_request((struct sctp_endpoint *)ep,
> + chunk->skb))
> + return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
> +
> /* 6.10 Bundling
> * An endpoint MUST NOT bundle INIT, INIT ACK or
> * SHUTDOWN COMPLETE with any other chunks.
> @@ -2103,6 +2116,11 @@ enum sctp_disposition sctp_sf_do_5_2_4_dupcook(
> }
> }
>
> + /* Update socket peer label if first association. */
> + if (security_sctp_assoc_request((struct sctp_endpoint *)ep,
> + chunk->skb))
> + return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
> +
> /* Set temp so that it won't be added into hashtable */
> new_asoc->temp = 1;
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 4373e2a..b40db2d 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -1045,6 +1045,12 @@ static int sctp_setsockopt_bindx(struct sock *sk,
> /* Do the work. */
> switch (op) {
> case SCTP_BINDX_ADD_ADDR:
> + /* Allow security module to validate bindx addresses. */
> + err = security_sctp_bind_connect(sk, SCTP_SOCKOPT_BINDX_ADD,
> + (struct sockaddr *)kaddrs,
> + addrs_size);
> + if (err)
> + goto out;
> err = sctp_bindx_add(sk, kaddrs, addrcnt);
> if (err)
> goto out;
> @@ -1254,6 +1260,7 @@ static int __sctp_connect(struct sock *sk,
>
> if (assoc_id)
> *assoc_id = asoc->assoc_id;
> +
> err = sctp_wait_for_connect(asoc, &timeo);
> /* Note: the asoc may be freed after the return of
> * sctp_wait_for_connect.
> @@ -1367,9 +1374,17 @@ static int __sctp_setsockopt_connectx(struct sock *sk,
> if (__copy_from_user(kaddrs, addrs, addrs_size)) {
> err = -EFAULT;
> } else {
> + /* Allow security module to validate connectx addresses. */
> + err = security_sctp_bind_connect(sk, SCTP_SOCKOPT_CONNECTX,
> + (struct sockaddr *)kaddrs,
> + addrs_size);
> + if (err)
> + goto out_free;
> +
> err = __sctp_connect(sk, kaddrs, addrs_size, assoc_id);
> }
>
> +out_free:
> kfree(kaddrs);
>
> return err;
> @@ -1636,6 +1651,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
> struct sctp_transport *transport, *chunk_tp;
> struct sctp_chunk *chunk;
> union sctp_addr to;
> + struct sctp_af *af;
> struct sockaddr *msg_name = NULL;
> struct sctp_sndrcvinfo default_sinfo;
> struct sctp_sndrcvinfo *sinfo;
> @@ -1865,6 +1881,24 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
> }
>
> scope = sctp_scope(&to);
> +
> + /* Label connection socket for first association 1-to-many
> + * style for client sequence socket()->sendmsg(). This
> + * needs to be done before sctp_assoc_add_peer() as that will
> + * set up the initial packet that needs to account for any
> + * security ip options (CIPSO/CALIPSO) added to the packet.
> + */
> + af = sctp_get_af_specific(to.sa.sa_family);
> + if (!af) {
> + err = -EINVAL;
> + goto out_unlock;
> + }
> + err = security_sctp_bind_connect(sk, SCTP_SENDMSG_CONNECT,
> + (struct sockaddr *)&to,
> + af->sockaddr_len);
> + if (err < 0)
> + goto out_unlock;
> +
> new_asoc = sctp_association_new(ep, sk, scope, GFP_KERNEL);
> if (!new_asoc) {
> err = -ENOMEM;
> @@ -2904,6 +2938,8 @@ static int sctp_setsockopt_primary_addr(struct sock *sk, char __user *optval,
> {
> struct sctp_prim prim;
> struct sctp_transport *trans;
> + struct sctp_af *af;
> + int err;
>
> if (optlen != sizeof(struct sctp_prim))
> return -EINVAL;
> @@ -2911,6 +2947,17 @@ static int sctp_setsockopt_primary_addr(struct sock *sk, char __user *optval,
> if (copy_from_user(&prim, optval, sizeof(struct sctp_prim)))
> return -EFAULT;
>
> + /* Allow security module to validate address but need address len. */
> + af = sctp_get_af_specific(prim.ssp_addr.ss_family);
> + if (!af)
> + return -EINVAL;
> +
> + err = security_sctp_bind_connect(sk, SCTP_PRIMARY_ADDR,
> + (struct sockaddr *)&prim.ssp_addr,
> + af->sockaddr_len);
> + if (err)
> + return err;
> +
> trans = sctp_addr_id2transport(sk, &prim.ssp_addr, prim.ssp_assoc_id);
> if (!trans)
> return -EINVAL;
> @@ -3233,6 +3280,13 @@ static int sctp_setsockopt_peer_primary_addr(struct sock *sk, char __user *optva
> if (!sctp_assoc_lookup_laddr(asoc, (union sctp_addr *)&prim.sspp_addr))
> return -EADDRNOTAVAIL;
>
> + /* Allow security module to validate address. */
> + err = security_sctp_bind_connect(sk, SCTP_SET_PEER_PRIMARY_ADDR,
> + (struct sockaddr *)&prim.sspp_addr,
> + af->sockaddr_len);
> + if (err)
> + return err;
> +
> /* Create an ASCONF chunk with SET_PRIMARY parameter */
> chunk = sctp_make_asconf_set_prim(asoc,
> (union sctp_addr *)&prim.sspp_addr);
> @@ -8084,6 +8138,8 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk,
> {
> struct inet_sock *inet = inet_sk(sk);
> struct inet_sock *newinet;
> + struct sctp_sock *sp = sctp_sk(sk);
> + struct sctp_endpoint *ep = sp->ep;
>
> newsk->sk_type = sk->sk_type;
> newsk->sk_bound_dev_if = sk->sk_bound_dev_if;
> @@ -8126,7 +8182,10 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk,
> if (newsk->sk_flags & SK_FLAGS_TIMESTAMP)
> net_enable_timestamp();
>
> - security_sk_clone(sk, newsk);
> + /* Set newsk security attributes from original sk and connection
> + * security attribute from ep.
> + */
> + security_sctp_sk_clone(ep, sk, newsk);
> }
>
> static inline void sctp_copy_descendant(struct sock *sk_to,
> --
> 2.14.3
>
* Re: [PATCH V4 2/4] sctp: Add ip option support
From: Marcelo Ricardo Leitner @ 2017-12-30 23:15 UTC (permalink / raw)
To: Richard Haines
Cc: selinux, netdev, linux-sctp, linux-security-module, paul,
vyasevich, nhorman, sds, eparis, casey
In-Reply-To: <20171230171950.15739-1-richard_c_haines@btinternet.com>
On Sat, Dec 30, 2017 at 05:19:50PM +0000, Richard Haines wrote:
> Add ip option support to allow LSM security modules to utilise CIPSO/IPv4
> and CALIPSO/IPv6 services.
>
> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
> include/net/sctp/sctp.h | 4 +++-
> include/net/sctp/structs.h | 2 ++
> net/sctp/chunk.c | 13 ++++++++-----
> net/sctp/ipv6.c | 42 +++++++++++++++++++++++++++++++++++-------
> net/sctp/output.c | 5 ++++-
> net/sctp/protocol.c | 36 ++++++++++++++++++++++++++++++++++++
> net/sctp/socket.c | 9 +++++++--
> 7 files changed, 95 insertions(+), 16 deletions(-)
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index d7d8cba..1b2f40a 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -436,9 +436,11 @@ static inline int sctp_list_single_entry(struct list_head *head)
> static inline int sctp_frag_point(const struct sctp_association *asoc, int pmtu)
> {
> struct sctp_sock *sp = sctp_sk(asoc->base.sk);
> + struct sctp_af *af = sp->pf->af;
> int frag = pmtu;
>
> - frag -= sp->pf->af->net_header_len;
> + frag -= af->ip_options_len(asoc->base.sk);
> + frag -= af->net_header_len;
> frag -= sizeof(struct sctphdr) + sizeof(struct sctp_data_chunk);
>
> if (asoc->user_frag)
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 0477945..9942ed5 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -461,6 +461,7 @@ struct sctp_af {
> void (*ecn_capable)(struct sock *sk);
> __u16 net_header_len;
> int sockaddr_len;
> + int (*ip_options_len)(struct sock *sk);
> sa_family_t sa_family;
> struct list_head list;
> };
> @@ -485,6 +486,7 @@ struct sctp_pf {
> int (*addr_to_user)(struct sctp_sock *sk, union sctp_addr *addr);
> void (*to_sk_saddr)(union sctp_addr *, struct sock *sk);
> void (*to_sk_daddr)(union sctp_addr *, struct sock *sk);
> + void (*copy_ip_options)(struct sock *sk, struct sock *newsk);
> struct sctp_af *af;
> };
>
> diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
> index 3afac27..9d130f4 100644
> --- a/net/sctp/chunk.c
> +++ b/net/sctp/chunk.c
> @@ -153,7 +153,6 @@ static void sctp_datamsg_assign(struct sctp_datamsg *msg, struct sctp_chunk *chu
> chunk->msg = msg;
> }
>
> -
> /* A data chunk can have a maximum payload of (2^16 - 20). Break
> * down any such message into smaller chunks. Opportunistically, fragment
> * the chunks down to the current MTU constraints. We may get refragmented
> @@ -170,6 +169,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> struct list_head *pos, *temp;
> struct sctp_chunk *chunk;
> struct sctp_datamsg *msg;
> + struct sctp_sock *sp;
> + struct sctp_af *af;
> int err;
>
> msg = sctp_datamsg_new(GFP_KERNEL);
> @@ -188,9 +189,12 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> /* This is the biggest possible DATA chunk that can fit into
> * the packet
> */
> - max_data = asoc->pathmtu -
> - sctp_sk(asoc->base.sk)->pf->af->net_header_len -
> - sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
> + sp = sctp_sk(asoc->base.sk);
> + af = sp->pf->af;
> + max_data = asoc->pathmtu - af->net_header_len -
> + sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk) -
> + af->ip_options_len(asoc->base.sk);
> +
> max_data = SCTP_TRUNC4(max_data);
>
> /* If the the peer requested that we authenticate DATA chunks
> @@ -210,7 +214,6 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>
> /* Set first_len and then account for possible bundles on first frag */
> first_len = max_data;
> -
> /* Check to see if we have a pending SACK and try to let it be bundled
> * with this message. Do this if we don't have any data queued already.
> * To check that, look at out_qlen and retransmit list.
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index 3b18085..b06dc81 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -423,6 +423,38 @@ static void sctp_v6_copy_addrlist(struct list_head *addrlist,
> rcu_read_unlock();
> }
>
> +/* Copy over any ip options */
> +static void sctp_v6_copy_ip_options(struct sock *sk, struct sock *newsk)
> +{
> + struct ipv6_pinfo *newnp, *np = inet6_sk(sk);
> + struct ipv6_txoptions *opt;
> +
> + newnp = inet6_sk(newsk);
> +
> + rcu_read_lock();
> + opt = rcu_dereference(np->opt);
> + if (opt)
> + opt = ipv6_dup_options(newsk, opt);
> + RCU_INIT_POINTER(newnp->opt, opt);
> + rcu_read_unlock();
> +}
> +
> +/* Account for the IP options */
> +static int sctp_v6_ip_options_len(struct sock *sk)
> +{
> + struct ipv6_pinfo *np = inet6_sk(sk);
> + struct ipv6_txoptions *opt;
> + int len = 0;
> +
> + rcu_read_lock();
> + opt = rcu_dereference(np->opt);
> + if (opt)
> + len = opt->opt_flen + opt->opt_nflen;
> +
> + rcu_read_unlock();
> + return len;
> +}
> +
> /* Initialize a sockaddr_storage from in incoming skb. */
> static void sctp_v6_from_skb(union sctp_addr *addr, struct sk_buff *skb,
> int is_saddr)
> @@ -662,7 +694,6 @@ static struct sock *sctp_v6_create_accept_sk(struct sock *sk,
> struct sock *newsk;
> struct ipv6_pinfo *newnp, *np = inet6_sk(sk);
> struct sctp6_sock *newsctp6sk;
> - struct ipv6_txoptions *opt;
>
> newsk = sk_alloc(sock_net(sk), PF_INET6, GFP_KERNEL, sk->sk_prot, kern);
> if (!newsk)
> @@ -685,12 +716,7 @@ static struct sock *sctp_v6_create_accept_sk(struct sock *sk,
> newnp->ipv6_ac_list = NULL;
> newnp->ipv6_fl_list = NULL;
>
> - rcu_read_lock();
> - opt = rcu_dereference(np->opt);
> - if (opt)
> - opt = ipv6_dup_options(newsk, opt);
> - RCU_INIT_POINTER(newnp->opt, opt);
> - rcu_read_unlock();
> + sctp_v6_copy_ip_options(sk, newsk);
>
> /* Initialize sk's sport, dport, rcv_saddr and daddr for getsockname()
> * and getpeername().
> @@ -1036,6 +1062,7 @@ static struct sctp_af sctp_af_inet6 = {
> .ecn_capable = sctp_v6_ecn_capable,
> .net_header_len = sizeof(struct ipv6hdr),
> .sockaddr_len = sizeof(struct sockaddr_in6),
> + .ip_options_len = sctp_v6_ip_options_len,
> #ifdef CONFIG_COMPAT
> .compat_setsockopt = compat_ipv6_setsockopt,
> .compat_getsockopt = compat_ipv6_getsockopt,
> @@ -1054,6 +1081,7 @@ static struct sctp_pf sctp_pf_inet6 = {
> .addr_to_user = sctp_v6_addr_to_user,
> .to_sk_saddr = sctp_v6_to_sk_saddr,
> .to_sk_daddr = sctp_v6_to_sk_daddr,
> + .copy_ip_options = sctp_v6_copy_ip_options,
> .af = &sctp_af_inet6,
> };
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 4a865cd..2b39c70 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -151,7 +151,10 @@ void sctp_packet_init(struct sctp_packet *packet,
> INIT_LIST_HEAD(&packet->chunk_list);
> if (asoc) {
> struct sctp_sock *sp = sctp_sk(asoc->base.sk);
> - overhead = sp->pf->af->net_header_len;
> + struct sctp_af *af = sp->pf->af;
> +
> + overhead = af->net_header_len +
> + af->ip_options_len(asoc->base.sk);
> } else {
> overhead = sizeof(struct ipv6hdr);
> }
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index fcd80fe..cde051a 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -237,6 +237,38 @@ int sctp_copy_local_addr_list(struct net *net, struct sctp_bind_addr *bp,
> return error;
> }
>
> +/* Copy over any ip options */
> +static void sctp_v4_copy_ip_options(struct sock *sk, struct sock *newsk)
> +{
> + struct inet_sock *newinet, *inet = inet_sk(sk);
> + struct ip_options_rcu *inet_opt, *newopt = NULL;
> +
> + newinet = inet_sk(newsk);
> +
> + rcu_read_lock();
> + inet_opt = rcu_dereference(inet->inet_opt);
> + if (inet_opt) {
> + newopt = sock_kmalloc(newsk, sizeof(*inet_opt) +
> + inet_opt->opt.optlen, GFP_ATOMIC);
> + if (newopt)
> + memcpy(newopt, inet_opt, sizeof(*inet_opt) +
> + inet_opt->opt.optlen);
> + }
> + RCU_INIT_POINTER(newinet->inet_opt, newopt);
> + rcu_read_unlock();
> +}
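
The copy above duplicates a fixed header plus optlen trailing bytes in a single allocation. A small userspace sketch of that idiom, assuming a hypothetical struct demo_opts with a flexible array member, and malloc() in place of sock_kmalloc():

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: a header followed by optlen bytes of data. */
struct demo_opts {
	unsigned char optlen;
	unsigned char data[];	/* flexible array member */
};

/* Returns a heap copy of @src (header plus data), or NULL if @src is
 * NULL or the allocation fails. The caller releases it with free().
 */
static struct demo_opts *demo_opts_dup(const struct demo_opts *src)
{
	struct demo_opts *copy;
	size_t size;

	if (!src)
		return NULL;

	/* One allocation covers the header and the variable part. */
	size = sizeof(*src) + src->optlen;
	copy = malloc(size);
	if (copy)
		memcpy(copy, src, size);

	return copy;
}
```

As in the hunk above, a failed allocation simply yields NULL, so the new object ends up with no options rather than a partial copy.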
> +
> +/* Account for the IP options */
> +static int sctp_v4_ip_options_len(struct sock *sk)
> +{
> + struct inet_sock *inet = inet_sk(sk);
> +
> + if (inet->inet_opt)
> + return inet->inet_opt->opt.optlen;
> + else
> + return 0;
> +}
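
To see what the extra ip_options_len() term does to the payload size computed in sctp_datamsg_from_user(), here is a worked sketch for IPv4. The constants are my understanding of the relevant structure sizes (20-byte struct iphdr, 12-byte struct sctphdr, 16-byte struct sctp_data_chunk) and the demo_* names are hypothetical; the 12-byte option length used in the note below is only an example value.

```c
/* Assumed structure sizes, see the note above. */
#define DEMO_IPV4_HDR_LEN	20	/* sizeof(struct iphdr) */
#define DEMO_SCTP_HDR_LEN	12	/* sizeof(struct sctphdr) */
#define DEMO_DATA_CHUNK_HDR_LEN	16	/* sizeof(struct sctp_data_chunk) */

/* Round down to a multiple of 4, mirroring SCTP_TRUNC4(). */
static int demo_trunc4(int v)
{
	return v & ~3;
}

/* Largest DATA chunk payload that fits in one packet for a given
 * path MTU and IP options length.
 */
static int demo_max_data(int pathmtu, int ip_options_len)
{
	int max_data = pathmtu - DEMO_IPV4_HDR_LEN - DEMO_SCTP_HDR_LEN -
		       DEMO_DATA_CHUNK_HDR_LEN - ip_options_len;

	return demo_trunc4(max_data);
}
```

With a 1500-byte path MTU this gives 1452 bytes without options and 1440 bytes with 12 bytes of options, so ignoring the options would overshoot the MTU by exactly the option length.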
> +
> /* Initialize a sctp_addr from in incoming skb. */
> static void sctp_v4_from_skb(union sctp_addr *addr, struct sk_buff *skb,
> int is_saddr)
> @@ -590,6 +622,8 @@ static struct sock *sctp_v4_create_accept_sk(struct sock *sk,
> sctp_copy_sock(newsk, sk, asoc);
> sock_reset_flag(newsk, SOCK_ZAPPED);
>
> + sctp_v4_copy_ip_options(sk, newsk);
> +
> newinet = inet_sk(newsk);
>
> newinet->inet_daddr = asoc->peer.primary_addr.v4.sin_addr.s_addr;
> @@ -1008,6 +1042,7 @@ static struct sctp_pf sctp_pf_inet = {
> .addr_to_user = sctp_v4_addr_to_user,
> .to_sk_saddr = sctp_v4_to_sk_saddr,
> .to_sk_daddr = sctp_v4_to_sk_daddr,
> + .copy_ip_options = sctp_v4_copy_ip_options,
> .af = &sctp_af_inet
> };
>
> @@ -1092,6 +1127,7 @@ static struct sctp_af sctp_af_inet = {
> .ecn_capable = sctp_v4_ecn_capable,
> .net_header_len = sizeof(struct iphdr),
> .sockaddr_len = sizeof(struct sockaddr_in),
> + .ip_options_len = sctp_v4_ip_options_len,
> #ifdef CONFIG_COMPAT
> .compat_setsockopt = compat_ip_setsockopt,
> .compat_getsockopt = compat_ip_getsockopt,
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index d6163f7..4373e2a 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -3162,8 +3162,11 @@ static int sctp_setsockopt_maxseg(struct sock *sk, char __user *optval, unsigned
>
> if (asoc) {
> if (val == 0) {
> + struct sctp_af *af = sp->pf->af;
> +
> val = asoc->pathmtu;
> - val -= sp->pf->af->net_header_len;
> + val -= af->ip_options_len(asoc->base.sk);
> + val -= af->net_header_len;
> val -= sizeof(struct sctphdr) +
> sizeof(struct sctp_data_chunk);
> }
> @@ -4964,9 +4967,11 @@ int sctp_do_peeloff(struct sock *sk, sctp_assoc_t id, struct socket **sockp)
> sctp_copy_sock(sock->sk, sk, asoc);
>
> /* Make peeled-off sockets more like 1-1 accepted sockets.
> - * Set the daddr and initialize id to something more random
> + * Set the daddr and initialize id to something more random and also
> + * copy over any ip options.
> */
> sp->pf->to_sk_daddr(&asoc->peer.primary_addr, sk);
> + sp->pf->copy_ip_options(sk, sock->sk);
>
> /* Populate the fields of the newsk from the oldsk and migrate the
> * asoc to the newsk.
> --
> 2.14.3
>
* Re: [PATCH V4 1/4] security: Add support for SCTP security hooks
From: Marcelo Ricardo Leitner @ 2017-12-30 23:15 UTC (permalink / raw)
To: Richard Haines
Cc: selinux, netdev, linux-sctp, linux-security-module, paul,
vyasevich, nhorman, sds, eparis, casey
In-Reply-To: <20171230171926.15690-1-richard_c_haines@btinternet.com>
On Sat, Dec 30, 2017 at 05:19:26PM +0000, Richard Haines wrote:
> The SCTP security hooks are explained in:
> Documentation/security/LSM-sctp.rst
>
> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
> Documentation/security/LSM-sctp.rst | 175 ++++++++++++++++++++++++++++++++++++
> include/linux/lsm_hooks.h | 36 ++++++++
> include/linux/security.h | 25 ++++++
> security/security.c | 22 +++++
> 4 files changed, 258 insertions(+)
> create mode 100644 Documentation/security/LSM-sctp.rst
>
> diff --git a/Documentation/security/LSM-sctp.rst b/Documentation/security/LSM-sctp.rst
> new file mode 100644
> index 0000000..6e5a392
> --- /dev/null
> +++ b/Documentation/security/LSM-sctp.rst
> @@ -0,0 +1,175 @@
> +SCTP LSM Support
> +================
> +
> +For security module support, three SCTP specific hooks have been implemented::
> +
> + security_sctp_assoc_request()
> + security_sctp_bind_connect()
> + security_sctp_sk_clone()
> +
> +Also the following security hook has been utilised::
> +
> + security_inet_conn_established()
> +
> +The usage of these hooks is described below, with the SELinux implementation
> +described in ``Documentation/security/SELinux-sctp.rst``.
> +
> +
> +security_sctp_assoc_request()
> +-----------------------------
> +Passes the ``@ep`` and ``@chunk->skb`` of the association INIT packet to the
> +security module. Returns 0 on success, error on failure.
> +::
> +
> + @ep - pointer to sctp endpoint structure.
> + @skb - pointer to skbuff of association packet.
> +
> +
> +security_sctp_bind_connect()
> +-----------------------------
> +Passes one or more ipv4/ipv6 addresses to the security module for validation
> +based on the ``@optname`` that will result in either a bind or connect
> +service as shown in the permission check tables below.
> +Returns 0 on success, error on failure.
> +::
> +
> + @sk - Pointer to sock structure.
> + @optname - Name of the option to validate.
> + @address - One or more ipv4 / ipv6 addresses.
> + @addrlen - The total length of address(s). This is calculated on each
> + ipv4 or ipv6 address using sizeof(struct sockaddr_in) or
> + sizeof(struct sockaddr_in6).
> +
> + ------------------------------------------------------------------
> + | BIND Type Checks |
> + | @optname | @address contains |
> + |----------------------------|-----------------------------------|
> + | SCTP_SOCKOPT_BINDX_ADD | One or more ipv4 / ipv6 addresses |
> + | SCTP_PRIMARY_ADDR | Single ipv4 or ipv6 address |
> + | SCTP_SET_PEER_PRIMARY_ADDR | Single ipv4 or ipv6 address |
> + ------------------------------------------------------------------
> +
> + ------------------------------------------------------------------
> + | CONNECT Type Checks |
> + | @optname | @address contains |
> + |----------------------------|-----------------------------------|
> + | SCTP_SOCKOPT_CONNECTX | One or more ipv4 / ipv6 addresses |
> + | SCTP_PARAM_ADD_IP | One or more ipv4 / ipv6 addresses |
> + | SCTP_SENDMSG_CONNECT | Single ipv4 or ipv6 address |
> + | SCTP_PARAM_SET_PRIMARY | Single ipv4 or ipv6 address |
> + ------------------------------------------------------------------
> +
> +A summary of the ``@optname`` entries is as follows::
> +
> + SCTP_SOCKOPT_BINDX_ADD - Allows additional bind addresses to be
> + associated after (optionally) calling
> + bind(3).
> + sctp_bindx(3) adds a set of bind
> + addresses on a socket.
> +
> + SCTP_SOCKOPT_CONNECTX - Allows the allocation of multiple
> + addresses for reaching a peer
> + (multi-homed).
> + sctp_connectx(3) initiates a connection
> + on an SCTP socket using multiple
> + destination addresses.
> +
> + SCTP_SENDMSG_CONNECT - Initiate a connection that is generated by a
> + sendmsg(2) or sctp_sendmsg(3) on a new association.
> +
> + SCTP_PRIMARY_ADDR - Set local primary address.
> +
> + SCTP_SET_PEER_PRIMARY_ADDR - Request peer sets address as
> + association primary.
> +
> + SCTP_PARAM_ADD_IP - These are used when Dynamic Address
> + SCTP_PARAM_SET_PRIMARY - Reconfiguration is enabled as explained below.
> +
> +
> +To support Dynamic Address Reconfiguration the following parameters must be
> +enabled on both endpoints (or use the appropriate **setsockopt**\(2))::
> +
> + /proc/sys/net/sctp/addip_enable
> + /proc/sys/net/sctp/addip_noauth_enable
> +
> +then the following *_PARAM_*'s are sent to the peer in an
> +ASCONF chunk when the corresponding ``@optname``'s are present::
> +
> + @optname ASCONF Parameter
> + ---------- ------------------
> + SCTP_SOCKOPT_BINDX_ADD -> SCTP_PARAM_ADD_IP
> + SCTP_SET_PEER_PRIMARY_ADDR -> SCTP_PARAM_SET_PRIMARY
> +
> +
> +security_sctp_sk_clone()
> +-------------------------
> +Called whenever a new socket is created by **accept**\(2)
> +(i.e. a TCP style socket) or when a socket is 'peeled off' e.g. userspace
> +calls **sctp_peeloff**\(3).
> +::
> +
> + @ep - pointer to current sctp endpoint structure.
> + @sk - pointer to current sock structure.
> + @newsk - pointer to new sock structure.
> +
> +
> +security_inet_conn_established()
> +---------------------------------
> +Called when a COOKIE ACK is received::
> +
> + @sk - pointer to sock structure.
> + @skb - pointer to skbuff of the COOKIE ACK packet.
> +
> +
> +Security Hooks used for Association Establishment
> +=================================================
> +The following diagram shows the use of ``security_sctp_bind_connect()``,
> +``security_sctp_assoc_request()``, ``security_inet_conn_established()`` when
> +establishing an association.
> +::
> +
> + SCTP endpoint "A" SCTP endpoint "Z"
> + ================= =================
> + sctp_sf_do_prm_asoc()
> + Association setup can be initiated
> + by a connect(2), sctp_connectx(3),
> + sendmsg(2) or sctp_sendmsg(3).
> + These will result in a call to
> + security_sctp_bind_connect() to
> + initiate an association to
> + SCTP peer endpoint "Z".
> + INIT --------------------------------------------->
> + sctp_sf_do_5_1B_init()
> + Respond to an INIT chunk.
> + SCTP peer endpoint "A" is
> + asking for an association. Call
> + security_sctp_assoc_request()
> + to set the peer label if first
> + association.
> + If not first association, check
> + whether allowed, IF so send:
> + <----------------------------------------------- INIT ACK
> + | ELSE audit event and silently
> + | discard the packet.
> + |
> + COOKIE ECHO ------------------------------------------>
> + |
> + |
> + |
> + <------------------------------------------- COOKIE ACK
> + | |
> + sctp_sf_do_5_1E_ca |
> + Call security_inet_conn_established() |
> + to set the peer label. |
> + | |
> + | If SCTP_SOCKET_TCP or peeled off
> + | socket security_sctp_sk_clone() is
> + | called to clone the new socket.
> + | |
> + ESTABLISHED ESTABLISHED
> + | |
> + ------------------------------------------------------------------
> + | Association Established |
> + ------------------------------------------------------------------
> +
> +
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index c925812..647e700 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -906,6 +906,33 @@
> * associated with the TUN device's security structure.
> * @security pointer to the TUN devices's security structure.
> *
> + * Security hooks for SCTP
> + *
> + * @sctp_assoc_request:
> + * Passes the @ep and @chunk->skb of the association INIT packet to
> + * the security module.
> + * @ep pointer to sctp endpoint structure.
> + * @skb pointer to skbuff of association packet.
> + * Return 0 on success, error on failure.
> + * @sctp_bind_connect:
> + * Validate permissions required for each address associated with sock
> + * @sk. Depending on @optname, the addresses will be treated as either
> + * for a connect or bind service. The @addrlen is calculated on each
> + * ipv4 and ipv6 address using sizeof(struct sockaddr_in) or
> + * sizeof(struct sockaddr_in6).
> + * @sk pointer to sock structure.
> + * @optname name of the option to validate.
> + * @address list containing one or more ipv4/ipv6 addresses.
> + * @addrlen total length of address(s).
> + * Return 0 on success, error on failure.
> + * @sctp_sk_clone:
> + * Called whenever a new socket is created by accept(2) (i.e. a TCP
> + * style socket) or when a socket is 'peeled off' e.g. userspace
> + * calls sctp_peeloff(3).
> + * @ep pointer to current sctp endpoint structure.
> + * @sk pointer to current sock structure.
> + * @newsk pointer to new sock structure.
> + *
> * Security hooks for Infiniband
> *
> * @ib_pkey_access:
> @@ -1631,6 +1658,12 @@ union security_list_options {
> int (*tun_dev_attach_queue)(void *security);
> int (*tun_dev_attach)(struct sock *sk, void *security);
> int (*tun_dev_open)(void *security);
> + int (*sctp_assoc_request)(struct sctp_endpoint *ep,
> + struct sk_buff *skb);
> + int (*sctp_bind_connect)(struct sock *sk, int optname,
> + struct sockaddr *address, int addrlen);
> + void (*sctp_sk_clone)(struct sctp_endpoint *ep, struct sock *sk,
> + struct sock *newsk);
> #endif /* CONFIG_SECURITY_NETWORK */
>
> #ifdef CONFIG_SECURITY_INFINIBAND
> @@ -1869,6 +1902,9 @@ struct security_hook_heads {
> struct list_head tun_dev_attach_queue;
> struct list_head tun_dev_attach;
> struct list_head tun_dev_open;
> + struct list_head sctp_assoc_request;
> + struct list_head sctp_bind_connect;
> + struct list_head sctp_sk_clone;
> #endif /* CONFIG_SECURITY_NETWORK */
> #ifdef CONFIG_SECURITY_INFINIBAND
> struct list_head ib_pkey_access;
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 3107754..2e5ec5c 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -115,6 +115,7 @@ struct xfrm_policy;
> struct xfrm_state;
> struct xfrm_user_sec_ctx;
> struct seq_file;
> +struct sctp_endpoint;
>
> #ifdef CONFIG_MMU
> extern unsigned long mmap_min_addr;
> @@ -1229,6 +1230,11 @@ int security_tun_dev_create(void);
> int security_tun_dev_attach_queue(void *security);
> int security_tun_dev_attach(struct sock *sk, void *security);
> int security_tun_dev_open(void *security);
> +int security_sctp_assoc_request(struct sctp_endpoint *ep, struct sk_buff *skb);
> +int security_sctp_bind_connect(struct sock *sk, int optname,
> + struct sockaddr *address, int addrlen);
> +void security_sctp_sk_clone(struct sctp_endpoint *ep, struct sock *sk,
> + struct sock *newsk);
>
> #else /* CONFIG_SECURITY_NETWORK */
> static inline int security_unix_stream_connect(struct sock *sock,
> @@ -1421,6 +1427,25 @@ static inline int security_tun_dev_open(void *security)
> {
> return 0;
> }
> +
> +static inline int security_sctp_assoc_request(struct sctp_endpoint *ep,
> + struct sk_buff *skb)
> +{
> + return 0;
> +}
> +
> +static inline int security_sctp_bind_connect(struct sock *sk, int optname,
> + struct sockaddr *address,
> + int addrlen)
> +{
> + return 0;
> +}
> +
> +static inline void security_sctp_sk_clone(struct sctp_endpoint *ep,
> + struct sock *sk,
> + struct sock *newsk)
> +{
> +}
> #endif /* CONFIG_SECURITY_NETWORK */
>
> #ifdef CONFIG_SECURITY_INFINIBAND
> diff --git a/security/security.c b/security/security.c
> index 4bf0f57..1400678 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -1472,6 +1472,7 @@ void security_inet_conn_established(struct sock *sk,
> {
> call_void_hook(inet_conn_established, sk, skb);
> }
> +EXPORT_SYMBOL(security_inet_conn_established);
>
> int security_secmark_relabel_packet(u32 secid)
> {
> @@ -1527,6 +1528,27 @@ int security_tun_dev_open(void *security)
> }
> EXPORT_SYMBOL(security_tun_dev_open);
>
> +int security_sctp_assoc_request(struct sctp_endpoint *ep, struct sk_buff *skb)
> +{
> + return call_int_hook(sctp_assoc_request, 0, ep, skb);
> +}
> +EXPORT_SYMBOL(security_sctp_assoc_request);
> +
> +int security_sctp_bind_connect(struct sock *sk, int optname,
> + struct sockaddr *address, int addrlen)
> +{
> + return call_int_hook(sctp_bind_connect, 0, sk, optname,
> + address, addrlen);
> +}
> +EXPORT_SYMBOL(security_sctp_bind_connect);
> +
> +void security_sctp_sk_clone(struct sctp_endpoint *ep, struct sock *sk,
> + struct sock *newsk)
> +{
> + call_void_hook(sctp_sk_clone, ep, sk, newsk);
> +}
> +EXPORT_SYMBOL(security_sctp_sk_clone);
> +
> #endif /* CONFIG_SECURITY_NETWORK */
>
> #ifdef CONFIG_SECURITY_INFINIBAND
> --
> 2.14.3
>
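To illustrate how @addrlen relates to a packed address list as described
in the documentation above, here is a minimal userspace sketch. The helper
name packed_addrs_len() is hypothetical and not part of the patch; it only
assumes that the list holds sockaddr_in and sockaddr_in6 entries back to
back, and it sums sizeof(struct sockaddr_in) or sizeof(struct sockaddr_in6)
per entry.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper (not part of the patch): walk a packed list of
 * `count` sockaddr_in / sockaddr_in6 entries and return the total length
 * in bytes, or -1 if an unknown address family is found.
 */
static long packed_addrs_len(const void *addrs, int count)
{
	const unsigned char *p = addrs;
	long total = 0;

	for (int i = 0; i < count; i++) {
		sa_family_t family;

		/* Entries may be unaligned inside the packed buffer, so copy
		 * the family field out instead of dereferencing a cast.
		 */
		memcpy(&family,
		       p + total + offsetof(struct sockaddr, sa_family),
		       sizeof(family));

		if (family == AF_INET)
			total += sizeof(struct sockaddr_in);
		else if (family == AF_INET6)
			total += sizeof(struct sockaddr_in6);
		else
			return -1;
	}

	return total;
}
```

A security module receiving @address and @addrlen could walk the list the
same way, one family-sized step at a time, until @addrlen is consumed.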
^ permalink raw reply
* Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint
From: Brendan Gregg @ 2017-12-30 22:33 UTC (permalink / raw)
To: Yafang Shao
Cc: songliubraving, David S. Miller, marcelo.leitner, Steven Rostedt,
Brendan Gregg, netdev, LKML
In-Reply-To: <1513739574-3345-3-git-send-email-laoar.shao@gmail.com>
On Tue, Dec 19, 2017 at 7:12 PM, Yafang Shao <laoar.shao@gmail.com> wrote:
> As sk_state is a common field of struct sock, the state
> transition tracepoint should not be a TCP-specific feature.
> Currently it traces all AF_INET state transitions, so I rename this
> tracepoint to inet_sock_set_state, with some minor changes, and move it
> into trace/events/sock.h.
The tcp:tcp_set_state probe is tcp_set_state(), so it's only going to
fire for TCP sessions. It's not broken, and we could add a
sctp:sctp_set_state as well. Replacing tcp:tcp_set_state with
inet_sk_set_state is feeling like we might be baking too much
implementation detail into the tracepoint API.
If we must have inet_sk_set_state, then must we also delete tcp:tcp_set_state?
Brendan
> We don't need to create a file named trace/events/inet_sock.h for this single
> tracepoint.
>
> Two helpers are introduced to trace sk_state transition
> - void inet_sk_state_store(struct sock *sk, int newstate);
> - void inet_sk_set_state(struct sock *sk, int state);
> As trace headers should not be included in other header files,
> they are defined in sock.c.
>
> A protocol such as SCTP may be compiled as a module (.ko), hence export
> inet_sk_set_state().
>[...]
^ permalink raw reply
* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
From: Sowmini Varadhan @ 2017-12-30 22:32 UTC (permalink / raw)
To: santosh.shilimkar@oracle.com; +Cc: netdev, davem
In-Reply-To: <27c5708f-5d56-b2bd-c9b8-82a3e5f728f9@oracle.com>
On (12/30/17 13:37), santosh.shilimkar@oracle.com wrote:
> Well, that's what the report says; otherwise the flag test wouldn't have
> been attempted.
The bug report says "use-after-free".
It doesn't say that rds_rs_to_sk(rs) is null (if rds_rs_to_sk(rs) were null,
rs would also be null; please cscope struct rds_sock).
What the bug report says is
" The buggy address belongs to the object at ffff8801c09a6080
which belongs to the cache RDS of size 1472
The buggy address is located 96 bytes inside of .."
96 is the offset of sk->sk_flags, so yes, there is a socket refcount
issue.
But the patch you sent (see next two lines) will not solve that.
> >>- if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
> >>+ if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
Sowmini>I think the real issue is refcount bug somewhere,
> That's what I thought as well initially, but in the reported case
> the rs seems to be valid whereas the sk seems to have been freed as part of
> the sock_release callback.
I don't understand the statement above: how can "rs be valid, and sk
be freed"?
rs_sk is embedded in struct rds_sock; it is not a pointer.
Let's find and fix the refcount bug. See stack trace in commit comment.
The socket release is happening prematurely and existing WARN_ONs
are not catching it.
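A minimal userspace sketch of the layout argument above, assuming only
what the thread states: rs_sk is the first member of struct rds_sock and
rds_rs_to_sk() returns &rs->rs_sk. The struct bodies are simplified
stand-ins, and check_original() / check_patched() are hypothetical names
that reduce the two versions of the condition to their pointer tests.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the layout matters. */
struct sock {
	unsigned long sk_flags;
};

struct rds_sock {
	struct sock rs_sk;	/* embedded by value, not a pointer */
	int rs_other;		/* placeholder for the remaining fields */
};

/* As described in the thread: return the address of the embedded member. */
static struct sock *rds_rs_to_sk(struct rds_sock *rs)
{
	return &rs->rs_sk;
}

/* Pointer test of the original condition. */
static int check_original(struct rds_sock *rs)
{
	return rs != NULL;
}

/* Pointer tests of the patched condition; the second test is only
 * reached when rs is non-NULL, and then it cannot fail.
 */
static int check_patched(struct rds_sock *rs)
{
	return rs != NULL && rds_rs_to_sk(rs) != NULL;
}
```

Both checks return the same result for every input, so the extra test
cannot detect a socket that has already been freed.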
> >Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/
> >this sounds like that type of bug.
--Sowmini
^ permalink raw reply
* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
From: santosh.shilimkar @ 2017-12-30 21:37 UTC (permalink / raw)
To: Sowmini Varadhan; +Cc: netdev, davem
In-Reply-To: <20171230202631.GB27855@oracle.com>
On 12/30/17 12:26 PM, Sowmini Varadhan wrote:
> On (12/30/17 11:36), Santosh Shilimkar wrote:
>>
>> socket buffer can get freed as part of sock_close
>> callback so before adding reference check underneath
>> socket validity.
>
> I'm not sure I understand this fix-
>
> struct rds_sock is:
> struct rds_sock {
> struct sock rs_sk;
> :
> }
>
> How can rs be non-null but rds_rs_to_sk() is null? (Note that
> rds_rs_to_sk just returns &rs->rs_sk) so the changed line is
> identical to the original line.
>
Well, that's what the report says; otherwise the flag test wouldn't have
been attempted.
>> - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
>> + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
>
> I think the real issue is refcount bug somewhere,
>
That's what I thought as well initially, but in the reported case
the rs seems to be valid whereas the sk seems to have been freed as part of
the sock_release callback.
> Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/
> this sounds like that type of bug.
>
In the scenario that fix addresses, the rs doesn't get inserted into the
hash table, whereas in this particular bug the lookup was successful, so I
am not sure these two bugs are related.
But since the bound-address fix was not yet part of the build that
reproduced the use-after-free bug, the $subject fix can wait for the next
reproduction. Unfortunately, as per the report, there is no
reproducer with which to test whether the other fix resolves this issue.
Regards,
Santosh
^ permalink raw reply
* [PATCH] wan/fsl_ucc_hdlc: Delete an error message for a failed memory allocation in ucc_hdlc_probe()
From: SF Markus Elfring @ 2017-12-30 21:30 UTC (permalink / raw)
To: netdev, linuxppc-dev, Zhao Qiang; +Cc: LKML, kernel-janitors
From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sat, 30 Dec 2017 22:25:44 +0100
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
drivers/net/wan/fsl_ucc_hdlc.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index 33df76405b86..98f8be206bae 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -1082,7 +1082,6 @@ static int ucc_hdlc_probe(struct platform_device *pdev)
utdm = kzalloc(sizeof(*utdm), GFP_KERNEL);
if (!utdm) {
ret = -ENOMEM;
- dev_err(&pdev->dev, "No mem to alloc ucc tdm data\n");
goto free_uhdlc_priv;
}
uhdlc_priv->utdm = utdm;
--
2.15.1
^ permalink raw reply related
* Re: [patch net-next v2 00/10] Add support for resource abstraction
From: David Ahern @ 2017-12-30 21:15 UTC (permalink / raw)
To: Yuval Mintz, Roopa Prabhu, Jiri Pirko, Arkadi Sharshevsky
Cc: netdev@vger.kernel.org, David Miller, mlxsw, Andrew Lunn,
Vivien Didelot, Florian Fainelli, Michael Chan,
ganeshgr@chelsio.com, Saeed Mahameed, Matan Barak,
Leon Romanovsky, Ido Schimmel, jakub.kicinski@netronome.com,
ast@kernel.org, Daniel Borkmann, Simon Horman,
pieter.jansenvanvuuren@netronome.com, "john.hurley@netronome
In-Reply-To: <VI1PR05MB3520A850FFF069C1BE8CF19ABF040@VI1PR05MB3520.eurprd05.prod.outlook.com>
On 12/28/17 1:21 AM, Yuval Mintz wrote:
> I think it goes the other way around. The dpipe tables are the ones that
> can be translated to functionality; The resources are internal and HW-specific
> representing the possible internal division of resources -
> but a given resource isn't necessarily mapped to a single networking feature.
> [It might be in some cases, but not in the general case]
This is what I am getting at -- a single resource /kvd/linear is used
for multiple networking features, and those networking features do map
to well known entities -- fdb entries, ACL entries, ipv4/v6 host
entries, LPM entries, etc.
Nothing about the output from devlink helps the user in any way to
understand how to change the resource values. Saying that these
resources, what they mean and how they are used is MLX proprietary and
is known only to MLX employees and those with MLX agreements is not
acceptable. Likewise, requiring some network admin to deep dive into the
mlxsw driver to piece together how kvd/linear (for example) is used is
not acceptable.
The cover letter touts "Many of the ASIC's internal resources are
limited and are shared between several hardware procedures. For example,
unified hash-based memory can be used for many lookup purposes, like FDB
and LPM. In many cases the user can provide a partitioning scheme for
such a resource in order to perform fine tuning for his application."
Great, now give the user some indication of how to do that. Is setting
/kvd/linear to 0 acceptable? If not, why? What functionality is lost?
(Apparently, everything [1].)
The dpipe tables list some correlation between the kvd resources and
tables but that is not a complete list and again there is nothing to
tell a user that it is only a partial list of how a kvd resource is
used. For example, it shows ipv4 host is in /kvd/hash_single and that is
all it shows. So if I have an ipv6 only deployment can I conclude that I
can set /kvd/hash_single to 0? Or the reverse, can I set hash_double to
0 for an ipv4 only deployment? From the limited information given, it is
reasonable for a user to assume yes and has to learn through trial and
error what can be done. [2]
-----
[1] This is allowed by the current patch set and perhaps it should not be:
$ ip ro ls vrf vrf1101
unreachable default metric 8192
11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
$ devlink resource set pci/0000:03:00.0 path /kvd/linear size 0
$ devlink reload pci/0000:03:00.0
$ ip ro ls vrf vrf1101
unreachable default metric 8192
[2] Same exact result for setting hash_double to 0:
$ ip ro ls vrf vrf1101
unreachable default metric 8192
11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
$ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 0
$ devlink reload pci/0000:03:00.0
$ ip ro ls vrf vrf1101
unreachable default metric 8192
^ permalink raw reply
* [PATCH 2/2] at76c50x-usb: Improve size determinations in at76_usbdfu_download()
From: SF Markus Elfring @ 2017-12-30 21:08 UTC (permalink / raw)
To: linux-wireless, netdev, Andrew Zaborowski, Arvind Yadav,
Geert Uytterhoeven, Kalle Valo, Kees Cook
Cc: LKML, kernel-janitors
In-Reply-To: <1d76fe59-af15-ba4d-e05d-09dcd9ee38cd@users.sourceforge.net>
From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sat, 30 Dec 2017 21:56:56 +0100
Replace the specification of two data types by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
drivers/net/wireless/atmel/at76c50x-usb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/atmel/at76c50x-usb.c b/drivers/net/wireless/atmel/at76c50x-usb.c
index 2893d339b440..6144d4a258ca 100644
--- a/drivers/net/wireless/atmel/at76c50x-usb.c
+++ b/drivers/net/wireless/atmel/at76c50x-usb.c
@@ -383,7 +383,7 @@ static int at76_usbdfu_download(struct usb_device *udev, u8 *buf, u32 size,
return -EINVAL;
}
- dfu_stat_buf = kmalloc(sizeof(struct dfu_status), GFP_KERNEL);
+ dfu_stat_buf = kmalloc(sizeof(*dfu_stat_buf), GFP_KERNEL);
if (!dfu_stat_buf) {
ret = -ENOMEM;
goto exit;
@@ -395,7 +395,7 @@ static int at76_usbdfu_download(struct usb_device *udev, u8 *buf, u32 size,
goto exit;
}
- dfu_state = kmalloc(sizeof(u8), GFP_KERNEL);
+ dfu_state = kmalloc(sizeof(*dfu_state), GFP_KERNEL);
if (!dfu_state) {
ret = -ENOMEM;
goto exit;
--
2.15.1
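The idiom this patch applies can be sketched in userspace as follows. This
is an illustration only: malloc() stands in for kmalloc(), and the struct
layout and the names dfu_status_example and alloc_status() are hypothetical,
not taken from the driver.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in; the real struct dfu_status lives in the driver. */
struct dfu_status_example {
	uint8_t status;
	uint8_t poll_timeout[3];
	uint8_t state;
	uint8_t string;
};

/* sizeof(*buf) follows the declared type of buf, so the allocation size
 * stays correct even if that declaration is later changed.
 */
static struct dfu_status_example *alloc_status(void)
{
	struct dfu_status_example *buf;

	buf = malloc(sizeof(*buf));
	return buf;
}
```

With sizeof(struct dfu_status_example) instead, a later change of the
pointer's type would leave the size expression silently out of date.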
^ permalink raw reply related
* [PATCH 1/2] at76c50x-usb: Delete an error message for a failed memory allocation in at76_submit_rx_urb()
From: SF Markus Elfring @ 2017-12-30 21:07 UTC (permalink / raw)
To: linux-wireless, netdev, Andrew Zaborowski, Arvind Yadav,
Geert Uytterhoeven, Kalle Valo, Kees Cook
Cc: LKML, kernel-janitors
In-Reply-To: <1d76fe59-af15-ba4d-e05d-09dcd9ee38cd@users.sourceforge.net>
From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sat, 30 Dec 2017 21:50:12 +0100
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
drivers/net/wireless/atmel/at76c50x-usb.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/net/wireless/atmel/at76c50x-usb.c b/drivers/net/wireless/atmel/at76c50x-usb.c
index ede89d4ffc88..2893d339b440 100644
--- a/drivers/net/wireless/atmel/at76c50x-usb.c
+++ b/drivers/net/wireless/atmel/at76c50x-usb.c
@@ -1223,8 +1223,6 @@ static int at76_submit_rx_urb(struct at76_priv *priv)
if (!skb) {
skb = dev_alloc_skb(sizeof(struct at76_rx_buffer));
if (!skb) {
- wiphy_err(priv->hw->wiphy,
- "cannot allocate rx skbuff\n");
ret = -ENOMEM;
goto exit;
}
--
2.15.1
^ permalink raw reply related
* [PATCH 0/2] at76c50x-usb: Adjustments for two function implementations
From: SF Markus Elfring @ 2017-12-30 21:06 UTC (permalink / raw)
To: linux-wireless, netdev, Andrew Zaborowski, Arvind Yadav,
Geert Uytterhoeven, Kalle Valo, Kees Cook
Cc: LKML, kernel-janitors
From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sat, 30 Dec 2017 22:01:23 +0100
Two update suggestions were taken into account
from static source code analysis.
Markus Elfring (2):
Delete an error message for a failed memory allocation in at76_submit_rx_urb()
Improve size determinations in at76_usbdfu_download()
drivers/net/wireless/atmel/at76c50x-usb.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--
2.15.1
^ permalink raw reply
* Re: [RFT net-next v3 0/5] dwmac-meson8b: RGMII clock fixes for Meson8b
From: Martin Blumenstingl @ 2017-12-30 21:02 UTC (permalink / raw)
To: Emiliano Ingrassia
Cc: Jerome Brunet, netdev, linus.luessing, khilman, linux-amlogic,
Neil Armstrong, peppe.cavallaro, alexandre.torgue
In-Reply-To: <20171229230058.GA17364@ingrassia.epigenesys.com>
Hi Emiliano,
On Sat, Dec 30, 2017 at 12:00 AM, Emiliano Ingrassia
<ingrassia@epigenesys.com> wrote:
> Hi Jerome, Hi Martin,
>
> On Fri, Dec 29, 2017 at 07:04:23PM +0100, Jerome Brunet wrote:
>> On Fri, 2017-12-29 at 02:31 +0100, Emiliano Ingrassia wrote:
>> > Hi Martin, Hi Dave,
>> >
>> > On Thu, Dec 28, 2017 at 11:21:23PM +0100, Martin Blumenstingl wrote:
>> > > Hi Dave,
>> > >
>> > > please do not apply this series until it got a Tested-by from Emiliano.
>> > >
>> > >
>> > > Hi Emiliano,
>> > >
>> > > you reported [0] that you couldn't get dwmac-meson8b to work on your
>> > > Odroid-C1. With your findings (register dumps, clk_summary output, etc.)
>> > > I think I was able to find a fix: it consists of two patches (which you
>> > > find in this series)
>> > >
>> > > Unfortunately I don't have any Meson8b boards with RGMII PHY so I could
>> > > only partially test this (I could only check if the clocks were
>> > > calculated correctly when using a dummy 500002394Hz input clock instead
>> > > of MPLL2).
>> > >
>> > > Could you please give this series a try and let me know about the
>> > > results?
>> > > You obviously still need your two "ARM: dts: meson8b" patches which
>> > > - add the "amlogic,meson8b-dwmac" compatible to meson8b.dtsi
>> > > - enable Ethernet on the Odroid-C1
>> > >
>> > > When testing on Meson8b this also needs a fix for the MPLL clock driver:
>> > > "clk: meson: mpll: use 64-bit maths in params_from_rate", see:
>> > > https://patchwork.kernel.org/patch/10131677/
>> > >
>> > >
>> > > I have tested this myself on a Khadas VIM (GXL SoC, internal RMII PHY)
>> > > and a Khadas VIM2 (GXM SoC, external RGMII PHY). Both are still working
>> > > fine (so let's hope that this also fixes your Meson8b issue :)).
>> > >
>> > >
>> > > changes since v1 at [1]:
>> > > - changed the subject of the cover-letter to indicate that this is all
>> > > about the RGMII clock
>> > > - added PATCH #1 which ensures that we don't unnecessarily change the
>> > > parent clocks in RMII mode (and also makes the code easier to
>> > > understand)
>> > > - changed subject of PATCH #2 (formerly PATCH #1) to state that this
>> > > is about the RGMII clock
>> > > - added Jerome's Reviewed-by to PATCH #2 (formerly PATCH #1)
>> > > - replaced PATCH #3 (formerly PATCH #2) with one that sets
>> > > CLK_SET_RATE_PARENT on the mux and thus re-configures the MPLL2 clock
>> > > on Meson8b correctly
>> > >
>> > > changes since v2 at [2]:
>> > > - added PATCH #2 to make the following patch easier
>> > > - Emiliano reported that there's currently another bug in the
>> > > dwmac-meson8b driver which prevents it from working with RGMII PHYs on
>> > > Meson8b: bit 10 of the PRG_ETH0 register configures a clock gate
>> > > (instead of a divide by 5 or divide by 10 clock divider). This has not
>> > > been visible on GXBB and later due to the input clock which always led
>> > > to a selection of "divide by 10" (which is done internally in the IP
>> > > block, but the bit actually means "enable RGMII clock output").
>> > > PATCH #3 was added to address this issue.
>> > > - the commit message of PATCH #4 and #5 (formerly PATCH #2 and #3) were
>> > > updated and the patch itself rebased because the m25_div clock was
>> > > removed with the new PATCH #3 (so some of the statements were not
>> > > valid anymore)
>> > >
>> >
>> > Here is the clk_summary relative to ethernet on Odroid-C1+
>> > with this new series applied:
>> >
>> > xtal 1 1 24000000 0 0
>> > sys_pll 0 0 1200000000 0 0
>> > cpu_clk 0 0 1200000000 0 0
>> > vid_pll 0 0 732000000 0 0
>> > fixed_pll 2 2 2550000000 0 0
>> > mpll2 1 1 249999701 0 0
>> > c9410000.ethernet#m250_sel 1 1 249999701 0 0
>> > c9410000.ethernet#m250_div 1 1 249999701 0 0
>> > c9410000.ethernet#fixed_div10 1 1 24999970 0 0
>> > c9410000.ethernet#m25_en 1 1 24999970 0 0
>> >
>> > The ethernet prg0 register is set to 0x74A1 which should be correct with
>> > respect to the information contained in the S805 SoC manual.
>> > Actually, the ethernet is not yet fully functional.
>> > Trying to ping the board, I can see ARP request from host to board using
>> > tcpdump. However, the host can't see any response.
>>
>> If the rx path is ok-ish, I suppose the clock setting applied is good.
>> Maybe you could try to play with the tx delay (BIT 5-6 of the register) ?
>>
>
> Thanks for the suggestion. Finally the ethernet works correctly using 4
> ns as tx-delay.
> The clock summary is the same reported above. The prg0 ethernet register
> value is 0x74c1 as expected.
this is awesome!
I think I also know the reason why you need a 4ns TX delay: because
with the vendor kernel it's the same!
the TX delay in PRG_ETHERNET0 is set to 2ns
however, the RTL8211F PHY can also generate a 2ns TX delay - this can
be enabled by pin-strapping (by pulling a pin low/high - there's no
public documentation on that). if this is also enabled then the two
delays sum up
the mainline RTL8211F PHY driver however disables the TX delay in the
PHY when not using phy-mode "rgmii-txid"
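The register values quoted in this thread can be decoded with a small
sketch like the one below. The macro and function names are hypothetical;
only the bit positions (TX delay in bits 5-6, the disputed gate/divider in
bit 10) come from the discussion above. Reading the delay field as
multiples of 2ns is an assumption that is merely consistent with the
values reported here, not something checked against the S805 manual.

```c
#include <stdint.h>

/* Hypothetical names; bit positions as discussed in this thread. */
#define PRG_ETH0_TX_DELAY_SHIFT	5
#define PRG_ETH0_TX_DELAY_MASK	0x3u
#define PRG_ETH0_BIT10		(1u << 10)

/* Extract the two-bit TX delay field (bits 5-6). */
static unsigned int prg_eth0_tx_delay_field(uint32_t reg)
{
	return (reg >> PRG_ETH0_TX_DELAY_SHIFT) & PRG_ETH0_TX_DELAY_MASK;
}

/* Report whether bit 10 (clock gate or divider select) is set. */
static int prg_eth0_bit10_set(uint32_t reg)
{
	return (reg & PRG_ETH0_BIT10) != 0;
}
```

For the values above, the delay field is 1 for 0x74A1 and for the U-Boot
value 0x7d21, and 2 for the working value 0x74c1; bit 10 is set in all
three.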
> I would like to thank Martin for the support!
> As soon as this patch series [v3] is submitted, I'll submit my patch for
> the device tree.
I would like to finish the discussion with Jerome on patch #3 - then I
will send a final version
if anyone has an oscilloscope that is able to measure up to ~50MHz
then I'd be happy if someone would volunteer to test... :)
(I ordered some parts to do that myself, but I'm not sure when this will arrive)
on any Amlogic SoC with RTL8211F Ethernet PHY this can be measured on
pin 36 of the PHY (Odroid-C1's datasheet describes pin 36 as
"XTAL_IN", see page 12 of the schematics: [0]):
- test 1: this series with bit 10 set -> should output 25MHz
- test 2: this series, but manually unset bit 10 after the
clk_set_rate() call -> if bit 10 is a gate then the clock signal is
off, if it's a divider then there should be a 50MHz signal
> Let me know if you have any questions.
>
> Thanks again,
thank you for not giving up :)
> Emiliano
>
>> >
>> > Following the U-Boot value for prg0 register, which is 0x7d21, I also
>> > tried to set bit 11. As expected, this did not have any influence.
>> > Another thing that we should check is the "Ethernet Memory PD" register (see S805
>> > manual - sec. 5.4), whose bits 3-2 enable/disable normal ethernet
>> > operation. However, those bits are already cleared by U-Boot.
>> >
>> > Thank you for the support.
>> >
>> > Best regards,
>> >
>> > Emiliano
>> >
>> > >
>> > > [0] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005596.html
>> > > [1] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005848.html
>> > > [2] http://lists.infradead.org/pipermail/linux-amlogic/2017-December/005861.html
>> > >
>> > >
>> > > Martin Blumenstingl (5):
>> > > net: stmmac: dwmac-meson8b: only configure the clocks in RGMII mode
>> > > net: stmmac: dwmac-meson8b: simplify generating the clock names
>> > > net: stmmac: dwmac-meson8b: fix internal RGMII clock configuration
>> > > net: stmmac: dwmac-meson8b: fix setting the RGMII clock on Meson8b
>> > > net: stmmac: dwmac-meson8b: propagate rate changes to the parent clock
>> > >
>> > > .../net/ethernet/stmicro/stmmac/dwmac-meson8b.c | 119 +++++++++++----------
>> > > 1 file changed, 63 insertions(+), 56 deletions(-)
>> > >
>> > > --
>> > > 2.15.1
>> > >
>>
Regards
Martin
[0] https://dn.odroid.com/S805/Schematics/odroid-c1+_rev0.4_20160226.pdf
^ permalink raw reply
* [PATCH bpf-next v4 3/3] libbpf: add missing SPDX-License-Identifier
From: Eric Leblond @ 2017-12-30 20:41 UTC (permalink / raw)
To: daniel, Toshiaki Makita, Philippe Ombredanne
Cc: Alexei Starovoitov, netdev, linux-kernel, Eric Leblond
In-Reply-To: <20171230204116.30871-1-eric@regit.org>
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
tools/lib/bpf/bpf.c | 2 ++
tools/lib/bpf/bpf.h | 2 ++
tools/lib/bpf/libbpf.c | 2 ++
tools/lib/bpf/libbpf.h | 2 ++
4 files changed, 8 insertions(+)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index ceb20c5cae3b..ab8b2eb31273 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: LGPL-2.1
+
/*
* common eBPF ELF operations.
*
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 9f44c196931e..8d18fb73d7fb 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -1,3 +1,5 @@
+/* SPDX-License-Identifier: LGPL-2.1 */
+
/*
* common eBPF ELF operations.
*
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 5fe8aaa2123e..924a8b8431ab 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1,3 +1,5 @@
+// SPDX-License-Identifier: LGPL-2.1
+
/*
* Common eBPF ELF object loading operations.
*
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index e42f96900318..f85906533cdd 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1,3 +1,5 @@
+/* SPDX-License-Identifier: LGPL-2.1 */
+
/*
* Common eBPF ELF object loading operations.
*
--
2.15.1
^ permalink raw reply related
* [PATCH bpf-next v4 2/3] libbpf: add error reporting in XDP
From: Eric Leblond @ 2017-12-30 20:41 UTC (permalink / raw)
To: daniel, Toshiaki Makita, Philippe Ombredanne
Cc: Alexei Starovoitov, netdev, linux-kernel, Eric Leblond
In-Reply-To: <20171230204116.30871-1-eric@regit.org>
Parse netlink ext attribute to get the error message returned by
the card. Code is partially taken from libnl.
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
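For readers unfamiliar with netlink TLVs, the attribute walk that this patch adds can be sketched in isolation. This is a minimal illustration, not the code from the patch: demo_put_attr(), demo_find_errmsg() and demo_roundtrip() are hypothetical names, the error string is made up, and the sketch assumes kernel headers recent enough to define NLMSGERR_ATTR_MSG.

```c
#include <linux/netlink.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append one attribute at offset off, return next offset. */
static int demo_put_attr(char *buf, int off, int type, const void *data, int len)
{
	struct nlattr *nla = (struct nlattr *)(buf + off);

	nla->nla_type = type;
	nla->nla_len = NLA_HDRLEN + len;
	memcpy((char *)nla + NLA_HDRLEN, data, len);
	return off + NLA_ALIGN(nla->nla_len);
}

/* Walk an attribute stream; return the NUL-terminated payload of the first
 * NLMSGERR_ATTR_MSG attribute, or NULL if there is none. */
static const char *demo_find_errmsg(const char *head, int len)
{
	const struct nlattr *nla = (const struct nlattr *)head;
	int rem = len;

	/* Same bounds checks as nla_ok(): header fits, length is sane. */
	while (rem >= NLA_HDRLEN &&
	       nla->nla_len >= NLA_HDRLEN && nla->nla_len <= rem) {
		int payload = nla->nla_len - NLA_HDRLEN;
		const char *data = (const char *)nla + NLA_HDRLEN;
		int step = NLA_ALIGN(nla->nla_len);

		if ((nla->nla_type & NLA_TYPE_MASK) == NLMSGERR_ATTR_MSG &&
		    payload > 0 && data[payload - 1] == '\0')
			return data;

		/* Advance by the aligned length, as nla_next() does. */
		rem -= step;
		nla = (const struct nlattr *)((const char *)nla + step);
	}
	return NULL;
}

/* Build a two-attribute stream (offset, then message) and parse it back. */
static const char *demo_roundtrip(void)
{
	static union { char bytes[64]; struct nlattr align; } buf;
	const char msg[] = "example error";	/* made-up message */
	__u32 offs = 0;
	int len = 0;

	memset(&buf, 0, sizeof(buf));
	len = demo_put_attr(buf.bytes, len, NLMSGERR_ATTR_OFFS, &offs, sizeof(offs));
	len = demo_put_attr(buf.bytes, len, NLMSGERR_ATTR_MSG, msg, sizeof(msg));
	return demo_find_errmsg(buf.bytes, len);
}
```

The sketch returns the string instead of printing it, so a caller can decide what to do when no message attribute is present.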
tools/lib/bpf/Build | 2 +-
tools/lib/bpf/bpf.c | 10 ++-
tools/lib/bpf/nlattr.c | 187 +++++++++++++++++++++++++++++++++++++++++++++++++
tools/lib/bpf/nlattr.h | 69 ++++++++++++++++++
4 files changed, 266 insertions(+), 2 deletions(-)
create mode 100644 tools/lib/bpf/nlattr.c
create mode 100644 tools/lib/bpf/nlattr.h
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
index d8749756352d..64c679d67109 100644
--- a/tools/lib/bpf/Build
+++ b/tools/lib/bpf/Build
@@ -1 +1 @@
-libbpf-y := libbpf.o bpf.o
+libbpf-y := libbpf.o bpf.o nlattr.o
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index f00fba2edeae..ceb20c5cae3b 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -26,6 +26,7 @@
#include <linux/bpf.h>
#include "bpf.h"
#include "libbpf.h"
+#include "nlattr.h"
#include <linux/rtnetlink.h>
#include <sys/socket.h>
#include <errno.h>
@@ -436,6 +437,7 @@ int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
struct nlmsghdr *nh;
struct nlmsgerr *err;
socklen_t addrlen;
+ int one = 1;
memset(&sa, 0, sizeof(sa));
sa.nl_family = AF_NETLINK;
@@ -445,6 +447,11 @@ int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
return -errno;
}
+ if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK,
+ &one, sizeof(one)) < 0) {
+ fprintf(stderr, "Netlink error reporting not supported\n");
+ }
+
if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
ret = -errno;
goto cleanup;
@@ -520,7 +527,8 @@ int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
err = (struct nlmsgerr *)NLMSG_DATA(nh);
if (!err->error)
continue;
- ret = err->error;
+ ret = -err->error;
+ nla_dump_errormsg(nh);
goto cleanup;
case NLMSG_DONE:
break;
diff --git a/tools/lib/bpf/nlattr.c b/tools/lib/bpf/nlattr.c
new file mode 100644
index 000000000000..4719434278b2
--- /dev/null
+++ b/tools/lib/bpf/nlattr.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: LGPL-2.1
+
+/*
+ * NETLINK Netlink attributes
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation version 2.1
+ * of the License.
+ *
+ * Copyright (c) 2003-2013 Thomas Graf <tgraf@suug.ch>
+ */
+
+#include <errno.h>
+#include "nlattr.h"
+#include <linux/rtnetlink.h>
+#include <string.h>
+#include <stdio.h>
+
+static uint16_t nla_attr_minlen[NLA_TYPE_MAX+1] = {
+ [NLA_U8] = sizeof(uint8_t),
+ [NLA_U16] = sizeof(uint16_t),
+ [NLA_U32] = sizeof(uint32_t),
+ [NLA_U64] = sizeof(uint64_t),
+ [NLA_STRING] = 1,
+ [NLA_FLAG] = 0,
+};
+
+static int nla_len(const struct nlattr *nla)
+{
+ return nla->nla_len - NLA_HDRLEN;
+}
+
+static struct nlattr *nla_next(const struct nlattr *nla, int *remaining)
+{
+ int totlen = NLA_ALIGN(nla->nla_len);
+
+ *remaining -= totlen;
+ return (struct nlattr *) ((char *) nla + totlen);
+}
+
+static int nla_ok(const struct nlattr *nla, int remaining)
+{
+ return remaining >= sizeof(*nla) &&
+ nla->nla_len >= sizeof(*nla) &&
+ nla->nla_len <= remaining;
+}
+
+static void *nla_data(const struct nlattr *nla)
+{
+ return (char *) nla + NLA_HDRLEN;
+}
+
+static int nla_type(const struct nlattr *nla)
+{
+ return nla->nla_type & NLA_TYPE_MASK;
+}
+
+static int validate_nla(struct nlattr *nla, int maxtype,
+ struct nla_policy *policy)
+{
+ struct nla_policy *pt;
+ unsigned int minlen = 0;
+ int type = nla_type(nla);
+
+ if (type < 0 || type > maxtype)
+ return 0;
+
+ pt = &policy[type];
+
+ if (pt->type > NLA_TYPE_MAX)
+ return 0;
+
+ if (pt->minlen)
+ minlen = pt->minlen;
+ else if (pt->type != NLA_UNSPEC)
+ minlen = nla_attr_minlen[pt->type];
+
+ if (nla_len(nla) < minlen)
+ return -1;
+
+ if (pt->maxlen && nla_len(nla) > pt->maxlen)
+ return -1;
+
+ if (pt->type == NLA_STRING) {
+ char *data = nla_data(nla);
+ if (data[nla_len(nla) - 1] != '\0')
+ return -1;
+ }
+
+ return 0;
+}
+
+static inline int nlmsg_len(const struct nlmsghdr *nlh)
+{
+ return nlh->nlmsg_len - NLMSG_HDRLEN;
+}
+
+/**
+ * Create attribute index based on a stream of attributes.
+ * @arg tb Index array to be filled (maxtype+1 elements).
+ * @arg maxtype Maximum attribute type expected and accepted.
+ * @arg head Head of attribute stream.
+ * @arg len Length of attribute stream.
+ * @arg policy Attribute validation policy.
+ *
+ * Iterates over the stream of attributes and stores a pointer to each
+ * attribute in the index array using the attribute type as index to
+ * the array. Attribute with a type greater than the maximum type
+ * specified will be silently ignored in order to maintain backwards
+ * compatibility. If \a policy is not NULL, the attribute will be
+ * validated using the specified policy.
+ *
+ * @see nla_validate
+ * @return 0 on success or a negative error code.
+ */
+static int nla_parse(struct nlattr *tb[], int maxtype, struct nlattr *head, int len,
+ struct nla_policy *policy)
+{
+ struct nlattr *nla;
+ int rem, err;
+
+ memset(tb, 0, sizeof(struct nlattr *) * (maxtype + 1));
+
+ nla_for_each_attr(nla, head, len, rem) {
+ int type = nla_type(nla);
+
+ if (type > maxtype)
+ continue;
+
+ if (policy) {
+ err = validate_nla(nla, maxtype, policy);
+ if (err < 0)
+ goto errout;
+ }
+
+ if (tb[type])
+ fprintf(stderr, "Attribute of type %#x found multiple times in message, "
+ "previous attribute is being ignored.\n", type);
+
+ tb[type] = nla;
+ }
+
+ err = 0;
+errout:
+ return err;
+}
+
+/* dump netlink extended ack error message */
+int nla_dump_errormsg(struct nlmsghdr *nlh)
+{
+ struct nla_policy extack_policy[NLMSGERR_ATTR_MAX + 1] = {
+ [NLMSGERR_ATTR_MSG] = { .type = NLA_STRING },
+ [NLMSGERR_ATTR_OFFS] = { .type = NLA_U32 },
+ };
+ struct nlattr *tb[NLMSGERR_ATTR_MAX + 1], *attr;
+ struct nlmsgerr *err;
+ char *errmsg = NULL;
+ int hlen, alen;
+
+ /* no TLVs, nothing to do here */
+ if (!(nlh->nlmsg_flags & NLM_F_ACK_TLVS))
+ return 0;
+
+ err = (struct nlmsgerr *)NLMSG_DATA(nlh);
+ hlen = sizeof(*err);
+
+ /* if NLM_F_CAPPED is set then the inner err msg was capped */
+ if (!(nlh->nlmsg_flags & NLM_F_CAPPED))
+ hlen += nlmsg_len(&err->msg);
+
+ attr = (struct nlattr *) ((void *) err + hlen);
+ alen = nlh->nlmsg_len - hlen;
+
+ if (nla_parse(tb, NLMSGERR_ATTR_MAX, attr, alen, extack_policy) != 0) {
+ fprintf(stderr,
+ "Failed to parse extended error attributes\n");
+ return 0;
+ }
+
+ if (tb[NLMSGERR_ATTR_MSG])
+ errmsg = (char *) nla_data(tb[NLMSGERR_ATTR_MSG]);
+
+ fprintf(stderr, "Kernel error message: %s\n", errmsg);
+
+ return 0;
+}
diff --git a/tools/lib/bpf/nlattr.h b/tools/lib/bpf/nlattr.h
new file mode 100644
index 000000000000..fa2d015334ef
--- /dev/null
+++ b/tools/lib/bpf/nlattr.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: LGPL-2.1 */
+
+/*
+ * NETLINK Netlink attributes
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation version 2.1
+ * of the License.
+ *
+ * Copyright (c) 2003-2013 Thomas Graf <tgraf@suug.ch>
+ */
+
+#ifndef __NLATTR_H
+#define __NLATTR_H
+
+#include <linux/netlink.h>
+
+/**
+ * Standard attribute types to specify validation policy
+ */
+enum {
+ NLA_UNSPEC, /**< Unspecified type, binary data chunk */
+ NLA_U8, /**< 8 bit integer */
+ NLA_U16, /**< 16 bit integer */
+ NLA_U32, /**< 32 bit integer */
+ NLA_U64, /**< 64 bit integer */
+ NLA_STRING, /**< NUL terminated character string */
+ NLA_FLAG, /**< Flag */
+ NLA_MSECS, /**< Milliseconds (64bit) */
+ NLA_NESTED, /**< Nested attributes */
+ __NLA_TYPE_MAX,
+};
+
+#define NLA_TYPE_MAX (__NLA_TYPE_MAX - 1)
+
+/**
+ * @ingroup attr
+ * Attribute validation policy.
+ *
+ * See section @core_doc{core_attr_parse,Attribute Parsing} for more details.
+ */
+struct nla_policy {
+ /** Type of attribute or NLA_UNSPEC */
+ uint16_t type;
+
+ /** Minimal length of payload required */
+ uint16_t minlen;
+
+ /** Maximal length of payload allowed */
+ uint16_t maxlen;
+};
+
+/**
+ * @ingroup attr
+ * Iterate over a stream of attributes
+ * @arg pos loop counter, set to current attribute
+ * @arg head head of attribute stream
+ * @arg len length of attribute stream
+ * @arg rem initialized to len, holds bytes currently remaining in stream
+ */
+#define nla_for_each_attr(pos, head, len, rem) \
+ for (pos = head, rem = len; \
+ nla_ok(pos, rem); \
+ pos = nla_next(pos, &(rem)))
+
+int nla_dump_errormsg(struct nlmsghdr *nlh);
+
+#endif /* __NLATTR_H */
--
2.15.1
^ permalink raw reply related
* [PATCH bpf-next v4 1/3] libbpf: add function to setup XDP
From: Eric Leblond @ 2017-12-30 20:41 UTC (permalink / raw)
To: daniel, Toshiaki Makita, Philippe Ombredanne
Cc: Alexei Starovoitov, netdev, linux-kernel, Eric Leblond
In-Reply-To: <20171230204116.30871-1-eric@regit.org>
Most of the code is taken from set_link_xdp_fd() in bpf_load.c and
slightly modified to be library compliant.
Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
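The nested attribute layout used for the request can be illustrated separately. The sketch below mirrors the length bookkeeping for the IFLA_XDP nest under the same fallback defines as the patch; demo_build_xdp_nest() and demo_xdp_nest_len() are hypothetical helper names, and the fd and flag values in the usage note are arbitrary examples.

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>

#ifndef IFLA_XDP_MAX
#define IFLA_XDP 43
#define IFLA_XDP_FD 1
#define IFLA_XDP_FLAGS 3
#endif

/* Build the nested IFLA_XDP attribute into buf; return its total length. */
static int demo_build_xdp_nest(void *buf, int fd, __u32 flags)
{
	struct nlattr *nla = buf, *sub;

	/* Outer attribute: starts as a bare header and grows as we nest. */
	nla->nla_type = NLA_F_NESTED | IFLA_XDP;
	nla->nla_len = NLA_HDRLEN;

	/* Inner attribute carrying the program fd. */
	sub = (struct nlattr *)((char *)nla + nla->nla_len);
	sub->nla_type = IFLA_XDP_FD;
	sub->nla_len = NLA_HDRLEN + sizeof(fd);
	memcpy((char *)sub + NLA_HDRLEN, &fd, sizeof(fd));
	nla->nla_len += sub->nla_len;

	/* Optional inner attribute carrying the flags. */
	if (flags) {
		sub = (struct nlattr *)((char *)nla + nla->nla_len);
		sub->nla_type = IFLA_XDP_FLAGS;
		sub->nla_len = NLA_HDRLEN + sizeof(flags);
		memcpy((char *)sub + NLA_HDRLEN, &flags, sizeof(flags));
		nla->nla_len += sub->nla_len;
	}

	return nla->nla_len;
}

/* Convenience wrapper that supplies an aligned scratch buffer. */
static int demo_xdp_nest_len(int fd, __u32 flags)
{
	union { char bytes[64]; struct nlattr align; } scratch;

	memset(&scratch, 0, sizeof(scratch));
	return demo_build_xdp_nest(&scratch, fd, flags);
}
```

With a 4-byte attribute header, the nest is 12 bytes when only the fd is present and 20 bytes when a flags attribute follows it. Both inner attributes are already multiples of 4 bytes, which is why no extra padding is needed between them.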
tools/lib/bpf/bpf.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++++-
tools/lib/bpf/libbpf.c | 2 +
tools/lib/bpf/libbpf.h | 4 ++
3 files changed, 130 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5128677e4117..f00fba2edeae 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -25,6 +25,16 @@
#include <asm/unistd.h>
#include <linux/bpf.h>
#include "bpf.h"
+#include "libbpf.h"
+#include <linux/rtnetlink.h>
+#include <sys/socket.h>
+#include <errno.h>
+
+#ifndef IFLA_XDP_MAX
+#define IFLA_XDP 43
+#define IFLA_XDP_FD 1
+#define IFLA_XDP_FLAGS 3
+#endif
/*
* When building perf, unistd.h is overridden. __NR_bpf is
@@ -46,8 +56,6 @@
# endif
#endif
-#define min(x, y) ((x) < (y) ? (x) : (y))
-
static inline __u64 ptr_to_u64(const void *ptr)
{
return (__u64) (unsigned long) ptr;
@@ -413,3 +421,117 @@ int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len)
return err;
}
+
+int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
+{
+ struct sockaddr_nl sa;
+ int sock, seq = 0, len, ret = -1;
+ char buf[4096];
+ struct nlattr *nla, *nla_xdp;
+ struct {
+ struct nlmsghdr nh;
+ struct ifinfomsg ifinfo;
+ char attrbuf[64];
+ } req;
+ struct nlmsghdr *nh;
+ struct nlmsgerr *err;
+ socklen_t addrlen;
+
+ memset(&sa, 0, sizeof(sa));
+ sa.nl_family = AF_NETLINK;
+
+ sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+ if (sock < 0) {
+ return -errno;
+ }
+
+ if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
+ ret = -errno;
+ goto cleanup;
+ }
+
+ addrlen = sizeof(sa);
+ if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0) {
+ ret = -errno;
+ goto cleanup;
+ }
+
+ if (addrlen != sizeof(sa)) {
+ ret = -LIBBPF_ERRNO__INTERNAL;
+ goto cleanup;
+ }
+
+ memset(&req, 0, sizeof(req));
+ req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+ req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
+ req.nh.nlmsg_type = RTM_SETLINK;
+ req.nh.nlmsg_pid = 0;
+ req.nh.nlmsg_seq = ++seq;
+ req.ifinfo.ifi_family = AF_UNSPEC;
+ req.ifinfo.ifi_index = ifindex;
+
+ /* started nested attribute for XDP */
+ nla = (struct nlattr *)(((char *)&req)
+ + NLMSG_ALIGN(req.nh.nlmsg_len));
+ nla->nla_type = NLA_F_NESTED | IFLA_XDP;
+ nla->nla_len = NLA_HDRLEN;
+
+ /* add XDP fd */
+ nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
+ nla_xdp->nla_type = IFLA_XDP_FD;
+ nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
+ memcpy((char *)nla_xdp + NLA_HDRLEN, &fd, sizeof(fd));
+ nla->nla_len += nla_xdp->nla_len;
+
+ /* if user passed in any flags, add those too */
+ if (flags) {
+ nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
+ nla_xdp->nla_type = IFLA_XDP_FLAGS;
+ nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
+ memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
+ nla->nla_len += nla_xdp->nla_len;
+ }
+
+ req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
+
+ if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {
+ ret = -errno;
+ goto cleanup;
+ }
+
+ len = recv(sock, buf, sizeof(buf), 0);
+ if (len < 0) {
+ ret = -errno;
+ goto cleanup;
+ }
+
+ for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
+ nh = NLMSG_NEXT(nh, len)) {
+ if (nh->nlmsg_pid != sa.nl_pid) {
+ ret = -LIBBPF_ERRNO__WRNGPID;
+ goto cleanup;
+ }
+ if (nh->nlmsg_seq != seq) {
+ ret = -LIBBPF_ERRNO__INVSEQ;
+ goto cleanup;
+ }
+ switch (nh->nlmsg_type) {
+ case NLMSG_ERROR:
+ err = (struct nlmsgerr *)NLMSG_DATA(nh);
+ if (!err->error)
+ continue;
+ ret = err->error;
+ goto cleanup;
+ case NLMSG_DONE:
+ break;
+ default:
+ break;
+ }
+ }
+
+ ret = 0;
+
+cleanup:
+ close(sock);
+ return ret;
+}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e9c4b7cabcf2..5fe8aaa2123e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -106,6 +106,8 @@ static const char *libbpf_strerror_table[NR_ERRNO] = {
[ERRCODE_OFFSET(PROG2BIG)] = "Program too big",
[ERRCODE_OFFSET(KVER)] = "Incorrect kernel version",
[ERRCODE_OFFSET(PROGTYPE)] = "Kernel doesn't support this program type",
+ [ERRCODE_OFFSET(WRNGPID)] = "Wrong pid in netlink message",
+ [ERRCODE_OFFSET(INVSEQ)] = "Invalid netlink sequence",
};
int libbpf_strerror(int err, char *buf, size_t size)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 6e20003109e0..e42f96900318 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -42,6 +42,8 @@ enum libbpf_errno {
LIBBPF_ERRNO__PROG2BIG, /* Program too big */
LIBBPF_ERRNO__KVER, /* Incorrect kernel version */
LIBBPF_ERRNO__PROGTYPE, /* Kernel doesn't support this program type */
+ LIBBPF_ERRNO__WRNGPID, /* Wrong pid in netlink message */
+ LIBBPF_ERRNO__INVSEQ, /* Invalid netlink sequence */
__LIBBPF_ERRNO__END,
};
@@ -246,4 +248,6 @@ long libbpf_get_error(const void *ptr);
int bpf_prog_load(const char *file, enum bpf_prog_type type,
struct bpf_object **pobj, int *prog_fd);
+
+int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags);
#endif
--
2.15.1
^ permalink raw reply related
* [PATCH bpf-next v4 0/3] libbpf: add function to setup XDP
From: Eric Leblond @ 2017-12-30 20:41 UTC (permalink / raw)
To: daniel, Toshiaki Makita, Philippe Ombredanne
Cc: Alexei Starovoitov, netdev, linux-kernel
In-Reply-To: <9ec8def5-a24f-a4ff-d0ae-fb8f11e4acdc@lab.ntt.co.jp>
Hello,
This updated patchset addresses the remarks by Toshiaki Makita and
Philippe Ombredanne:
- fixes on errno handling
- correct usage of SPDX header
Best regards,
--
Eric Leblond
^ permalink raw reply
* [PATCH] wireless: b43: Delete an error message for a failed memory allocation in b43_sdio_probe()
From: SF Markus Elfring @ 2017-12-30 20:33 UTC (permalink / raw)
To: b43-dev, linux-wireless, netdev, Kalle Valo; +Cc: LKML, kernel-janitors
From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sat, 30 Dec 2017 21:23:47 +0100
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
drivers/net/wireless/broadcom/b43/sdio.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/wireless/broadcom/b43/sdio.c b/drivers/net/wireless/broadcom/b43/sdio.c
index 59a521800694..5a6dbcf170f9 100644
--- a/drivers/net/wireless/broadcom/b43/sdio.c
+++ b/drivers/net/wireless/broadcom/b43/sdio.c
@@ -146,7 +146,6 @@ static int b43_sdio_probe(struct sdio_func *func,
sdio = kzalloc(sizeof(*sdio), GFP_KERNEL);
if (!sdio) {
error = -ENOMEM;
- dev_err(&func->dev, "failed to allocate ssb bus\n");
goto err_disable_func;
}
error = ssb_bus_sdiobus_register(&sdio->ssb, func,
--
2.15.1
^ permalink raw reply related
* Re: [PATCH] rds: fix use-after-free read in rds_find_bound
From: Sowmini Varadhan @ 2017-12-30 20:26 UTC (permalink / raw)
To: Santosh Shilimkar; +Cc: netdev, davem
In-Reply-To: <1514662599-14491-1-git-send-email-santosh.shilimkar@oracle.com>
On (12/30/17 11:36), Santosh Shilimkar wrote:
>
> socket buffer can get freed as part of sock_close
> callback so before adding reference check underneath
> socket validity.
I'm not sure I understand this fix-
struct rds_sock is:
struct rds_sock {
struct sock rs_sk;
:
}
How can rs be non-null while rds_rs_to_sk(rs) is null? Note that
rds_rs_to_sk() just returns &rs->rs_sk, so the changed line is
identical in effect to the original line.
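This can be checked in isolation. Below is a minimal sketch with simplified stand-in types (struct demo_sock and struct demo_rds_sock are hypothetical, not the kernel definitions); it only relies on the socket being embedded by value as the first member.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the layout matters. */
struct demo_sock {
	int flags;
};

struct demo_rds_sock {
	struct demo_sock rs_sk;	/* embedded by value, not a pointer */
	int rs_bound_port;
};

/* Same shape as the accessor described above: return &rs->rs_sk. */
static struct demo_sock *demo_rs_to_sk(struct demo_rds_sock *rs)
{
	return &rs->rs_sk;
}

/* Returns 1 if the extra NULL check could change the outcome for rs. */
static int demo_extra_check_matters(struct demo_rds_sock *rs)
{
	int without_check = rs != NULL;
	int with_check = rs != NULL && demo_rs_to_sk(rs) != NULL;

	return without_check != with_check;
}
```

Because rs_sk sits at offset 0 inside the containing struct, the accessor returns the same address as rs itself, so the added test is true whenever rs is non-null.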
> - if (rs && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
> + if (rs && rds_rs_to_sk(rs) && !sock_flag(rds_rs_to_sk(rs), SOCK_DEAD))
I think the real issue is a refcount bug somewhere.
Was the syzbot test run with http://patchwork.ozlabs.org/patch/852492/ applied?
This sounds like that type of bug.
--Sowmini
^ permalink raw reply
* Re: iproute2 net-next
From: Daniel Borkmann @ 2017-12-30 20:24 UTC (permalink / raw)
To: Stephen Hemminger, Jiri Pirko; +Cc: Leon Romanovsky, netdev, dsa
In-Reply-To: <20171229200028.78c1371a@xeon-e3>
On 12/30/2017 05:00 AM, Stephen Hemminger wrote:
> On Fri, 29 Dec 2017 09:58:23 +0100
> Jiri Pirko <jiri@resnulli.us> wrote:
>> Fri, Dec 29, 2017 at 12:46:31AM CET, daniel@iogearbox.net wrote:
>>> On 12/26/2017 10:35 AM, Leon Romanovsky wrote:
>>>> On Mon, Dec 25, 2017 at 10:14:26PM -0800, Stephen Hemminger wrote:
>>>>> On Tue, 26 Dec 2017 06:47:43 +0200
>>>>> Leon Romanovsky <leon@kernel.org> wrote:
>>>>>> On Mon, Dec 25, 2017 at 10:49:19AM -0800, Stephen Hemminger wrote:
>>>>>>> David Ahern has agreed to take over managing the net-next branch of iproute2.
>>>>>>> The new location is:
>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/dsahern/iproute2-next.git/
>>>>>>>
>>>>>>> In the past, I have accepted new features into iproute2 master branch, but
>>>>>>> am changing the policy so that outside of the merge window (up until -rc1)
>>>>>>> new features will get put into net-next to get some more review and testing
>>>>>>> time. This means that things like the proposed batch streaming mode will
>>>>>>> go through net-next.
>>>>>>
>>>>>> Did you consider to create one shared repo for the iproute2 to allow
>>>>>> multiple committers workflow?
>>>>>
>>>>> For now having separate trees is best, there is no need for multiple
>>>>> committers the load is very light.
>>>>>
>>>>>> It will be much convenient for the users to have one place for
>>>>>> master/stable/net-next branches, instead of actually following two
>>>>>> different repositories.
>>>>>
>>>>> If you are doing network development, you already need to deal with
>>>>> multiple repo's on the kernel side so there is no difference.
>>>>
>>>> I agree with you that one extra "git remote add .." is not so huge and
>>>> all people who develop for the netdev will do it. My concern is about
>>>> Documentation and newcomers, who will have a hard time to find a right
>>>> tree.
>>>
>>> I guess it would certainly help to identify the official repo to rebase
>>> against much quicker if it would be under a common group on korg e.g.
>>>
>>> * iproute2/iproute2.git - for current cycle
>>> * iproute2/iproute2-next.git - for net-next bits
>>>
>>> and also be in line with other tooling (ethtool and others), even if
>>> not as high volume, but it would make it unambiguous right away from
>>> the other, private iproute2 repos on korg, imho. Just a thought.
>>
>> +1
>>
>> I was about to suggest this. This is nice opportunity to do such change.
>>
>>>>>> Example, of such shared repo:
>>>>>> BPF: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
>>>>>> Bluetooth: https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/
>>>>>> RDMA: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/
>>>>>
>>>>> Most of these are high volume or vendor silo'd which is not the case here.
>>> Cheers,
>>> Daniel
>
> Good news
> kup does support links so could make links from personal to iproute2 directory
That's nice indeed!
> Bad news
> kup won't allow me to make iproute2 directory right now. Will have to wait for
> Konstantin
Right, he also set up the shared dir for bpf, which was straightforward,
so it would be pretty much the same one-time procedure for iproute2.
Thanks,
Daniel
^ permalink raw reply
* [PATCH] wireless: airo: Delete an error message for a failed memory allocation in airo_networks_allocate()
From: SF Markus Elfring @ 2017-12-30 19:57 UTC (permalink / raw)
To: linux-wireless, netdev, Al Viro, David Howells, David S. Miller,
Gustavo A. R. Silva, Ingo Molnar, Johannes Berg, Kalle Valo
Cc: LKML, kernel-janitors
From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sat, 30 Dec 2017 20:48:44 +0100
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
drivers/net/wireless/cisco/airo.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/net/wireless/cisco/airo.c b/drivers/net/wireless/cisco/airo.c
index 86e795de6760..a65d82d26eaa 100644
--- a/drivers/net/wireless/cisco/airo.c
+++ b/drivers/net/wireless/cisco/airo.c
@@ -2714,12 +2714,7 @@ static int airo_networks_allocate(struct airo_info *ai)
ai->networks = kcalloc(AIRO_MAX_NETWORK_COUNT, sizeof(BSSListElement),
GFP_KERNEL);
- if (!ai->networks) {
- airo_print_warn("", "Out of memory allocating beacons");
- return -ENOMEM;
- }
-
- return 0;
+ return ai->networks ? 0 : -ENOMEM;
}
static void airo_networks_free(struct airo_info *ai)
--
2.15.1
^ permalink raw reply related