All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: netdev@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-arch@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Kees Cook <keescook@chromium.org>,
	kernel-hardening@lists.openwall.com,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alan Cox <alan@linux.intel.com>,
	David Miller <davem@davemloft.net>
Subject: [kernel-hardening] [RFC][PATCH] get rid of the use of set_fs() (by way of kernel_recvmsg()) in sunrpc
Date: Thu, 18 Jan 2018 03:06:34 +0000	[thread overview]
Message-ID: <20180118030634.GY13338@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20180117185232.GW13338@ZenIV.linux.org.uk>

On Wed, Jan 17, 2018 at 06:52:32PM +0000, Al Viro wrote:

[use of set_fs() by sunrpc]
> We are asking for recvmsg() with zero data length; what we really want is
> ->msg_control.  And _that_ is why we need that set_fs() - we want the damn
> thing to go into local variable.
> 
> But note that filling ->msg_control will happen in put_cmsg(), called
> from ip_cmsg_recv_pktinfo(), called from ip_cmsg_recv_offset(),
> called from udp_recvmsg(), called from sock_recvmsg_nosec(), called
> from sock_recvmsg().  Or in another path in case of IPv6.
> Sure, we can arrange for propagation of that all way down those
> call chains.  My preference would be to try and mark that (one and
> only) case in ->msg_flags, so that put_cmsg() would be able to
> check.  ___sys_recvmsg() sets that as
>         msg_sys->msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
> so we ought to be free to use any bit other than those two.  Since
> put_cmsg() already checks ->msg_flags, that shouldn't put too much
> overhead.

Folks, does anybody have objections against the following:

Get rid of kernel_recvmsg() (and thus set_fs()) use in sunrpc

In net/sunrpc/svcsock.c:svc_udp_recvfrom() we want to get
IP_PKTINFO; currently we stash the address of local variable
into ->msg_control (which normall contains a userland pointer
in recepients) and issue zero-length ->recvmsg() under
set_fs(KERNEL_DS).

Similar to the way put_cmsg() handles 32bit case on biarch
targets, introduce a flag telling put_cmsg() to treat
->msg_control as kernel pointer, using memcpy instead of
copy_to_user().  That allows to avoid the use of kernel_recvmsg()
with its set_fs().

All places that might have non-NULL ->msg_control either pass the
msghdr only to ->sendmsg() and its ilk, or have ->msg_flags
sanitized before passing msghdr anywhere.  IOW, there no
way the new flag to reach put_cmsg() in the mainline kernel,
and after this change it will only be seen in sunrpc case.

Incidentally, all other kernel_recvmsg() users are very easy to
convert to sock_recvmsg(), so that would allow to kill
kernel_recvmsg() off...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 9286a5a8c60c..60947da9c806 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -298,6 +298,7 @@ struct ucred {
 #else
 #define MSG_CMSG_COMPAT	0		/* We never have 32 bit fixups */
 #endif
+#define MSG_CMSG_KERNEL	0x10000000
 
 
 /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */
diff --git a/net/core/scm.c b/net/core/scm.c
index b1ff8a441748..1b73b682e441 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -237,10 +237,17 @@ int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data)
 	cmhdr.cmsg_len = cmlen;
 
 	err = -EFAULT;
-	if (copy_to_user(cm, &cmhdr, sizeof cmhdr))
-		goto out;
-	if (copy_to_user(CMSG_DATA(cm), data, cmlen - sizeof(struct cmsghdr)))
-		goto out;
+	if (unlikely(MSG_CMSG_KERNEL & msg->msg_flags)) {
+		struct cmsghdr *p = msg->msg_control;
+		memcpy(p, &cmhdr, sizeof cmhdr);
+		memcpy(CMSG_DATA(p), data, cmlen - sizeof(struct cmsghdr));
+	} else {
+		if (copy_to_user(cm, &cmhdr, sizeof cmhdr))
+			goto out;
+		if (copy_to_user(CMSG_DATA(cm), data,
+				 cmlen - sizeof(struct cmsghdr)))
+			goto out;
+	}
 	cmlen = CMSG_SPACE(len);
 	if (msg->msg_controllen < cmlen)
 		cmlen = msg->msg_controllen;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 5570719e4787..774904f67c93 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -545,7 +545,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
 		.msg_name = svc_addr(rqstp),
 		.msg_control = cmh,
 		.msg_controllen = sizeof(buffer),
-		.msg_flags = MSG_DONTWAIT,
+		.msg_flags = MSG_DONTWAIT | MSG_CMSG_KERNEL,
 	};
 	size_t len;
 	int err;
@@ -565,8 +565,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
 
 	clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
 	skb = NULL;
-	err = kernel_recvmsg(svsk->sk_sock, &msg, NULL,
-			     0, 0, MSG_PEEK | MSG_DONTWAIT);
+	err = sock_recvmsg(svsk->sk_sock, &msg, MSG_PEEK | MSG_DONTWAIT);
 	if (err >= 0)
 		skb = skb_recv_udp(svsk->sk_sk, 0, 1, &err);
 

WARNING: multiple messages have this Message-ID (diff)
From: Al Viro <viro@ZenIV.linux.org.uk>
To: netdev@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-arch@vger.kernel.org, Andi Kleen <ak@linux.intel.com>,
	Kees Cook <keescook@chromium.org>,
	kernel-hardening@lists.openwall.com,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alan Cox <alan@linux.intel.com>,
	David Miller <davem@davemloft.net>
Subject: [RFC][PATCH] get rid of the use of set_fs() (by way of kernel_recvmsg()) in sunrpc
Date: Thu, 18 Jan 2018 03:06:34 +0000	[thread overview]
Message-ID: <20180118030634.GY13338@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20180117185232.GW13338@ZenIV.linux.org.uk>

On Wed, Jan 17, 2018 at 06:52:32PM +0000, Al Viro wrote:

[use of set_fs() by sunrpc]
> We are asking for recvmsg() with zero data length; what we really want is
> ->msg_control.  And _that_ is why we need that set_fs() - we want the damn
> thing to go into local variable.
> 
> But note that filling ->msg_control will happen in put_cmsg(), called
> from ip_cmsg_recv_pktinfo(), called from ip_cmsg_recv_offset(),
> called from udp_recvmsg(), called from sock_recvmsg_nosec(), called
> from sock_recvmsg().  Or in another path in case of IPv6.
> Sure, we can arrange for propagation of that all way down those
> call chains.  My preference would be to try and mark that (one and
> only) case in ->msg_flags, so that put_cmsg() would be able to
> check.  ___sys_recvmsg() sets that as
>         msg_sys->msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
> so we ought to be free to use any bit other than those two.  Since
> put_cmsg() already checks ->msg_flags, that shouldn't put too much
> overhead.

Folks, does anybody have objections against the following:

Get rid of kernel_recvmsg() (and thus set_fs()) use in sunrpc

In net/sunrpc/svcsock.c:svc_udp_recvfrom() we want to get
IP_PKTINFO; currently we stash the address of local variable
into ->msg_control (which normall contains a userland pointer
in recepients) and issue zero-length ->recvmsg() under
set_fs(KERNEL_DS).

Similar to the way put_cmsg() handles 32bit case on biarch
targets, introduce a flag telling put_cmsg() to treat
->msg_control as kernel pointer, using memcpy instead of
copy_to_user().  That allows to avoid the use of kernel_recvmsg()
with its set_fs().

All places that might have non-NULL ->msg_control either pass the
msghdr only to ->sendmsg() and its ilk, or have ->msg_flags
sanitized before passing msghdr anywhere.  IOW, there no
way the new flag to reach put_cmsg() in the mainline kernel,
and after this change it will only be seen in sunrpc case.

Incidentally, all other kernel_recvmsg() users are very easy to
convert to sock_recvmsg(), so that would allow to kill
kernel_recvmsg() off...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 9286a5a8c60c..60947da9c806 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -298,6 +298,7 @@ struct ucred {
 #else
 #define MSG_CMSG_COMPAT	0		/* We never have 32 bit fixups */
 #endif
+#define MSG_CMSG_KERNEL	0x10000000
 
 
 /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */
diff --git a/net/core/scm.c b/net/core/scm.c
index b1ff8a441748..1b73b682e441 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -237,10 +237,17 @@ int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data)
 	cmhdr.cmsg_len = cmlen;
 
 	err = -EFAULT;
-	if (copy_to_user(cm, &cmhdr, sizeof cmhdr))
-		goto out;
-	if (copy_to_user(CMSG_DATA(cm), data, cmlen - sizeof(struct cmsghdr)))
-		goto out;
+	if (unlikely(MSG_CMSG_KERNEL & msg->msg_flags)) {
+		struct cmsghdr *p = msg->msg_control;
+		memcpy(p, &cmhdr, sizeof cmhdr);
+		memcpy(CMSG_DATA(p), data, cmlen - sizeof(struct cmsghdr));
+	} else {
+		if (copy_to_user(cm, &cmhdr, sizeof cmhdr))
+			goto out;
+		if (copy_to_user(CMSG_DATA(cm), data,
+				 cmlen - sizeof(struct cmsghdr)))
+			goto out;
+	}
 	cmlen = CMSG_SPACE(len);
 	if (msg->msg_controllen < cmlen)
 		cmlen = msg->msg_controllen;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 5570719e4787..774904f67c93 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -545,7 +545,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
 		.msg_name = svc_addr(rqstp),
 		.msg_control = cmh,
 		.msg_controllen = sizeof(buffer),
-		.msg_flags = MSG_DONTWAIT,
+		.msg_flags = MSG_DONTWAIT | MSG_CMSG_KERNEL,
 	};
 	size_t len;
 	int err;
@@ -565,8 +565,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
 
 	clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
 	skb = NULL;
-	err = kernel_recvmsg(svsk->sk_sock, &msg, NULL,
-			     0, 0, MSG_PEEK | MSG_DONTWAIT);
+	err = sock_recvmsg(svsk->sk_sock, &msg, MSG_PEEK | MSG_DONTWAIT);
 	if (err >= 0)
 		skb = skb_recv_udp(svsk->sk_sk, 0, 1, &err);
 

  parent reply	other threads:[~2018-01-18  3:06 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-13 18:17 [kernel-hardening] [PATCH v3 0/9] core, x86: prevent bounds-check bypass via speculative execution Dan Williams
2018-01-13 18:17 ` Dan Williams
2018-01-13 18:17 ` [kernel-hardening] [PATCH v3 1/9] Documentation: document array_ptr Dan Williams
2018-01-13 18:17   ` Dan Williams
2018-01-13 18:17 ` [kernel-hardening] [PATCH v3 2/9] arm64: implement ifence_array_ptr() Dan Williams
2018-01-13 18:17   ` Dan Williams
2018-01-13 18:17 ` [kernel-hardening] [PATCH v3 3/9] arm: " Dan Williams
2018-01-13 18:17   ` Dan Williams
2018-01-13 18:17 ` [kernel-hardening] [PATCH v3 4/9] x86: implement ifence() Dan Williams
2018-01-13 18:17   ` Dan Williams
2018-01-13 18:17 ` [kernel-hardening] [PATCH v3 5/9] x86: implement ifence_array_ptr() and array_ptr_mask() Dan Williams
2018-01-13 18:17   ` Dan Williams
2018-01-13 18:17 ` [kernel-hardening] [PATCH v3 6/9] asm/nospec: mask speculative execution flows Dan Williams
2018-01-13 18:17   ` Dan Williams
2018-01-13 18:18 ` [kernel-hardening] [PATCH v3 7/9] x86: introduce __uaccess_begin_nospec and ASM_IFENCE Dan Williams
2018-01-13 18:18   ` Dan Williams
2018-01-13 18:18 ` [kernel-hardening] [PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths Dan Williams
2018-01-13 18:18   ` Dan Williams
2018-01-13 19:05   ` [kernel-hardening] " Linus Torvalds
2018-01-13 19:05     ` Linus Torvalds
2018-01-13 19:33     ` [kernel-hardening] " Linus Torvalds
2018-01-13 19:33       ` Linus Torvalds
2018-01-13 20:22       ` [kernel-hardening] " Eric W. Biederman
2018-01-13 20:22         ` Eric W. Biederman
2018-01-13 20:22         ` Eric W. Biederman
2018-01-13 20:22         ` Eric W. Biederman
2018-01-16 22:23       ` [kernel-hardening] " Dan Williams
2018-01-16 22:23         ` Dan Williams
2018-01-16 22:23         ` Dan Williams
2018-01-16 22:23         ` Dan Williams
     [not found]         ` <CA+55aFxAFG5czVmCyhYMyHmXLNJ7pcXxWzusjZvLRh_qTGHj6Q@mail.gmail.com>
2018-01-16 22:41           ` [kernel-hardening] " Linus Torvalds
2018-01-16 22:41             ` Linus Torvalds
2018-01-17 14:17             ` [kernel-hardening] " Alan Cox
2018-01-17 14:17               ` Alan Cox
2018-01-17 18:52               ` [kernel-hardening] " Al Viro
2018-01-17 18:52                 ` Al Viro
2018-01-17 19:54                 ` [kernel-hardening] " Dan Williams
2018-01-17 19:54                   ` Dan Williams
2018-01-17 20:05                   ` [kernel-hardening] " Al Viro
2018-01-17 20:05                     ` Al Viro
2018-01-17 20:14                     ` [kernel-hardening] " Dan Williams
2018-01-17 20:14                       ` Dan Williams
2018-01-18  3:06                 ` Al Viro [this message]
2018-01-18  3:06                   ` [RFC][PATCH] get rid of the use of set_fs() (by way of kernel_recvmsg()) in sunrpc Al Viro
2018-01-18  3:16                   ` [kernel-hardening] " Linus Torvalds
2018-01-18  3:16                     ` Linus Torvalds
2018-01-18  4:43                     ` [kernel-hardening] " Al Viro
2018-01-18  4:43                       ` Al Viro
2018-01-18 16:29                       ` [kernel-hardening] " Christoph Hellwig
2018-01-18 16:29                         ` Christoph Hellwig
2018-01-18 17:10                         ` [kernel-hardening] " Al Viro
2018-01-18 17:10                           ` Al Viro
2018-01-18 19:31                       ` [kernel-hardening] " Al Viro
2018-01-18 19:31                         ` Al Viro
2018-01-18 19:37                         ` [PATCH 01/10] net: separate SIOCGIFCONF handling from dev_ioctl() Al Viro
2018-01-18 19:37                           ` [PATCH 02/10] devinet_ioctl(): take copyin/copyout to caller Al Viro
2018-01-22 16:40                             ` Christoph Hellwig
2018-01-18 19:37                           ` [PATCH 03/10] ip_rt_ioctl(): take copyin " Al Viro
2018-01-22 16:43                             ` Christoph Hellwig
2018-01-18 19:37                           ` [PATCH 04/10] kill dev_ifsioc() Al Viro
2018-01-22 16:47                             ` Christoph Hellwig
2018-01-18 19:37                           ` [PATCH 05/10] kill bond_ioctl() Al Viro
2018-01-22 16:48                             ` Christoph Hellwig
2018-01-18 19:37                           ` [PATCH 06/10] kill dev_ifname32() Al Viro
2018-01-18 19:37                           ` [PATCH 07/10] lift handling of SIOCIW... out of dev_ioctl() Al Viro
2018-01-18 19:37                           ` [PATCH 08/10] ipconfig: use dev_set_mtu() Al Viro
2018-01-18 19:37                           ` [PATCH 09/10] dev_ioctl(): move copyin/copyout to callers Al Viro
2018-01-18 19:37                           ` [PATCH 10/10] kill kernel_sock_ioctl() Al Viro
2018-01-22 16:49                             ` Christoph Hellwig
2018-01-24 20:52                             ` David Miller
2018-01-25  0:01                               ` Al Viro
2018-01-25  0:21                                 ` Al Viro
2018-01-25  4:11                                 ` David Miller
2018-01-22 16:40                           ` [PATCH 01/10] net: separate SIOCGIFCONF handling from dev_ioctl() Christoph Hellwig
2018-01-18 20:33                         ` [kernel-hardening] Re: [RFC][PATCH] get rid of the use of set_fs() (by way of kernel_recvmsg()) in sunrpc Al Viro
2018-01-18 20:33                           ` Al Viro
2018-01-19  3:27                         ` [kernel-hardening] " Al Viro
2018-01-19  3:27                           ` Al Viro
2018-01-17 19:26               ` [kernel-hardening] Re: [PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths Linus Torvalds
2018-01-17 19:26                 ` Linus Torvalds
2018-01-17 20:01                 ` [kernel-hardening] " Eric Dumazet
2018-01-17 20:01                   ` Eric Dumazet
2018-01-18 16:38                 ` [kernel-hardening] " Christoph Hellwig
2018-01-18 16:38                   ` Christoph Hellwig
2018-01-18 16:49                   ` [kernel-hardening] " Linus Torvalds
2018-01-18 16:49                     ` Linus Torvalds
2018-01-18 18:12                     ` [kernel-hardening] " Al Viro
2018-01-18 18:12                       ` Al Viro
2018-01-17  4:30         ` [kernel-hardening] " Dan Williams
2018-01-17  4:30           ` Dan Williams
2018-01-17  6:28           ` [kernel-hardening] " Al Viro
2018-01-17  6:28             ` Al Viro
2018-01-17  6:50             ` [kernel-hardening] " Dan Williams
2018-01-17  6:50               ` Dan Williams
2018-01-17 10:07               ` [kernel-hardening] " David Laight
2018-01-17 10:07                 ` David Laight
2018-01-17 18:12               ` [kernel-hardening] " Dan Williams
2018-01-17 18:12                 ` Dan Williams
2018-01-17 19:16           ` [kernel-hardening] " Linus Torvalds
2018-01-17 19:16             ` Linus Torvalds
2018-01-13 18:18 ` [kernel-hardening] [PATCH v3 9/9] vfs, fdtable: prevent bounds-check bypass via speculative execution Dan Williams
2018-01-13 18:18   ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180118030634.GY13338@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=alan@linux.intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=davem@davemloft.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.