linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Laight <David.Laight@ACULAB.COM>
To: 'Linus Torvalds' <torvalds@linux-foundation.org>,
	Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: RE: [PATCH 03/11] fs: add new read_uptr and write_uptr file operations
Date: Tue, 30 Jun 2020 07:51:29 +0000	[thread overview]
Message-ID: <025de688a10d459489e8110a108fed43@AcuMS.aculab.com> (raw)
In-Reply-To: <CAHk-=wj_Br5dQt0GnMjHooSvBbVXwtGRVKQNkpCLwWjYko-4Zw@mail.gmail.com>

From: Linus Torvalds
> Sent: 29 June 2020 18:03
> On Mon, Jun 29, 2020 at 8:29 AM Christoph Hellwig <hch@lst.de> wrote:
> >
> > So based on that I'd rather get away without our flag and tag the
> > kernel pointer case in setsockopt explicitly.
> 
> Yeah, I'd be ok to pass that kind of flag around for setsockopt, in
> ways I _don't_ want to do for some very core vfs thing like 'read()'.
> 
> That said, is there no practical limit on how big "optlen" can be?
> Sure, I realize that a lot of setsockopt users may not use all of the
> data, but let's say that "optlen" is 128, but the actual low-level
> setsockopt operation only uses the first 16 bytes, maybe we could
> always just copy the 128 bytes from user space into kernel space, and
> just say "setsockopt() always gets a kernel pointer".
> 
> Then the bpf use is even simpler. It would just pass the kernel
> pointer natively.
> 
> Because that seems to be what the BPF code really wants to do: it
> takes the user optval, and munges it into a kernel optval, and then
> (if that has been done) runs the low-level sock_setsockopt() under
> KERNEL_DS.
> 
> Couldn't we switch things around instead, and just *always* copy
> things from user space, and sock_setsockopt (and
> sock->ops->setsockopt) _always_ get a kernel buffer?
> 
> And avoid the set_fs(KERNEL_DS) games entirely that way?

I did a patch for SCTP to do the copies in the protocol wrapper.
Apart from the issue of bad applications providing overlarge
buffers and effecting a local DoS attack there were some odd issues:

1) SCTP completely abuses both setsockopt and getsockopt
   to perform additional socket operations.
   I suspect the original implementation didn't want to
   add new system calls.
2) SCTP treats getsockopt as RMW on the user buffer.
   Mostly it only needs 4 bytes, but it can in include
   a sockaddr_storage.
3) SCTP has one getsockopt that is really a setsockopt
   (ie changes things) but is implemented using getsockopt
   so that it can return a value.
4) One of the SCTP getsockopt calls has to return the
   'wrong' value to userspace (ie not the length of the
   transferred data) for compatibility with the orginal
   broken code.

I'm wondering if the [sg]etsockopt wrapper should actually
pass through a structure containing:
	Kernel buffer address (on stack if short)
	User buffer address (may be NULL)
	Length of buffer
	copy_to_user length (normally zero)
	flag: embedded pointers are user/kernel

Most code will just use the kernel buffer and return length/error.

Code that knows the supplied length is invalid can use the
user pointer - but only support direct user requests.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

  parent reply	other threads:[~2020-06-30  7:51 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-24 16:28 [RFC] stop using ->read and ->write for kernel access Christoph Hellwig
2020-06-24 16:28 ` [PATCH 01/11] uptr: add a new "universal pointer" type Christoph Hellwig
2020-06-24 16:28 ` [PATCH 02/11] fs: factor out a set_fmode_can_read_write helper Christoph Hellwig
2020-06-24 16:28 ` [PATCH 03/11] fs: add new read_uptr and write_uptr file operations Christoph Hellwig
2020-06-24 17:19   ` Linus Torvalds
2020-06-24 17:55     ` Christoph Hellwig
2020-06-24 18:11       ` Linus Torvalds
2020-06-24 18:14         ` Christoph Hellwig
2020-06-24 18:20           ` Linus Torvalds
2020-06-24 18:24             ` Christoph Hellwig
2020-06-24 18:29               ` Matthew Wilcox
2020-06-24 18:31                 ` Christoph Hellwig
2020-06-24 18:15         ` Linus Torvalds
2020-06-27 10:49         ` David Laight
2020-06-27 16:33           ` Linus Torvalds
2020-06-29  8:21             ` David Laight
2020-06-29 15:29             ` Christoph Hellwig
2020-06-29 17:02               ` Linus Torvalds
2020-06-29 18:07                 ` Christoph Hellwig
2020-06-29 18:29                   ` Linus Torvalds
2020-06-29 18:36                     ` Christoph Hellwig
2020-06-29 19:10                       ` Linus Torvalds
2020-06-30  7:04                         ` Christoph Hellwig
2020-06-30  7:51                 ` David Laight [this message]
2020-07-08  5:14             ` Luis Chamberlain
2020-06-24 17:56     ` Matthew Wilcox
2020-06-24 17:59       ` Christoph Hellwig
2020-06-24 18:37         ` Christoph Hellwig
2020-06-24 18:43           ` Matthew Wilcox
2020-06-24 16:28 ` [PATCH 04/11] sysctl: switch to ->{read,write}_uptr Christoph Hellwig
2020-06-24 16:28 ` [PATCH 05/11] fs: refactor new_sync_read Christoph Hellwig
2020-06-24 16:28 ` [PATCH 06/11] proc: add a read_iter method to proc proc_ops Christoph Hellwig
2020-06-24 16:28 ` [PATCH 07/11] seq_file: add seq_read_iter Christoph Hellwig
2020-06-24 16:28 ` [PATCH 09/11] proc: switch over direct seq_read method calls to seq_read_iter Christoph Hellwig
2020-06-24 16:29 ` [PATCH 10/11] fs: don't allow kernel reads and writes using ->read and ->write Christoph Hellwig
2020-06-24 16:29 ` [PATCH 11/11] fs: don't allow splice read/write without explicit ops Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=025de688a10d459489e8110a108fed43@AcuMS.aculab.com \
    --to=david.laight@aculab.com \
    --cc=hch@lst.de \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).