Re: [PATCH v3] sysfs: Unconditionally use vmalloc for buffer

From: Kees Cook <keescook@chromium.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Michal Hocko <mhocko@suse.com>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Lee Duncan <lduncan@suse.com>, Chris Leech <cleech@redhat.com>,
	Adam Nichols <adam@grimm-co.com>,
	linux-fsdevel@vger.kernel.org, linux-hardening@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] sysfs: Unconditionally use vmalloc for buffer
Date: Thu, 1 Apr 2021 00:30:24 -0700
Message-ID: <202104010022.5E7FB3069@keescook>
In-Reply-To: <YGVxzSH8fV9MwBDM@kroah.com>

On Thu, Apr 01, 2021 at 09:10:05AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Mar 31, 2021 at 11:52:20PM -0700, Kees Cook wrote:
> > On Thu, Apr 01, 2021 at 07:16:56AM +0200, Greg Kroah-Hartman wrote:
> > > On Wed, Mar 31, 2021 at 07:21:45PM -0700, Kees Cook wrote:
> > > > The sysfs interface to seq_file continues to be rather fragile
> > > > (seq_get_buf() should not be used outside of seq_file), as seen with
> > > > some recent exploits[1]. Move the seq_file buffer to the vmap area
> > > > (while retaining the accounting flag), since it has guard pages that
> > > > will catch and stop linear overflows. This seems justified given that
> > > > sysfs's use of seq_file already uses kvmalloc(), is almost always using
> > > > a PAGE_SIZE or larger allocation, has normally short-lived allocations,
> > > > and is not normally on a performance critical path.
> > > > 
> > > > Once seq_get_buf() has been removed (and all sysfs callbacks using
> > > > seq_file directly), this change can also be removed.
> > > > 
> > > > [1] https://blog.grimm-co.com/2021/03/new-old-bugs-in-linux-kernel.html
> > > > 
> > > > Signed-off-by: Kees Cook <keescook@chromium.org>
> > > > ---
> > > > v3:
> > > > - Limit to only sysfs (instead of all of seq_file).
> > > > v2: https://lore.kernel.org/lkml/20210315174851.622228-1-keescook@chromium.org/
> > > > v1: https://lore.kernel.org/lkml/20210312205558.2947488-1-keescook@chromium.org/
> > > > ---
> > > >  fs/sysfs/file.c | 23 +++++++++++++++++++++++
> > > >  1 file changed, 23 insertions(+)
> > > > 
> > > > diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> > > > index 9aefa7779b29..70e7a450e5d1 100644
> > > > --- a/fs/sysfs/file.c
> > > > +++ b/fs/sysfs/file.c
> > > > @@ -16,6 +16,7 @@
> > > >  #include <linux/mutex.h>
> > > >  #include <linux/seq_file.h>
> > > >  #include <linux/mm.h>
> > > > +#include <linux/vmalloc.h>
> > > >  
> > > >  #include "sysfs.h"
> > > >  
> > > > @@ -32,6 +33,25 @@ static const struct sysfs_ops *sysfs_file_ops(struct kernfs_node *kn)
> > > >  	return kobj->ktype ? kobj->ktype->sysfs_ops : NULL;
> > > >  }
> > > >  
> > > > +/*
> > > > + * To be proactively defensive against sysfs show() handlers that do not
> > > > + * correctly stay within their PAGE_SIZE buffer, use the vmap area to gain
> > > > + * the trailing guard page which will stop linear buffer overflows.
> > > > + */
> > > > +static void *sysfs_kf_seq_start(struct seq_file *sf, loff_t *ppos)
> > > > +{
> > > > +	struct kernfs_open_file *of = sf->private;
> > > > +	struct kernfs_node *kn = of->kn;
> > > > +
> > > > +	WARN_ON_ONCE(sf->buf);
> > > 
> > > How can buf ever not be NULL?  And if it is, we will leak memory in the
> > > next line so we shouldn't have _ONCE, we should always know, but not
> > > rebooting the machine would be nice.
> > 
> > It should never be possible. I did this because seq_file has some
> > unusual buf allocation patterns in the kernel, and I liked the cheap
> > leak check. I use _ONCE because spewing endlessly doesn't help most
> > cases. And if you want to trigger it again, you don't have to reboot:
> > https://www.kernel.org/doc/html/latest/admin-guide/clearing-warn-once.html
> 
> True, I was thinking of the panic-on-warn people, and the hesitation of
> adding new WARN_ON() to the kernel code.  If this really can happen,
> shouldn't we handle it properly?

It should never happen, but I hate silent bugs. Given the existing
pattern of "external preallocation", this seems like a fragile interface,
and one where it is worth asserting our expectations.

The panic_on_warn folks will get exactly what they wanted: immediate
feedback on "expected to be impossible" cases:
https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
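
To illustrate the "warn once per call site" behavior being discussed, here
is a minimal userspace sketch. It is not the kernel's implementation:
SKETCH_WARN_ON_ONCE(), warn_count, and buf_unexpectedly_set() are
hypothetical names made up for this example, and the macro assumes the GNU
statement-expression extension (gcc or clang).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counter so the number of emitted warnings can be observed. */
static int warn_count;

/*
 * Simplified stand-in for WARN_ON_ONCE(): evaluates to the truth value of
 * cond on every call, but reports only the first time the condition is
 * true at this particular call site (one static flag per expansion).
 */
#define SKETCH_WARN_ON_ONCE(cond) ({                            \
	static bool warned_;                                    \
	bool ret_ = !!(cond);                                   \
	if (ret_ && !warned_) {                                 \
		warned_ = true;                                 \
		warn_count++;                                   \
		fprintf(stderr, "WARNING: %s:%d: %s\n",         \
			__FILE__, __LINE__, #cond);             \
	}                                                       \
	ret_;                                                   \
})

/* Models the check in the patch: the buffer is expected to be unset. */
static bool buf_unexpectedly_set(const char *buf)
{
	return SKETCH_WARN_ON_ONCE(buf != NULL);
}
```

Calling buf_unexpectedly_set() repeatedly with a non-NULL pointer returns
true every time but prints a single warning, which is the "no endless
spew" property. The sketch only reports; it has no panic_on_warn
equivalent.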

> > > > +	sf->buf = __vmalloc(kn->attr.size, GFP_KERNEL_ACCOUNT);
> > > > +	if (!sf->buf)
> > > > +		return ERR_PTR(-ENOMEM);
> > > > +	sf->size = kn->attr.size;
> > > > +
> > > > +	return NULL + !*ppos;
> > > > +}
> > > 
> > > Will this also cause the vmalloc fragmentation/abuse that others have
> > > mentioned as userspace can trigger this?
> > 
> > If I understood the concern correctly, it was about it being a risk for
> > doing it for all seq_file uses. This version confines the changes to only
> > sysfs seq_file uses.
> 
> There are a few sysfs files that userspace can read from out there :)

Yes, but the vmap area is also used by default for process stacks, etc.
Malicious fragmentation is already possible. I understood the concern to
be about "regular" use. (And if I'm wrong, we can add a knob maybe?)
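
For anyone unfamiliar with why a trailing guard page stops a linear
overflow, here is a rough userspace analogy, assuming Linux and
mmap()/mprotect() rather than the kernel's vmap area. The helper names
alloc_with_guard() and overflow_signal() are hypothetical and exist only
for this sketch; the write is done in a child process so the fault can be
observed without killing the caller.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map `size` bytes rounded up to whole pages, followed by one
 * inaccessible (PROT_NONE) guard page. Returns NULL on failure.
 */
static char *alloc_with_guard(size_t size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = (size + page - 1) / page * page;
	char *p = mmap(NULL, len + page, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	if (mprotect(p + len, page, PROT_NONE) != 0) {
		munmap(p, len + page);
		return NULL;
	}
	return p;
}

/*
 * In a child process, write `nbytes` linearly into a one-page guarded
 * buffer. Returns the signal that killed the child, 0 if it exited
 * cleanly, or -1 on any other failure.
 */
static int overflow_signal(size_t nbytes)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		volatile char *buf = alloc_with_guard(1);

		if (!buf)
			_exit(2);
		for (size_t i = 0; i < nbytes; i++)
			buf[i] = 'A';
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (WIFSIGNALED(status))
		return WTERMSIG(status);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Writing exactly one page succeeds, while writing one byte more hits the
guard page and faults immediately instead of silently corrupting whatever
happens to sit next to the buffer.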

> > > And what code frees it?
> > 
> > The existing hooks to seq_release() handle this already. This kind of
> > "preallocation" of the seq_file buffer is done in a few places already
> > (hence my desire for the sanity checking WARN lest future seq_file
> > semantics change).
> 
> Ah, "magic", gotta love it...

Yeeeah. :P

-- 
Kees Cook
