Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Kees Cook <keescook@chromium.org>
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Lee Duncan <lduncan@suse.com>, Chris Leech <cleech@redhat.com>,
	Adam Nichols <adam@grimm-co.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-hardening@vger.kernel.org,
	Uladzislau Rezki <urezki@gmail.com>
Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer
Date: Thu, 18 Mar 2021 18:56:29 +0100	[thread overview]
Message-ID: <YFOUTZNeeIbq3cSw@kroah.com> (raw)
In-Reply-To: <202103180847.53EB96C@keescook>

On Thu, Mar 18, 2021 at 08:51:45AM -0700, Kees Cook wrote:
> On Thu, Mar 18, 2021 at 09:07:45AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Mar 17, 2021 at 02:30:47PM -0700, Kees Cook wrote:
> > > On Wed, Mar 17, 2021 at 04:38:57PM +0100, Greg Kroah-Hartman wrote:
> > > > On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> > > > > On Wed 17-03-21 15:56:44, Greg KH wrote:
> > > > > > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > > > > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > > > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > > > > > doing.
> > > > > > > > 
> > > > > > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > > > > > way to stop "abuse" like this?
> > > > > > > 
> > > > > > > What is wrong about using seq_* interface directly?
> > > > > > 
> > > > > > Right now every show() callback of sysfs would have to be changed :(
> > > > > 
> > > > > Is this really the case? Would it be too ugly to have an intermediate
> > > > > buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.
> > > > 
> > > > Oh, good idea.
> > > > 
> > > > > Sure one copy more than necessary but it this shouldn't be a hot path or
> > > > > even visible on small strings. So that might be worth destroying an
> > > > > inherently dangerous seq API (seq_get_buf).
> > > > 
> > > > I'm all for that, let me see if I can carve out some time tomorrow to
> > > > try this out.
> > > 
> > > The trouble has been that C string APIs are just so impossibly fragile.
> > > We just get too many bugs with it, so we really do need to rewrite the
> > > callbacks to use seq_file, since it has a safe API.
> > > 
> > > I've been trying to write coccinelle scripts to do some of this
> > > refactoring, but I have not found a silver bullet. (This is why I've
> > > suggested adding the temporary "seq_show" and "seq_store" functions, so
> > > we can transition all the callbacks without a flag day.)
> > > 
> > > > But, you don't get rid of the "ability" to have a driver write more than
> > > > a PAGE_SIZE into the buffer passed to it.  I guess I could be paranoid
> > > > and do some internal checks (allocate a bunch of memory and check for
> > > > overflow by hand), if this is something to really be concerned about...
> > > 
> > > Besides the CFI prototype enforcement changes (which I can build into
> > > the new seq_show/seq_store callbacks), the buffer management is the
> > > primary issue: we just can't hand drivers a string (even with a length)
> > > because the C functions are terrible. e.g. just look at the snprintf vs
> > > scnprintf -- we constantly have to just build completely new API when
> > > what we need is a safe way (i.e. obfuscated away from the caller) to
> > > build a string. Luckily seq_file does this already, so leaning into that
> > > is good here.
> > 
> > But, is it really worth the churn here?
> > 
> > Yes, strings in C is "hard", but this _should_ be a simple thing for any
> > driver to handle:
> > 	return sysfs_emit(buffer, "%d\n", my_dev->value);
> > 
> > To change that to:
> > 	return seq_printf(seq, "%d\n", my_dev->value);
> > feels very much "don't we have other more valuable things we could be
> > doing?"
> > 
> > So far we have found 1 driver that messed up and overflowed the buffer
> > that I know of.  While reworking apis to make it "hard to get wrong" is
> > a great goal, the work involved here vs. any "protection" feels very
> > low.
> 
> I haven't been keeping a list, but it's not the only one. The _other_
> reason we need seq_file is so we can perform checks against f_cred for
> things like %p obfuscation (as was needed for modules that I hacked
> around) and is needed a proper bug fix for the kernel pointer exposure
> bug from the same batch. So now I'm up to 3 distinct reasons that the
> sysfs API is lacking -- I think it's worth the churn and time.

Ok, if you think so.

But if we do this, can we not do a "raw" seqfile api?  I would like to
see only 1 function that works like sysfs_emit() does.  Perhaps:
	void sysfs_printf(struct attribute *attr, const char *fmt, ...);

and then from there we can "derive" things like:
	void device_printf(struct device_attribute *attr, const char *fmt, ...);

You can "hide" the needed seq_file structure in the attribute structure
for the buffer management, but I don't think we need the crazy multiple
ways that seq_printf() has morphed into over the years, right?

seq_path() anyone?

binary attribute files are a totally different thing, and probably can
just be left alone for now.

> > How about moving everyone to sysfs_emit() first?  That way it becomes
> > much more "obvious" when drivers are doing stupid things with their
> > sysfs buffer.  But even then, it would not have caught the iscsi issue
> > as that was printing a user-provided string so maybe I'm just feeling
> > grumpy about the potential churn here...
> 
> I need to fix the prototypes for CFI sanity too. Switching to seq_file
> solves 2 problems, and if we have to change the prototype once for that,
> we can include the prototype fixes for CFI at the same time to avoid
> double the churn.

Yes, let's not go through this twice...

thanks,

greg k-h

next prev parent reply	other threads:[~2021-03-18 17:57 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-15 17:48 [PATCH v2] seq_file: Unconditionally use vmalloc for buffer Kees Cook
2021-03-15 18:33 ` Al Viro
2021-03-15 20:43   ` Kees Cook
2021-03-16  7:24     ` Greg Kroah-Hartman
2021-03-16 12:43       ` Al Viro
2021-03-16 12:55         ` Greg Kroah-Hartman
2021-03-16 13:01         ` Michal Hocko
2021-03-16 19:18         ` Kees Cook
2021-03-17 10:44           ` Greg Kroah-Hartman
2021-03-16  8:31 ` Michal Hocko
2021-03-16 19:08   ` Kees Cook
2021-03-17 12:08     ` Michal Hocko
2021-03-17 13:34       ` Greg Kroah-Hartman
2021-03-17 14:44         ` Michal Hocko
2021-03-17 14:56           ` Greg Kroah-Hartman
2021-03-17 15:20             ` Michal Hocko
2021-03-17 15:38               ` Greg Kroah-Hartman
2021-03-17 15:48                 ` Michal Hocko
2021-03-17 21:30                 ` Kees Cook
2021-03-18  8:07                   ` Greg Kroah-Hartman
2021-03-18 15:51                     ` Kees Cook
2021-03-18 17:56                       ` Greg Kroah-Hartman [this message]
2021-03-19 14:07 ` [seq_file] 5fd6060e50: stress-ng.eventfd.ops_per_sec -49.1% regression kernel test robot
2021-03-19 14:07   ` kernel test robot
2021-03-19 19:31   ` Kees Cook
2021-03-19 19:31     ` Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YFOUTZNeeIbq3cSw@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=adam@grimm-co.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cleech@redhat.com \
    --cc=keescook@chromium.org \
    --cc=lduncan@suse.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.