netdev.vger.kernel.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-kernel@vger.kernel.org, Jason Wang <jasowang@redhat.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH RFC] uaccess: user_access_begin_after_access_ok()
Date: Thu, 4 Jun 2020 06:10:23 -0400
Message-ID: <20200604054516-mutt-send-email-mst@kernel.org>
In-Reply-To: <20200603165205.GU23230@ZenIV.linux.org.uk>

On Wed, Jun 03, 2020 at 05:52:05PM +0100, Al Viro wrote:
> On Wed, Jun 03, 2020 at 01:29:00AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jun 03, 2020 at 02:48:15AM +0100, Al Viro wrote:
> > > On Tue, Jun 02, 2020 at 04:45:05AM -0400, Michael S. Tsirkin wrote:
> > > > So vhost needs to poke at userspace *a lot* in a quick succession.  It
> > > > is thus beneficial to enable userspace access, do our thing, then
> > > > disable. Except access_ok has already been pre-validated with all the
> > > > relevant nospec checks, so we don't need that.  Add an API to allow
> > > > userspace access after access_ok and barrier_nospec are done.
> > > 
> > > BTW, what are you going to do about vq->iotlb != NULL case?  Because
> > > you sure as hell do *NOT* want e.g. translate_desc() under STAC.
> > > Disable it around the calls of translate_desc()?
> > > 
> > > How widely do you hope to stretch the user_access areas, anyway?
> > 
> > So ATM I'm looking at adding support for the packed ring format.
> > That does something like:
> > 
> > get_user(flags, desc->flags)
> > smp_rmb()
> > if (flags & VALID)
> > copy_from_user(&adesc, desc, sizeof adesc);
> > 
> > this would be a good candidate I think.
> 
> Perhaps, once we get stac/clac out of raw_copy_from_user() (coming cycle,
> probably).

That sounds good. Presumably raw_copy_from_user will be smart enough
to optimize aligned 2-byte accesses for the flags read above?

>  BTW, how large is the structure and how is it aligned?

It's a batch of 16-byte structures, aligned to 16 bytes.
At the moment I use a batch size of 64, which seems to be enough.
Ideally we'd actually read the whole batch in one go,
without stac/clac back and forth. E.g.

	struct vring_desc adesc[64] = {};

	stac();
	for (i = 0; i < 64; ++i) {
		/* read the flags first, then the rest of the descriptor */
		get_user(flags, &desc[i].flags);
		smp_rmb();
		if (!(flags & VALID))
			break;
		copy_from_user(&adesc[i], desc + i, sizeof(adesc[i]));
	}
	clac();
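
For anyone who wants to play with the access pattern outside the kernel,
here is a minimal userspace sketch of the same loop. It is only an
illustration under stated assumptions: struct demo_desc, DEMO_F_VALID,
DEMO_BATCH and demo_fetch_batch() are hypothetical names, the field layout
is assumed, a plain memcpy() stands in for copy_from_user(), and a C11
acquire fence stands in for smp_rmb(). There is no stac/clac equivalent
in userspace, so that part is simply absent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte descriptor; the field layout is an assumption. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};
static_assert(sizeof(struct demo_desc) == 16, "descriptor must be 16 bytes");

#define DEMO_F_VALID 0x1u	/* hypothetical "descriptor is ready" bit */
#define DEMO_BATCH   64		/* batch size mentioned in the mail */

/*
 * Copy up to DEMO_BATCH descriptors from ring[] into out[], stopping at
 * the first descriptor whose VALID bit is clear.
 * Returns the number of descriptors copied.
 */
static size_t demo_fetch_batch(const struct demo_desc *ring,
			       struct demo_desc *out, size_t n)
{
	size_t i;

	if (n > DEMO_BATCH)
		n = DEMO_BATCH;

	for (i = 0; i < n; i++) {
		/* Read the flags on their own first. */
		uint16_t flags = *(const volatile uint16_t *)&ring[i].flags;

		/* Stand-in for smp_rmb(): flags read before the body read. */
		atomic_thread_fence(memory_order_acquire);

		if (!(flags & DEMO_F_VALID))
			break;

		/* Stand-in for copy_from_user() of the whole descriptor. */
		memcpy(&out[i], &ring[i], sizeof(out[i]));
	}
	return i;
}
```

With a ring whose first two entries have DEMO_F_VALID set and whose third
does not, demo_fetch_batch() copies two descriptors and stops, regardless
of what follows the first invalid entry.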

> > > BTW, speaking of possible annotations: looks like there's a large
> > > subset of call graph that can be reached only from vhost_worker()
> > > or from several ioctls, with all uaccess limited to that subgraph
> > > (thankfully).  Having that explicitly marked might be a good idea...
> > 
> > Sure. What's a good way to do that though? Any examples to follow?
> > Or do you mean code comments?
> 
> Not sure...  FWIW, the part of call graph from "known to be only
> used by vhost_worker" (->handle_kick/vhost_work_init callback/
> vhost_poll_init callback) and "part of ->ioctl()" to actual uaccess
> primitives is fairly large - the longest chain is
> handle_tx_net ->
>   handle_tx ->
>     handle_tx_zerocopy ->
>       get_tx_bufs ->
> 	vhost_net_tx_get_vq_desc ->
> 	  vhost_tx_batch ->
> 	    vhost_net_signal_used ->
> 	      vhost_add_used_and_signal_n ->
> 		vhost_signal ->
> 		  vhost_notify ->
> 		    vhost_get_avail_flags ->
> 		      vhost_get_avail ->
> 			vhost_get_user ->
> 			  __get_user()
> i.e. 14 levels deep and the graph doesn't factorize well...
> 
> Something along the lines of "all callers of thus annotated function
> must be annotated the same way themselves, any implicit conversion
> of pointers to such functions to anything other than boolean yields
> a warning, explicit cast is allowed only with __force", perhaps?
> Then slap such annotations on vhost_{get,put,copy_to,copy_from}_user(),
> on ->handle_kick(), a force-cast in the only caller of ->handle_kick()
> and force-casts in the 3 callers in ->ioctl().
> 
> And propagate the annotations until the warnings stop, basically...
> 
> Shouldn't be terribly hard to teach sparse that kind of stuff and it
> might be useful elsewhere.  It would act as a qualifier on function
> pointers, with syntax ultimately expanding to __attribute__((something)).
> I'll need to refresh my memories of the parser, but IIRC that shouldn't
> require serious modifications.  Most of the work would be in
> evaluate_call(), just before calling evaluate_symbol_call()...
> I'll look into that; not right now, though.

Thanks, that does sound useful!
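
To make the "propagate the annotations until the warnings stop" idea a bit
more concrete, here is a toy model of it in plain C. This is not sparse
code and says nothing about how sparse would implement it; struct
demo_call, demo_propagate() and the "forced" flag are hypothetical names
used only for illustration. A call edge marked as forced models an
explicit __force cast and stops the propagation at that caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* One edge of a toy call graph; functions are identified by index. */
struct demo_call {
	size_t caller;
	size_t callee;
	bool forced;	/* models an explicit __force cast at the call site */
};

/*
 * Repeatedly annotate every caller of an annotated function until nothing
 * changes any more (a simple fixpoint iteration). Forced edges are skipped,
 * so the annotation does not spread past them.
 * Returns how many functions had to be newly annotated, i.e. how many
 * warnings the first pass of a checker would have produced in total.
 */
static size_t demo_propagate(bool annotated[], size_t nfuncs,
			     const struct demo_call *calls, size_t ncalls)
{
	size_t added = 0;
	bool changed = true;

	while (changed) {
		changed = false;
		for (size_t i = 0; i < ncalls; i++) {
			size_t caller = calls[i].caller;
			size_t callee = calls[i].callee;

			/* Ignore malformed edges and explicit force-casts. */
			if (caller >= nfuncs || callee >= nfuncs)
				continue;
			if (calls[i].forced)
				continue;

			if (annotated[callee] && !annotated[caller]) {
				annotated[caller] = true;
				added++;
				changed = true;
			}
		}
	}
	return added;
}
```

Starting with only the leaf annotated (think vhost_get_user()) and a
forced edge at the top of the chain (think the single caller of
->handle_kick()), every function in between ends up annotated and the
forced caller stays unannotated.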

> BTW, __vhost_get_user() might be better off expanded in both callers -
> that would get their structure similar to vhost_copy_{to,from}_user(),
> especially if you expand __vhost_get_user_slow() as well.

I agree, that does sound like a good cleanup.

> Not sure I understand what's going with ->meta_iotlb[] - what are the
> lifetime rules for struct vhost_iotlb_map and what prevents the pointers
> from going stale?

It can be zeroed at any point.
We just try to call __vhost_vq_meta_reset whenever anything can go
stale.

