virtualization.lists.linux-foundation.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Linus Walleij" <linus.walleij@linaro.org>,
	LKML <linux-kernel@vger.kernel.org>,
	virtualization@lists.linux-foundation.org,
	"Sjur Brændeland" <sjur.brandeland@stericsson.com>
Subject: Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.
Date: Wed, 16 Jan 2013 10:16:06 +0200
Message-ID: <20130116081606.GB11465@redhat.com>
In-Reply-To: <874nihzuoz.fsf@rustcorp.com.au>

On Wed, Jan 16, 2013 at 01:43:32PM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >> +static int resize_iovec(struct vringh_iov *iov, gfp_t gfp)
> >> +{
> >> +	struct iovec *new;
> >> +	unsigned int new_num = iov->max * 2;
> >
> > I think we must limit this, since it is coming
> > from userspace. How about UIO_MAXIOV?
> 
> We limit it to the ring size already;

1. do we limit it in case there's a loop in the descriptor ring?
2. do we limit it in case there are indirect descriptors?
I guess I missed where we do this; could you point it out to me?
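
To make question 1 concrete, here is a minimal userspace sketch of one
way to bound a descriptor chain walk so that a loop cannot grow the iov
without limit. It is not the vringh code from the patch: struct
sketch_desc and sketch_walk_chain() are hypothetical names, and
indirect tables are not handled at all.

```c
#include <stdint.h>
#include <errno.h>

#define SKETCH_DESC_F_NEXT 1	/* same value as VRING_DESC_F_NEXT */

/* Hypothetical stand-in for struct vring_desc. */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/*
 * Walk the chain starting at 'head' in a ring of 'num' descriptors.
 * Returns the chain length, -EINVAL if an index is out of range, or
 * -ELOOP if more than 'num' descriptors were visited (which can only
 * happen when the chain loops back on itself).
 */
static int sketch_walk_chain(const struct sketch_desc *ring,
			     unsigned int num, unsigned int head)
{
	unsigned int i = head, seen = 0;

	for (;;) {
		if (i >= num)
			return -EINVAL;
		if (++seen > num)
			return -ELOOP;
		if (!(ring[i].flags & SKETCH_DESC_F_NEXT))
			return (int)seen;
		i = ring[i].next;
	}
}
```

With a counter like this the ring size really is an upper bound on the
direct chain; indirect descriptors would need their own bound.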

> UIO_MAXIOV is a weird choice here.

It's kind of forced by the need to pass the iov on to the Linux kernel,
so we know that any guest using more is broken on existing hypervisors.

Ring size is somewhat arbitrary too, isn't it?  A huge ring where we
post lots of short descriptors (e.g. RX buffers) seems like a valid thing to do.
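
For illustration, a userspace sketch of what capping the doubling could
look like. The struct and function names are hypothetical, malloc and
realloc stand in for the kernel allocators, and SKETCH_MAXIOV simply
mirrors the value of UIO_MAXIOV (1024); none of this is taken from the
patch itself.

```c
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/uio.h>

#define SKETCH_MAXIOV 1024	/* mirrors the kernel's UIO_MAXIOV */

/* Hypothetical stand-in for struct vringh_iov. */
struct sketch_vringh_iov {
	struct iovec *iov;
	unsigned int i, max;
	int allocated;		/* 0 while iov points at the caller's array */
};

/* Double the iovec array, but never beyond SKETCH_MAXIOV entries. */
static int sketch_resize_iovec(struct sketch_vringh_iov *iov)
{
	struct iovec *new;
	unsigned int new_num;

	if (iov->max >= SKETCH_MAXIOV)
		return -E2BIG;	/* already at the cap: refuse to grow */

	new_num = iov->max ? iov->max * 2 : 8;
	if (new_num > SKETCH_MAXIOV)
		new_num = SKETCH_MAXIOV;

	if (iov->allocated) {
		new = realloc(iov->iov, new_num * sizeof(*new));
	} else {
		/* First growth: copy out of the caller-supplied array. */
		new = malloc(new_num * sizeof(*new));
		if (new && iov->i)
			memcpy(new, iov->iov, iov->i * sizeof(*new));
	}
	if (!new)
		return -ENOMEM;

	iov->iov = new;
	iov->max = new_num;
	iov->allocated = 1;
	return 0;
}
```

The cap is independent of the ring size, so a huge ring of short
descriptors still works as long as no single request needs more than
SKETCH_MAXIOV entries.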

> >> +static u16 __cold return_from_indirect(const struct vringh *vrh, int *up_next,
> >> +				       struct vring_desc **descs, int *desc_max)
> >
> > Not sure it should be cold like that - virtio net uses indirect on data
> > path.
> 
> This is only when we have a chained, indirect descriptor (ie. we have to
> go back up to the next entry in the main descriptor table).  That's
> allowed in the spec, but no one does it.
>
> >> +		/* Make sure it's OK, and get offset. */
> >> +		if (!check_range(desc.addr, desc.len, &range, getrange)) {
> >> +			err = -EINVAL;
> >> +			goto fail;
> >> +		}
> >
> > Hmm this looks like it will translate and
> > validate immediate descriptors same way as indirect ones.
> > vhost-net has different translation for regular descriptors
> > and indirect ones, both for speed and to allow ring aliasing,
> > so it has to know which is which.
> 
> I see translate_desc() in both cases, what's different?
>
> >> +		addr = (void *)(long)desc.addr + range.offset;
> >
> > I really dislike raw pointers that we must never dereference.
> > Since we are forcing everything to __user anyway, why don't we
> > tag all addresses as __user? The kernel users of this API
> > can cast that away, this will keep the casts to minimum.
> >
> > Failing that, we can add our own class
> > # define __virtio         __attribute__((noderef, address_space(2)))
> 
> In this case, perhaps we should leave addr as a u64?

Point being? All users will cast to a pointer.
At first it seems that passing in raw pointers is cleaner,
but it turns out that in the API we are passing iovs around,
and they are __user anyway.
So raw pointers do not buy us anything here;
let's use __user and gain extra static checks at no cost.


> >> +		iov->iov[iov->i].iov_base = (__force __user void *)addr;
> >> +		iov->iov[iov->i].iov_len = desc.len;
> >> +		iov->i++;
> >
> >
> > This looks like it won't do the right thing if desc.len spans multiple
> > ranges. I don't know if this happens in practice but this is something
> > vhost supports ATM.
> 
> Well, kind of.  I assumed that the bool (*getrange)(u64, struct
> vringh_range *)) callback would meld any adjacent ranges if it needs to.

Confused. If guest addresses 0 to 0x1000 map to virtual addresses 0 to
0x1000, and guest addresses 0x1000 to 0x2000 map to virtual addresses
0x2000 to 0x3000, then a single descriptor covering 0 to 0x2000 in the
guest needs two iov entries. What can getrange do about it?
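
A minimal userspace sketch of that case, assuming a hypothetical
two-entry translation table that matches the example above (range ends
are stored inclusive, so "0 to 0x1000" becomes 0..0xfff). The names
sketch_range, sketch_iov and sketch_split_desc() are made up for
illustration and are not part of the patch.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical guest range plus the offset to its host virtual address. */
struct sketch_range {
	uint64_t start;		/* first guest address covered */
	uint64_t end_incl;	/* last guest address covered */
	uint64_t offset;	/* host address = guest address + offset */
};

struct sketch_iov {
	uint64_t base;
	uint64_t len;
};

/* 0..0xfff -> 0..0xfff, 0x1000..0x1fff -> 0x2000..0x2fff */
static const struct sketch_range sketch_table[] = {
	{ 0x0000, 0x0fff, 0x0000 },
	{ 0x1000, 0x1fff, 0x1000 },
};

static bool sketch_getrange(uint64_t addr, struct sketch_range *r)
{
	size_t n = sizeof(sketch_table) / sizeof(sketch_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (addr >= sketch_table[i].start &&
		    addr <= sketch_table[i].end_incl) {
			*r = sketch_table[i];
			return true;
		}
	}
	return false;
}

/*
 * Translate one descriptor into iov entries, starting a new entry at
 * every range boundary. Returns the number of entries used, or -1 if
 * part of the descriptor is unmapped or 'max' entries are not enough.
 */
static int sketch_split_desc(uint64_t addr, uint64_t len,
			     struct sketch_iov *iov, int max)
{
	int n = 0;

	while (len) {
		struct sketch_range r;
		uint64_t chunk;

		if (n == max || !sketch_getrange(addr, &r))
			return -1;

		/* Bytes left in this range, clamped to what remains. */
		chunk = r.end_incl - addr + 1;
		if (chunk > len)
			chunk = len;

		iov[n].base = addr + r.offset;
		iov[n].len = chunk;
		n++;

		addr += chunk;
		len -= chunk;
	}
	return n;
}
```

Here the two host ranges are not contiguous, so the caller has to emit
two entries; melding inside the lookup callback cannot hide that.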


> >> +/* All the information about an iovec. */
> >> +struct vringh_iov {
> >> +	struct iovec *iov;
> >> +	unsigned i, max;
> >> +	bool allocated;
> >
> > Maybe set iov = NULL when not allocated?
> 
> The idea was that iov points to the initial (on-stack?) iov, for the
> fast path.
> 
> I'm writing a more complete test at the moment, then I will look at how
> this fits with vhost.c as it stands...
> 
> Cheers,
> Rusty.
