All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: ashish.kalra@amd.com, file@sect.tu-berlin.de,
	kvm <kvm@vger.kernel.org>,
	konrad.wilk@oracle.com,
	linux-kernel <linux-kernel@vger.kernel.org>,
	virtualization <virtualization@lists.linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: Re: [RFC PATCH V2 0/7] Do not read from descripto ring
Date: Fri, 14 May 2021 07:36:12 -0400	[thread overview]
Message-ID: <20210514073452-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACycT3u+hQbDJtf5gxS1NVVpiTffMz1skuhTExy5d_oRjYKoxg@mail.gmail.com>

On Fri, May 14, 2021 at 07:27:22PM +0800, Yongji Xie wrote:
> On Fri, May 14, 2021 at 7:17 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Fri, May 14, 2021 at 03:29:20PM +0800, Jason Wang wrote:
> > > On Fri, May 14, 2021 at 12:27 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > On Fri, Apr 23, 2021 at 04:09:35PM +0800, Jason Wang wrote:
> > > > > Sometimes, the driver doesn't trust the device. This is usually
> > > > > happens for the encrtpyed VM or VDUSE[1].
> > > >
> > > > Thanks for doing this.
> > > >
> > > > Can you describe the overall memory safety model that virtio drivers
> > > > must follow?
> > >
> > > My understanding is that, basically the driver should not trust the
> > > device (since the driver doesn't know what kind of device that it
> > > tries to drive)
> > >
> > > 1) For any read only metadata (required at the spec level) which is
> > > mapped as coherent, driver should not depend on the metadata that is
> > > stored in a place that could be wrote by the device. This is what this
> > > series tries to achieve.
> > > 2) For other metadata that is produced by the device, need to make
> > > sure there's no malicious device triggered behavior, this is somehow
> > > similar to what vhost did. No DOS, loop, kernel bug and other stuffs.
> > > 3) swiotb is a must to enforce memory access isolation. (VDUSE or encrypted VM)
> > >
> > > > For example:
> > > >
> > > > - Driver-to-device buffers must be on dedicated pages to avoid
> > > >   information leaks.
> > >
> > > It looks to me if swiotlb is used, we don't need this since the
> > > bouncing is not done at byte not page.
> > >
> > > But if swiotlb is not used, we need to enforce this.
> > >
> > > >
> > > > - Driver-to-device buffers must be on dedicated pages to avoid memory
> > > >   corruption.
> > >
> > > Similar to the above.
> > >
> > > >
> > > > When I say "pages" I guess it's the IOMMU page size that matters?
> > > >
> > >
> > > And the IOTLB page size.
> > >
> > > > What is the memory access granularity of VDUSE?
> > >
> > > It has an swiotlb, but the access and bouncing is done per byte.
> > >
> > > >
> > > > I'm asking these questions because there is driver code that exposes
> > > > kernel memory to the device and I'm not sure it's safe. For example:
> > > >
> > > >   static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr,
> > > >                   struct scatterlist *data_sg, bool have_data)
> > > >   {
> > > >           struct scatterlist hdr, status, *sgs[3];
> > > >           unsigned int num_out = 0, num_in = 0;
> > > >
> > > >           sg_init_one(&hdr, &vbr->out_hdr, sizeof(vbr->out_hdr));
> > > >                             ^^^^^^^^^^^^^
> > > >           sgs[num_out++] = &hdr;
> > > >
> > > >           if (have_data) {
> > > >                   if (vbr->out_hdr.type & cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT))
> > > >                           sgs[num_out++] = data_sg;
> > > >                   else
> > > >                           sgs[num_out + num_in++] = data_sg;
> > > >           }
> > > >
> > > >           sg_init_one(&status, &vbr->status, sizeof(vbr->status));
> > > >                                ^^^^^^^^^^^^
> > > >           sgs[num_out + num_in++] = &status;
> > > >
> > > >           return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
> > > >   }
> > > >
> > > > I guess the drivers don't need to be modified as long as swiotlb is used
> > > > to bounce the buffers through "insecure" memory so that the memory
> > > > surrounding the buffers is not exposed?
> > >
> > > Yes, swiotlb won't bounce the whole page. So I think it's safe.
> >
> > Thanks Jason and Yongji Xie for clarifying. Seems like swiotlb or a
> > similar mechanism can handle byte-granularity isolation so the drivers
> > not need to worry about information leaks or memory corruption outside
> > the mapped byte range.
> >
> > We still need to audit virtio guest drivers to ensure they don't trust
> > data that can be modified by the device. I will look at virtio-blk and
> > virtio-fs next week.
> >
> 
> Oh, that's great. Thank you!
> 
> I also did some audit work these days and will send a new version for
> reviewing next Monday.
> 
> Thanks,
> Yongji

Doing it in a way that won't hurt performance for simple
configs that trust the device is a challenge though.
Pls take a look at the discussion with Christoph for some ideas
on how to do this.


-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	virtualization <virtualization@lists.linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	file@sect.tu-berlin.de, ashish.kalra@amd.com,
	konrad.wilk@oracle.com, kvm <kvm@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: Re: [RFC PATCH V2 0/7] Do not read from descripto ring
Date: Fri, 14 May 2021 07:36:12 -0400	[thread overview]
Message-ID: <20210514073452-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACycT3u+hQbDJtf5gxS1NVVpiTffMz1skuhTExy5d_oRjYKoxg@mail.gmail.com>

On Fri, May 14, 2021 at 07:27:22PM +0800, Yongji Xie wrote:
> On Fri, May 14, 2021 at 7:17 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Fri, May 14, 2021 at 03:29:20PM +0800, Jason Wang wrote:
> > > On Fri, May 14, 2021 at 12:27 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > On Fri, Apr 23, 2021 at 04:09:35PM +0800, Jason Wang wrote:
> > > > > Sometimes, the driver doesn't trust the device. This is usually
> > > > > happens for the encrtpyed VM or VDUSE[1].
> > > >
> > > > Thanks for doing this.
> > > >
> > > > Can you describe the overall memory safety model that virtio drivers
> > > > must follow?
> > >
> > > My understanding is that, basically the driver should not trust the
> > > device (since the driver doesn't know what kind of device that it
> > > tries to drive)
> > >
> > > 1) For any read only metadata (required at the spec level) which is
> > > mapped as coherent, driver should not depend on the metadata that is
> > > stored in a place that could be wrote by the device. This is what this
> > > series tries to achieve.
> > > 2) For other metadata that is produced by the device, need to make
> > > sure there's no malicious device triggered behavior, this is somehow
> > > similar to what vhost did. No DOS, loop, kernel bug and other stuffs.
> > > 3) swiotb is a must to enforce memory access isolation. (VDUSE or encrypted VM)
> > >
> > > > For example:
> > > >
> > > > - Driver-to-device buffers must be on dedicated pages to avoid
> > > >   information leaks.
> > >
> > > It looks to me if swiotlb is used, we don't need this since the
> > > bouncing is not done at byte not page.
> > >
> > > But if swiotlb is not used, we need to enforce this.
> > >
> > > >
> > > > - Driver-to-device buffers must be on dedicated pages to avoid memory
> > > >   corruption.
> > >
> > > Similar to the above.
> > >
> > > >
> > > > When I say "pages" I guess it's the IOMMU page size that matters?
> > > >
> > >
> > > And the IOTLB page size.
> > >
> > > > What is the memory access granularity of VDUSE?
> > >
> > > It has an swiotlb, but the access and bouncing is done per byte.
> > >
> > > >
> > > > I'm asking these questions because there is driver code that exposes
> > > > kernel memory to the device and I'm not sure it's safe. For example:
> > > >
> > > >   static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr,
> > > >                   struct scatterlist *data_sg, bool have_data)
> > > >   {
> > > >           struct scatterlist hdr, status, *sgs[3];
> > > >           unsigned int num_out = 0, num_in = 0;
> > > >
> > > >           sg_init_one(&hdr, &vbr->out_hdr, sizeof(vbr->out_hdr));
> > > >                             ^^^^^^^^^^^^^
> > > >           sgs[num_out++] = &hdr;
> > > >
> > > >           if (have_data) {
> > > >                   if (vbr->out_hdr.type & cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT))
> > > >                           sgs[num_out++] = data_sg;
> > > >                   else
> > > >                           sgs[num_out + num_in++] = data_sg;
> > > >           }
> > > >
> > > >           sg_init_one(&status, &vbr->status, sizeof(vbr->status));
> > > >                                ^^^^^^^^^^^^
> > > >           sgs[num_out + num_in++] = &status;
> > > >
> > > >           return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
> > > >   }
> > > >
> > > > I guess the drivers don't need to be modified as long as swiotlb is used
> > > > to bounce the buffers through "insecure" memory so that the memory
> > > > surrounding the buffers is not exposed?
> > >
> > > Yes, swiotlb won't bounce the whole page. So I think it's safe.
> >
> > Thanks Jason and Yongji Xie for clarifying. Seems like swiotlb or a
> > similar mechanism can handle byte-granularity isolation so the drivers
> > not need to worry about information leaks or memory corruption outside
> > the mapped byte range.
> >
> > We still need to audit virtio guest drivers to ensure they don't trust
> > data that can be modified by the device. I will look at virtio-blk and
> > virtio-fs next week.
> >
> 
> Oh, that's great. Thank you!
> 
> I also did some audit work these days and will send a new version for
> reviewing next Monday.
> 
> Thanks,
> Yongji

Doing it in a way that won't hurt performance for simple
configs that trust the device is a challenge though.
Pls take a look at the discussion with Christoph for some ideas
on how to do this.


-- 
MST


  reply	other threads:[~2021-05-14 11:36 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-23  8:09 [RFC PATCH V2 0/7] Do not read from descripto ring Jason Wang
2021-04-23  8:09 ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 1/7] virtio-ring: maintain next in extra state for packed virtqueue Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 2/7] virtio_ring: rename vring_desc_extra_packed Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 3/7] virtio-ring: factor out desc_extra allocation Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 4/7] virtio_ring: secure handling of mapping errors Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 5/7] virtio_ring: introduce virtqueue_desc_add_split() Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 6/7] virtio: use err label in __vring_new_virtqueue() Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-04-23  8:09 ` [RFC PATCH V2 7/7] virtio-ring: store DMA metadata in desc_extra for split virtqueue Jason Wang
2021-04-23  8:09   ` Jason Wang
2021-05-06  3:20 ` [RFC PATCH V2 0/7] Do not read from descripto ring Jason Wang
2021-05-06  3:20   ` Jason Wang
2021-05-06  8:12   ` Michael S. Tsirkin
2021-05-06  8:12     ` Michael S. Tsirkin
2021-05-06 12:38     ` Christoph Hellwig
2021-05-06 12:38       ` Christoph Hellwig
2021-05-14 11:13       ` Michael S. Tsirkin
2021-05-14 11:13         ` Michael S. Tsirkin
2021-06-04  5:38         ` Jason Wang
2021-06-04  5:38           ` Jason Wang
2021-07-11 16:08           ` Michael S. Tsirkin
2021-07-11 16:08             ` Michael S. Tsirkin
2021-07-12  3:07             ` Jason Wang
2021-07-12  3:07               ` Jason Wang
2021-07-12 12:58               ` Michael S. Tsirkin
2021-07-12 12:58                 ` Michael S. Tsirkin
2021-05-13 16:27 ` Stefan Hajnoczi
2021-05-13 16:27   ` Stefan Hajnoczi
2021-05-14  6:06   ` Yongji Xie
2021-05-14  7:30     ` Jason Wang
2021-05-14  7:30       ` Jason Wang
2021-05-14  8:40       ` Yongji Xie
2021-05-14  7:29   ` Jason Wang
2021-05-14  7:29     ` Jason Wang
2021-05-14 11:16     ` Stefan Hajnoczi
2021-05-14 11:16       ` Stefan Hajnoczi
2021-05-14 11:27       ` Yongji Xie
2021-05-14 11:36         ` Michael S. Tsirkin [this message]
2021-05-14 11:36           ` Michael S. Tsirkin
2021-05-14 13:58           ` Yongji Xie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210514073452-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=ashish.kalra@amd.com \
    --cc=file@sect.tu-berlin.de \
    --cc=hch@infradead.org \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stefanha@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xieyongji@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.