From: "Michael S. Tsirkin" <mst@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
Jason Wang <jasowang@redhat.com>,
virtualization <virtualization@lists.linux-foundation.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
file@sect.tu-berlin.de, ashish.kalra@amd.com,
konrad.wilk@oracle.com, kvm <kvm@vger.kernel.org>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: Re: [RFC PATCH V2 0/7] Do not read from descripto ring
Date: Fri, 14 May 2021 07:36:12 -0400 [thread overview]
Message-ID: <20210514073452-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACycT3u+hQbDJtf5gxS1NVVpiTffMz1skuhTExy5d_oRjYKoxg@mail.gmail.com>
On Fri, May 14, 2021 at 07:27:22PM +0800, Yongji Xie wrote:
> On Fri, May 14, 2021 at 7:17 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Fri, May 14, 2021 at 03:29:20PM +0800, Jason Wang wrote:
> > > On Fri, May 14, 2021 at 12:27 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > >
> > > > On Fri, Apr 23, 2021 at 04:09:35PM +0800, Jason Wang wrote:
> > > > > Sometimes, the driver doesn't trust the device. This is usually
> > > > > happens for the encrtpyed VM or VDUSE[1].
> > > >
> > > > Thanks for doing this.
> > > >
> > > > Can you describe the overall memory safety model that virtio drivers
> > > > must follow?
> > >
> > > My understanding is that, basically the driver should not trust the
> > > device (since the driver doesn't know what kind of device that it
> > > tries to drive)
> > >
> > > 1) For any read only metadata (required at the spec level) which is
> > > mapped as coherent, driver should not depend on the metadata that is
> > > stored in a place that could be wrote by the device. This is what this
> > > series tries to achieve.
> > > 2) For other metadata that is produced by the device, need to make
> > > sure there's no malicious device triggered behavior, this is somehow
> > > similar to what vhost did. No DOS, loop, kernel bug and other stuffs.
> > > 3) swiotb is a must to enforce memory access isolation. (VDUSE or encrypted VM)
> > >
> > > > For example:
> > > >
> > > > - Driver-to-device buffers must be on dedicated pages to avoid
> > > > information leaks.
> > >
> > > It looks to me if swiotlb is used, we don't need this since the
> > > bouncing is not done at byte not page.
> > >
> > > But if swiotlb is not used, we need to enforce this.
> > >
> > > >
> > > > - Driver-to-device buffers must be on dedicated pages to avoid memory
> > > > corruption.
> > >
> > > Similar to the above.
> > >
> > > >
> > > > When I say "pages" I guess it's the IOMMU page size that matters?
> > > >
> > >
> > > And the IOTLB page size.
> > >
> > > > What is the memory access granularity of VDUSE?
> > >
> > > It has an swiotlb, but the access and bouncing is done per byte.
> > >
> > > >
> > > > I'm asking these questions because there is driver code that exposes
> > > > kernel memory to the device and I'm not sure it's safe. For example:
> > > >
> > > > static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr,
> > > > struct scatterlist *data_sg, bool have_data)
> > > > {
> > > > struct scatterlist hdr, status, *sgs[3];
> > > > unsigned int num_out = 0, num_in = 0;
> > > >
> > > > sg_init_one(&hdr, &vbr->out_hdr, sizeof(vbr->out_hdr));
> > > > ^^^^^^^^^^^^^
> > > > sgs[num_out++] = &hdr;
> > > >
> > > > if (have_data) {
> > > > if (vbr->out_hdr.type & cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT))
> > > > sgs[num_out++] = data_sg;
> > > > else
> > > > sgs[num_out + num_in++] = data_sg;
> > > > }
> > > >
> > > > sg_init_one(&status, &vbr->status, sizeof(vbr->status));
> > > > ^^^^^^^^^^^^
> > > > sgs[num_out + num_in++] = &status;
> > > >
> > > > return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
> > > > }
> > > >
> > > > I guess the drivers don't need to be modified as long as swiotlb is used
> > > > to bounce the buffers through "insecure" memory so that the memory
> > > > surrounding the buffers is not exposed?
> > >
> > > Yes, swiotlb won't bounce the whole page. So I think it's safe.
> >
> > Thanks Jason and Yongji Xie for clarifying. Seems like swiotlb or a
> > similar mechanism can handle byte-granularity isolation so the drivers
> > not need to worry about information leaks or memory corruption outside
> > the mapped byte range.
> >
> > We still need to audit virtio guest drivers to ensure they don't trust
> > data that can be modified by the device. I will look at virtio-blk and
> > virtio-fs next week.
> >
>
> Oh, that's great. Thank you!
>
> I also did some audit work these days and will send a new version for
> reviewing next Monday.
>
> Thanks,
> Yongji
Doing it in a way that won't hurt performance for simple
configs that trust the device is a challenge though.
Pls take a look at the discussion with Christoph for some ideas
on how to do this.
--
MST
next prev parent reply other threads:[~2021-05-14 11:36 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-23 8:09 [RFC PATCH V2 0/7] Do not read from descripto ring Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 1/7] virtio-ring: maintain next in extra state for packed virtqueue Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 2/7] virtio_ring: rename vring_desc_extra_packed Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 3/7] virtio-ring: factor out desc_extra allocation Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 4/7] virtio_ring: secure handling of mapping errors Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 5/7] virtio_ring: introduce virtqueue_desc_add_split() Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 6/7] virtio: use err label in __vring_new_virtqueue() Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 7/7] virtio-ring: store DMA metadata in desc_extra for split virtqueue Jason Wang
2021-05-06 3:20 ` [RFC PATCH V2 0/7] Do not read from descripto ring Jason Wang
2021-05-06 8:12 ` Michael S. Tsirkin
2021-05-06 12:38 ` Christoph Hellwig
2021-05-14 11:13 ` Michael S. Tsirkin
2021-06-04 5:38 ` Jason Wang
2021-07-11 16:08 ` Michael S. Tsirkin
2021-07-12 3:07 ` Jason Wang
2021-07-12 12:58 ` Michael S. Tsirkin
2021-05-13 16:27 ` Stefan Hajnoczi
2021-05-14 6:06 ` Yongji Xie
2021-05-14 7:30 ` Jason Wang
2021-05-14 8:40 ` Yongji Xie
2021-05-14 7:29 ` Jason Wang
2021-05-14 11:16 ` Stefan Hajnoczi
2021-05-14 11:27 ` Yongji Xie
2021-05-14 11:36 ` Michael S. Tsirkin [this message]
2021-05-14 13:58 ` Yongji Xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210514073452-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=ashish.kalra@amd.com \
--cc=file@sect.tu-berlin.de \
--cc=hch@infradead.org \
--cc=jasowang@redhat.com \
--cc=konrad.wilk@oracle.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stefanha@redhat.com \
--cc=virtualization@lists.linux-foundation.org \
--cc=xieyongji@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox