From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: ashish.kalra@amd.com, file@sect.tu-berlin.de,
kvm@vger.kernel.org, konrad.wilk@oracle.com,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
Christoph Hellwig <hch@infradead.org>,
xieyongji@bytedance.com, stefanha@redhat.com
Subject: Re: [RFC PATCH V2 0/7] Do not read from descripto ring
Date: Mon, 12 Jul 2021 08:58:18 -0400
Message-ID: <20210712085734-mutt-send-email-mst@kernel.org>
In-Reply-To: <e2b4c614-746f-e81b-bb0b-d84f0efd381f@redhat.com>
On Mon, Jul 12, 2021 at 11:07:44AM +0800, Jason Wang wrote:
>
> On 2021/7/12 12:08 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 04, 2021 at 01:38:01PM +0800, Jason Wang wrote:
> > > On 2021/5/14 7:13 PM, Michael S. Tsirkin wrote:
> > > > On Thu, May 06, 2021 at 01:38:29PM +0100, Christoph Hellwig wrote:
> > > > > On Thu, May 06, 2021 at 04:12:17AM -0400, Michael S. Tsirkin wrote:
> > > > > > Let's try for just a bit, won't make this window anyway:
> > > > > >
> > > > > > I have an old idea. Add a way to find out that unmap is a nop
> > > > > > (or more exactly does not use the address/length).
> > > > > > Then in that case even with DMA API we do not need
> > > > > > the extra data. Hmm?
> > > > > So we actually do have a check for that from the early days of the DMA
> > > > > API, but it only works at compile time: CONFIG_NEED_DMA_MAP_STATE.
> > > > >
> > > > > But given how rare configs without an iommu or swiotlb are these days
> > > > > > it has stopped being very useful. Unfortunately a runtime version is
> > > > > not entirely trivial, but maybe if we allow for false positives we
> > > > > could do something like this
> > > > >
> > > > > > bool dma_direct_need_state(struct device *dev)
> > > > > > {
> > > > > > 	/* some areas could not be covered by any map at all */
> > > > > > 	if (dev->dma_range_map)
> > > > > > 		return false;
> > > > > > 	if (force_dma_unencrypted(dev))
> > > > > > 		return false;
> > > > > > 	if (dma_direct_need_sync(dev))
> > > > > > 		return false;
> > > > > > 	return *dev->dma_mask == DMA_BIT_MASK(64);
> > > > > > }
> > > > > >
> > > > > > bool dma_need_state(struct device *dev)
> > > > > > {
> > > > > > 	const struct dma_map_ops *ops = get_dma_ops(dev);
> > > > > >
> > > > > > 	if (dma_map_direct(dev, ops))
> > > > > > 		return dma_direct_need_state(dev);
> > > > > > 	return ops->unmap_page ||
> > > > > > 		ops->sync_single_for_cpu || ops->sync_single_for_device;
> > > > > > }
> > > > Yea that sounds like a good idea. We will need to document that.
> > > >
> > > >
> > > > Something like:
> > > >
> > > > /*
> > > > * dma_need_state - report whether unmap calls use the address and length
> > > > > * @dev: device to query
> > > > *
> > > > * This is a runtime version of CONFIG_NEED_DMA_MAP_STATE.
> > > > *
> > > > * Return the value indicating whether dma_unmap_* and dma_sync_* calls for the device
> > > > * use the DMA state parameters passed to them.
> > > > * The DMA state parameters are: scatter/gather list/table, address and
> > > > * length.
> > > > *
> > > > * If dma_need_state returns false then DMA state parameters are
> > > > * ignored by all dma_unmap_* and dma_sync_* calls, so it is safe to pass 0 for
> > > > * address and length, and DMA_UNMAP_SG_TABLE_INVALID and
> > > > > * DMA_UNMAP_SG_LIST_INVALID for s/g table and list respectively.
> > > > * If dma_need_state returns true then DMA state might
> > > > * be used and so the actual values are required.
> > > > */
> > > >
> > > > And we will need DMA_UNMAP_SG_TABLE_INVALID and
> > > > DMA_UNMAP_SG_LIST_INVALID as pointers to an empty global table and list
> > > > for calls such as dma_unmap_sgtable that dereference pointers before checking
> > > > they are used.
> > > >
> > > >
> > > > Does this look good?
> > > >
> > > > > The table/list variants are for consistency; virtio specifically does
> > > > > not use s/g at the moment, but it seems nicer than leaving
> > > > > users to wonder what to do about these.
> > > >
> > > > Thoughts? Jason want to try implementing?
> > >
> > > > I can add it to my todo list. If other people are interested in this,
> > > > please let us know.
> > > >
> > > > But this is just about saving the effort of unmap, and it doesn't eliminate
> > > > the necessity of using private memory (addr, length) for the metadata for
> > > > validating the device inputs.
> >
> > Besides unmap, why do we need to validate address?
>
>
> Sorry, it's not validating actually, the driver doesn't do any validation.
> As the subject says, the driver will just use the metadata stored in the
> desc_state instead of the one stored in the descriptor ring.
>
>
> > length can be
> > typically validated by specific drivers - not all of them even use it ..
> >
> > > And just to clarify, the slight regression we see is testing without
> > > VIRTIO_F_ACCESS_PLATFORM which means DMA API is not used.
> > I guess this is due to extra cache pressure?
>
>
> Yes.
>
>
> > Maybe create yet another
> > array just for DMA state ...
>
>
> I'm not sure I get this, we use this basically:
>
> struct vring_desc_extra {
> dma_addr_t addr; /* Buffer DMA addr. */
> u32 len; /* Buffer length. */
> u16 flags; /* Descriptor flags. */
> u16 next; /* The next desc state in a list. */
> };
>
> Except for the "next" the rest are all DMA state.
>
> Thanks
I am talking about the dma need state idea where we interrogate the DMA
API to figure out whether unmap is actually a nop.
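
For illustration only, here is a minimal userspace sketch of that idea, not
kernel code. It assumes a runtime query along the lines of the
dma_need_state() proposed earlier in the thread (stubbed here with a
hypothetical flag), and splits the per-descriptor state into a "next" array
that is always allocated and a separate, hypothetical DMA-state array that
is allocated only when unmap actually consumes the address and length. The
struct and function names below are made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel types; NOT the real definitions. */
typedef uint64_t dma_addr_t;

struct device {
	bool unmap_uses_state;	/* hypothetical: what the query would report */
};

/* Stub for the proposed runtime query; the real one would ask the DMA API. */
bool dma_need_state(const struct device *dev)
{
	return dev->unmap_uses_state;
}

/* DMA state kept per descriptor only when unmap is not a nop. */
struct vring_desc_dma_state {
	dma_addr_t addr;	/* Buffer DMA addr. */
	uint32_t len;		/* Buffer length. */
};

struct vring_sketch {
	unsigned int num;
	uint16_t *next;				/* always allocated */
	struct vring_desc_dma_state *dma;	/* NULL when unmap is a nop */
};

int vring_sketch_init(struct vring_sketch *vr, const struct device *dev,
		      unsigned int num)
{
	vr->num = num;
	vr->dma = NULL;
	vr->next = calloc(num, sizeof(*vr->next));
	if (!vr->next)
		return -1;
	if (dma_need_state(dev)) {
		vr->dma = calloc(num, sizeof(*vr->dma));
		if (!vr->dma) {
			free(vr->next);
			vr->next = NULL;
			return -1;
		}
	}
	return 0;
}

/* Record the mapping only if a later unmap will need it. */
void vring_sketch_record(struct vring_sketch *vr, unsigned int i,
			 dma_addr_t addr, uint32_t len)
{
	if (!vr->dma)
		return;
	vr->dma[i].addr = addr;
	vr->dma[i].len = len;
}

/* Bytes of per-descriptor state the ring carries. */
size_t vring_sketch_state_bytes(const struct vring_sketch *vr)
{
	size_t bytes = vr->num * sizeof(*vr->next);

	if (vr->dma)
		bytes += vr->num * sizeof(*vr->dma);
	return bytes;
}

void vring_sketch_free(struct vring_sketch *vr)
{
	free(vr->dma);
	free(vr->next);
	vr->dma = NULL;
	vr->next = NULL;
}
```

When the query reports that unmap ignores its arguments, the ring touches
only the small "next" array, which is the cache-pressure saving being
discussed. This sketch ignores the flags field and the metadata the series
keeps for its own purposes.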
>
> >
> > > So I will go ahead and post a formal version of this series and we can start from
> > > there.
> > >
> > > Thanks
> > >
> > >
Thread overview: 21+ messages
2021-04-23 8:09 [RFC PATCH V2 0/7] Do not read from descripto ring Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 1/7] virtio-ring: maintain next in extra state for packed virtqueue Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 2/7] virtio_ring: rename vring_desc_extra_packed Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 3/7] virtio-ring: factor out desc_extra allocation Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 4/7] virtio_ring: secure handling of mapping errors Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 5/7] virtio_ring: introduce virtqueue_desc_add_split() Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 6/7] virtio: use err label in __vring_new_virtqueue() Jason Wang
2021-04-23 8:09 ` [RFC PATCH V2 7/7] virtio-ring: store DMA metadata in desc_extra for split virtqueue Jason Wang
2021-05-06 3:20 ` [RFC PATCH V2 0/7] Do not read from descripto ring Jason Wang
2021-05-06 8:12 ` Michael S. Tsirkin
2021-05-06 12:38 ` Christoph Hellwig
2021-05-14 11:13 ` Michael S. Tsirkin
2021-06-04 5:38 ` Jason Wang
2021-07-11 16:08 ` Michael S. Tsirkin
2021-07-12 3:07 ` Jason Wang
2021-07-12 12:58 ` Michael S. Tsirkin [this message]
2021-05-13 16:27 ` Stefan Hajnoczi
2021-05-14 7:29 ` Jason Wang
2021-05-14 11:16 ` Stefan Hajnoczi
[not found] ` <CACycT3u+hQbDJtf5gxS1NVVpiTffMz1skuhTExy5d_oRjYKoxg@mail.gmail.com>
2021-05-14 11:36 ` Michael S. Tsirkin
[not found] ` <CACycT3v-2naEaXEtPqaKcGz8qpfnmp4VzrHefqLNhO=9=57jdQ@mail.gmail.com>
2021-05-14 7:30 ` Jason Wang