From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Srujana Challa <schalla@marvell.com>,
"virtualization@lists.linux.dev" <virtualization@lists.linux.dev>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Vamsi Krishna Attunuru <vattunuru@marvell.com>,
Shijith Thotton <sthotton@marvell.com>,
Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>,
Jerin Jacob <jerinj@marvell.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"will@kernel.org" <will@kernel.org>
Subject: Re: [EXTERNAL] Re: [PATCH] vdpa: Add support for no-IOMMU mode
Date: Mon, 22 Jul 2024 03:50:56 -0400
Message-ID: <20240722034957-mutt-send-email-mst@kernel.org>
In-Reply-To: <CACGkMEtQ3SWBpS-00BBCJxoUK5AQRB=FhKGEqigh81GTbRf61A@mail.gmail.com>
On Mon, Jul 22, 2024 at 03:22:22PM +0800, Jason Wang wrote:
> On Fri, Jul 19, 2024 at 11:40 PM Srujana Challa <schalla@marvell.com> wrote:
> >
> > > On Thu, May 30, 2024 at 03:48:23PM +0530, Srujana Challa wrote:
> > > > This commit introduces support for an UNSAFE, no-IOMMU mode in the
> > > > vhost-vdpa driver. When enabled, this mode provides no device
> > > > isolation, no DMA translation, no host kernel protection, and cannot
> > > > be used for device assignment to virtual machines. It requires RAWIO
> > > > permissions and will taint the kernel.
> > > > > This mode requires enabling the "enable_vhost_vdpa_unsafe_noiommu_mode"
> > > > > option on the vhost-vdpa driver. This mode would be useful to get
> > > > > better performance on specific low-end machines and can be leveraged
> > > > > by embedded platforms where applications run in a controlled environment.
> > > >
> > > > Signed-off-by: Srujana Challa <schalla@marvell.com>
> > >
> > > Thought hard about that.
> > > I think given vfio supports this, we can do that too, and the extension is small.
> > >
> > > However, it looks like setting this parameter will automatically change the
> > > behaviour for existing userspace when IOMMU_DOMAIN_IDENTITY is set.
> > >
> > > I suggest a new domain type for use just for this purpose.
>
> I'm not sure I get this, we want to bypass IOMMU, so it doesn't even
> have a domain.

Yes, a fake one. Or come up with some other flag that userspace
will set.

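To illustrate the "flag that userspace will set" option, here is a minimal
userspace sketch, not code from the patch. The flag name DEMO_VDPA_F_NOIOMMU,
struct demo_vdpa_dev and demo_select_noiommu() are all hypothetical; only
noiommu_en and the module parameter come from the patch. The point is that the
decision depends on an explicit per-device request rather than on the IOMMU
domain type, so existing userspace keeps its current behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opt-in bit: no such flag exists in the vhost-vdpa UAPI. */
#define DEMO_VDPA_F_NOIOMMU (1u << 0)

/* Userspace stand-in for the per-device state kept in struct vhost_vdpa. */
struct demo_vdpa_dev {
	uint32_t user_flags;	/* flags this userspace asked for */
	bool noiommu_en;	/* mirrors v->noiommu_en in the patch */
};

/*
 * Enable no-IOMMU mode only when the module parameter is set AND this
 * userspace explicitly asked for it. The IOMMU domain type is not consulted.
 */
static bool demo_select_noiommu(struct demo_vdpa_dev *dev, bool param_enabled)
{
	dev->noiommu_en = param_enabled &&
			  (dev->user_flags & DEMO_VDPA_F_NOIOMMU) != 0;
	return dev->noiommu_en;
}

/* Small self-check covering the three interesting combinations. */
static void demo_select_self_check(void)
{
	struct demo_vdpa_dev legacy = { .user_flags = 0 };
	struct demo_vdpa_dev optin = { .user_flags = DEMO_VDPA_F_NOIOMMU };

	assert(!demo_select_noiommu(&legacy, true));	/* existing userspace unchanged */
	assert(demo_select_noiommu(&optin, true));	/* explicit opt-in */
	assert(!demo_select_noiommu(&optin, false));	/* parameter off wins */
}
```

With this shape, a host that has an IOMMU could serve isolated VMs and
opted-in unsafe applications from the same kernel.
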
> > > This way if host has
> > > an iommu, then the same kernel can run both VMs with isolation and unsafe
> > > embedded apps without.
> > Could you provide further details on this concept? What criteria would determine
> > the configuration of the new domain type? Would this require a boot parameter
> > similar to IOMMU_DOMAIN_IDENTITY, such as iommu.passthrough=1 or iommu.pt?
>
> Thanks
>
> > >
> > > > ---
> > > > drivers/vhost/vdpa.c | 23 +++++++++++++++++++++++
> > > > 1 file changed, 23 insertions(+)
> > > >
> > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index
> > > > bc4a51e4638b..d071c30125aa 100644
> > > > --- a/drivers/vhost/vdpa.c
> > > > +++ b/drivers/vhost/vdpa.c
> > > > @@ -36,6 +36,11 @@ enum {
> > > >
> > > > #define VHOST_VDPA_IOTLB_BUCKETS 16
> > > >
> > > > +bool vhost_vdpa_noiommu;
> > > > +module_param_named(enable_vhost_vdpa_unsafe_noiommu_mode,
> > > > + vhost_vdpa_noiommu, bool, 0644);
> > > > +MODULE_PARM_DESC(enable_vhost_vdpa_unsafe_noiommu_mode,
> > > "Enable
> > > > +UNSAFE, no-IOMMU mode. This mode provides no device isolation, no
> > > > +DMA translation, no host kernel protection, cannot be used for device
> > > > +assignment to virtual machines, requires RAWIO permissions, and will
> > > > +taint the kernel. If you do not know what this is for, step away.
> > > > +(default: false)");
> > > > +
> > > > struct vhost_vdpa_as {
> > > > struct hlist_node hash_link;
> > > > struct vhost_iotlb iotlb;
> > > > @@ -60,6 +65,7 @@ struct vhost_vdpa {
> > > > struct vdpa_iova_range range;
> > > > u32 batch_asid;
> > > > bool suspended;
> > > > + bool noiommu_en;
> > > > };
> > > >
> > > > static DEFINE_IDA(vhost_vdpa_ida);
> > > > @@ -887,6 +893,10 @@ static void vhost_vdpa_general_unmap(struct
> > > > vhost_vdpa *v, {
> > > > struct vdpa_device *vdpa = v->vdpa;
> > > > const struct vdpa_config_ops *ops = vdpa->config;
> > > > +
> > > > + if (v->noiommu_en)
> > > > + return;
> > > > +
> > > > if (ops->dma_map) {
> > > > ops->dma_unmap(vdpa, asid, map->start, map->size);
> > > > } else if (ops->set_map == NULL) {
> > > > @@ -980,6 +990,9 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
> > > struct vhost_iotlb *iotlb,
> > > > if (r)
> > > > return r;
> > > >
> > > > + if (v->noiommu_en)
> > > > + goto skip_map;
> > > > +
> > > > if (ops->dma_map) {
> > > > r = ops->dma_map(vdpa, asid, iova, size, pa, perm, opaque);
> > > > } else if (ops->set_map) {
> > > > @@ -995,6 +1008,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
> > > struct vhost_iotlb *iotlb,
> > > > return r;
> > > > }
> > > >
> > > > +skip_map:
> > > > if (!vdpa->use_va)
> > > > atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
> > > >
> > > > @@ -1298,6 +1312,7 @@ static int vhost_vdpa_alloc_domain(struct
> > > vhost_vdpa *v)
> > > > struct vdpa_device *vdpa = v->vdpa;
> > > > const struct vdpa_config_ops *ops = vdpa->config;
> > > > struct device *dma_dev = vdpa_get_dma_dev(vdpa);
> > > > + struct iommu_domain *domain;
> > > > const struct bus_type *bus;
> > > > int ret;
> > > >
> > > > @@ -1305,6 +1320,14 @@ static int vhost_vdpa_alloc_domain(struct
> > > vhost_vdpa *v)
> > > > if (ops->set_map || ops->dma_map)
> > > > return 0;
> > > >
> > > > + domain = iommu_get_domain_for_dev(dma_dev);
> > > > + if ((!domain || domain->type == IOMMU_DOMAIN_IDENTITY) &&
> > > > + vhost_vdpa_noiommu && capable(CAP_SYS_RAWIO)) {
> > >
> > > So if userspace does not have CAP_SYS_RAWIO instead of failing with a
> > > permission error the functionality changes silently?
> > > That's confusing, I think.
> > Yes, you are correct. I will modify the code to return error when vhost_vdpa_noiommu
> > is set and CAP_SYS_RAWIO is not set.
> >
> > Thanks.
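
For reference, the agreed change can be modelled in userspace as below. This
is a sketch under assumptions, not the next revision of the patch: the three
booleans stand in for vhost_vdpa_noiommu, the
(!domain || domain->type == IOMMU_DOMAIN_IDENTITY) test and
capable(CAP_SYS_RAWIO); the function name, the enum and the choice of -EPERM
as the error code are hypothetical, and whether the domain test stays part of
the condition is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical result codes for the non-error outcomes. */
enum demo_mode {
	DEMO_USE_IOMMU = 0,	/* continue with the normal domain allocation */
	DEMO_USE_NOIOMMU = 1,	/* taint the kernel and set noiommu_en */
};

/*
 * Decide the mapping mode. A missing capability is reported as an error
 * instead of silently falling back to the IOMMU path.
 */
static int demo_noiommu_decide(bool param_enabled, bool no_translation,
			       bool has_rawio)
{
	if (!param_enabled || !no_translation)
		return DEMO_USE_IOMMU;
	if (!has_rawio)
		return -EPERM;	/* assumed error code, not taken from the thread */
	return DEMO_USE_NOIOMMU;
}

/* Small self-check of the four cases that matter. */
static void demo_decide_self_check(void)
{
	assert(demo_noiommu_decide(false, true, true) == DEMO_USE_IOMMU);
	assert(demo_noiommu_decide(true, false, true) == DEMO_USE_IOMMU);
	assert(demo_noiommu_decide(true, true, false) == -EPERM);
	assert(demo_noiommu_decide(true, true, true) == DEMO_USE_NOIOMMU);
}
```
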
> > >
> > >
> > > > + add_taint(TAINT_USER, LOCKDEP_STILL_OK);
> > > > + dev_warn(&v->dev, "Adding kernel taint for noiommu on
> > > device\n");
> > > > + v->noiommu_en = true;
> > > > + return 0;
> > > > + }
> > > > bus = dma_dev->bus;
> > > > if (!bus)
> > > > return -EFAULT;
> > > > --
> > > > 2.25.1
> >