virtualization.lists.linux-foundation.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Srujana Challa <schalla@marvell.com>
Cc: Jason Wang <jasowang@redhat.com>,
	"virtualization@lists.linux.dev" <virtualization@lists.linux.dev>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Vamsi Krishna Attunuru <vattunuru@marvell.com>,
	Shijith Thotton <sthotton@marvell.com>,
	Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>,
	Jerin Jacob <jerinj@marvell.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"will@kernel.org" <will@kernel.org>
Subject: Re: [EXTERNAL] Re: [PATCH] vdpa: Add support for no-IOMMU mode
Date: Tue, 10 Sep 2024 01:56:45 -0400
Message-ID: <20240910015607-mutt-send-email-mst@kernel.org>
In-Reply-To: <DS0PR18MB53684A307C08D17276A465EEA0952@DS0PR18MB5368.namprd18.prod.outlook.com>

On Wed, Aug 28, 2024 at 09:08:13AM +0000, Srujana Challa wrote:
> > Subject: RE: [EXTERNAL] Re: [PATCH] vdpa: Add support for no-IOMMU mode
> > 
> > > On Tue, Jul 23, 2024 at 07:10:52AM +0000, Srujana Challa wrote:
> > > > > On Mon, Jul 22, 2024 at 03:22:22PM +0800, Jason Wang wrote:
> > > > > > On Fri, Jul 19, 2024 at 11:40 PM Srujana Challa <schalla@marvell.com> wrote:
> > > > > > >
> > > > > > > > On Thu, May 30, 2024 at 03:48:23PM +0530, Srujana Challa wrote:
> > > > > > > > > This commit introduces support for an UNSAFE, no-IOMMU
> > > > > > > > > mode in the vhost-vdpa driver. When enabled, this mode
> > > > > > > > > provides no device isolation, no DMA translation, no host
> > > > > > > > > kernel protection, and cannot be used for device
> > > > > > > > > assignment to virtual machines. It requires RAWIO
> > > > > > > > > permissions and will taint the kernel.
> > > > > > > > > This mode requires enabling the
> > > > > > > > > "enable_vhost_vdpa_unsafe_noiommu_mode"
> > > > > > > > > option on the vhost-vdpa driver. This mode would be useful
> > > > > > > > > to get better performance on specific low-end machines
> > > > > > > > > and can be leveraged by embedded platforms where
> > > > > > > > > applications run in a controlled environment.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Srujana Challa <schalla@marvell.com>
> > > > > > > >
> > > > > > > > Thought hard about that.
> > > > > > > > I think given vfio supports this, we can do that too, and
> > > > > > > > the extension is small.
> > > > > > > >
> > > > > > > > However, it looks like setting this parameter will
> > > > > > > > automatically change the behaviour for existing userspace
> > > > > > > > when IOMMU_DOMAIN_IDENTITY is set.
> > > > Our initial thought was to support only the no-IOMMU case, in which
> > > > the domain itself won't exist. So, we can modify the code as below
> > > > to check only for the presence of a domain.
> > > > I think handling only the no-IOMMU case wouldn't affect the
> > > > existing userspace.
> > > > +   if ((!domain) && vhost_vdpa_noiommu && capable(CAP_SYS_RAWIO)) {
> > >
> > > I would prefer some explicit action.
> > > Just not specifying a domain is something I'd like to keep reserved
> > > for something of more wide usefulness.
> > Can we introduce a new feature like VHOST_BACKEND_F_NOIOMMU in
> > VHOST_VDPA_BACKEND_FEATURES?  We can have below logic based on this
> > feature bit negotiation.
> > Thanks.
> Michael, could you please confirm if adding a new feature to VHOST_VDPA_BACKEND_FEATURES
> is an appropriate solution to support no-IOMMU for the vhost-vdpa backend?


Yes. So the idea is to require both a module parameter, and a
flag set by userspace, to make sure users do not mistakenly
try to assign such devices to VMs.

Thanks.
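
To make the double opt-in above concrete, here is a minimal userspace sketch in C of the decision it implies. Nothing here is the eventual patch: the bit position SKETCH_BACKEND_F_NOIOMMU, the struct, the helper name noiommu_decide() and the choice of error codes are all assumptions made for illustration. Only the module parameter, the proposed VHOST_BACKEND_F_NOIOMMU flag name and the CAP_SYS_RAWIO requirement come from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position: the thread proposes the name
 * VHOST_BACKEND_F_NOIOMMU but does not assign it a value. */
#define SKETCH_BACKEND_F_NOIOMMU 63

/* Hypothetical container for the three inputs discussed in the thread. */
struct noiommu_request {
	bool module_param_enabled; /* stands in for enable_vhost_vdpa_unsafe_noiommu_mode */
	uint64_t acked_features;   /* backend features acknowledged by userspace */
	bool has_rawio;            /* stands in for capable(CAP_SYS_RAWIO) */
};

/*
 * Returns 1 when no-IOMMU mode may be used, 0 when the normal IOMMU
 * path applies, and a negative errno when userspace asked for no-IOMMU
 * mode but is not allowed to have it. The error codes are assumptions.
 */
static int noiommu_decide(const struct noiommu_request *req)
{
	bool requested;

	assert(req != NULL);

	requested = (req->acked_features &
		     (1ULL << SKETCH_BACKEND_F_NOIOMMU)) != 0;

	/* Without the userspace flag nothing changes, even if the
	 * module parameter is set, so existing userspace is unaffected. */
	if (!requested)
		return 0;

	/* The flag alone is not enough: the admin must also opt in. */
	if (!req->module_param_enabled)
		return -EOPNOTSUPP;

	/* Fail loudly instead of silently changing behaviour. */
	if (!req->has_rawio)
		return -EPERM;

	return 1;
}
```

With this shape, setting only the module parameter leaves the normal IOMMU path in place, and a caller that sets the flag without CAP_SYS_RAWIO gets an error rather than a silent fallback, which is the concern raised earlier in the thread.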

> > >
> > >
> > > > > > > >
> > > > > > > > I suggest a new domain type for use just for this purpose.
> > > > > >
> > > > > > I'm not sure I get this, we want to bypass IOMMU, so it doesn't
> > > > > > even have a domain.
> > > > >
> > > > > yes, a fake one. or come up with some other flag that userspace will set.
> > > > >
> > > > > > > This way if host has
> > > > > > > > an iommu, then the same kernel can run both VMs with
> > > > > > > > isolation and unsafe embedded apps without.
> > > > > > > Could you provide further details on this concept? What
> > > > > > > criteria would determine the configuration of the new domain
> > > > > > > type? Would this require a boot parameter similar to
> > > > > > > IOMMU_DOMAIN_IDENTITY, such as iommu.passthrough=1 or iommu.pt?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > >  drivers/vhost/vdpa.c | 23 +++++++++++++++++++++++
> > > > > > > > >  1 file changed, 23 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > index bc4a51e4638b..d071c30125aa 100644
> > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > @@ -36,6 +36,11 @@ enum {
> > > > > > > > >
> > > > > > > > >  #define VHOST_VDPA_IOTLB_BUCKETS 16
> > > > > > > > >
> > > > > > > > > +bool vhost_vdpa_noiommu;
> > > > > > > > > +module_param_named(enable_vhost_vdpa_unsafe_noiommu_mode,
> > > > > > > > > +              vhost_vdpa_noiommu, bool, 0644);
> > > > > > > > > +MODULE_PARM_DESC(enable_vhost_vdpa_unsafe_noiommu_mode, "Enable UNSAFE, no-IOMMU mode.  This mode provides no device isolation, no DMA translation, no host kernel protection, cannot be used for device assignment to virtual machines, requires RAWIO permissions, and will taint the kernel.  If you do not know what this is for, step away. (default: false)");
> > > > > > > > > +
> > > > > > > > >  struct vhost_vdpa_as {
> > > > > > > > >     struct hlist_node hash_link;
> > > > > > > > >     struct vhost_iotlb iotlb;
> > > > > > > > > @@ -60,6 +65,7 @@ struct vhost_vdpa {
> > > > > > > > >     struct vdpa_iova_range range;
> > > > > > > > >     u32 batch_asid;
> > > > > > > > >     bool suspended;
> > > > > > > > > +   bool noiommu_en;
> > > > > > > > >  };
> > > > > > > > >
> > > > > > > > >  static DEFINE_IDA(vhost_vdpa_ida);
> > > > > > > > > @@ -887,6 +893,10 @@ static void vhost_vdpa_general_unmap(struct vhost_vdpa *v,
> > > > > > > > >  {
> > > > > > > > >     struct vdpa_device *vdpa = v->vdpa;
> > > > > > > > >     const struct vdpa_config_ops *ops = vdpa->config;
> > > > > > > > > +
> > > > > > > > > +   if (v->noiommu_en)
> > > > > > > > > +           return;
> > > > > > > > > +
> > > > > > > > >     if (ops->dma_map) {
> > > > > > > > >             ops->dma_unmap(vdpa, asid, map->start, map->size);
> > > > > > > > >     } else if (ops->set_map == NULL) {
> > > > > > > > > @@ -980,6 +990,9 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> > > > > > > > >     if (r)
> > > > > > > > >             return r;
> > > > > > > > >
> > > > > > > > > +   if (v->noiommu_en)
> > > > > > > > > +           goto skip_map;
> > > > > > > > > +
> > > > > > > > >     if (ops->dma_map) {
> > > > > > > > >             r = ops->dma_map(vdpa, asid, iova, size, pa, perm, opaque);
> > > > > > > > >     } else if (ops->set_map) {
> > > > > > > > > @@ -995,6 +1008,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
> > > > > > > > >             return r;
> > > > > > > > >     }
> > > > > > > > >
> > > > > > > > > +skip_map:
> > > > > > > > >     if (!vdpa->use_va)
> > > > > > > > >             atomic64_add(PFN_DOWN(size), &dev->mm->pinned_vm);
> > > > > > > > >
> > > > > > > > > @@ -1298,6 +1312,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
> > > > > > > > >     struct vdpa_device *vdpa = v->vdpa;
> > > > > > > > >     const struct vdpa_config_ops *ops = vdpa->config;
> > > > > > > > >     struct device *dma_dev = vdpa_get_dma_dev(vdpa);
> > > > > > > > > +   struct iommu_domain *domain;
> > > > > > > > >     const struct bus_type *bus;
> > > > > > > > >     int ret;
> > > > > > > > >
> > > > > > > > > @@ -1305,6 +1320,14 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
> > > > > > > > >     if (ops->set_map || ops->dma_map)
> > > > > > > > >             return 0;
> > > > > > > > >
> > > > > > > > > +   domain = iommu_get_domain_for_dev(dma_dev);
> > > > > > > > > +   if ((!domain || domain->type == IOMMU_DOMAIN_IDENTITY) &&
> > > > > > > > > +       vhost_vdpa_noiommu && capable(CAP_SYS_RAWIO)) {
> > > > > > > >
> > > > > > > > So if userspace does not have CAP_SYS_RAWIO instead of
> > > > > > > > failing with a permission error the functionality changes silently?
> > > > > > > > That's confusing, I think.
> > > > > > > Yes, you are correct. I will modify the code to return error
> > > > > > > when vhost_vdpa_noiommu is set and CAP_SYS_RAWIO is not set.
> > > > > > >
> > > > > > > Thanks.
> > > > > > > >
> > > > > > > >
> > > > > > > > > +           add_taint(TAINT_USER, LOCKDEP_STILL_OK);
> > > > > > > > > +           dev_warn(&v->dev, "Adding kernel taint for noiommu on device\n");
> > > > > > > > > +           v->noiommu_en = true;
> > > > > > > > > +           return 0;
> > > > > > > > > +   }
> > > > > > > > >     bus = dma_dev->bus;
> > > > > > > > >     if (!bus)
> > > > > > > > >             return -EFAULT;
> > > > > > > > > --
> > > > > > > > > 2.25.1
> > > > > > >
> > > >
> 


