From: Peter Xu <peterx@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Eric Auger <eric.auger@redhat.com>,
Nicolin Chen <nicolinc@nvidia.com>,
peter.maydell@linaro.org, qemu-devel@nongnu.org,
qemu-arm@nongnu.org, yi.l.liu@intel.com, kevin.tian@intel.com,
Jason Wang <jasowang@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: Multiple vIOMMU instance support in QEMU?
Date: Thu, 18 May 2023 15:45:24 -0400
Message-ID: <ZGaAVAI9u4K4vy1/@x1n>
In-Reply-To: <ZGY8rj9hRxGLpFdH@nvidia.com>

On Thu, May 18, 2023 at 11:56:46AM -0300, Jason Gunthorpe wrote:
> On Thu, May 18, 2023 at 10:16:24AM -0400, Peter Xu wrote:
>
> > What you mentioned above makes sense to me from the POV that one vIOMMU
> > may not suffice, but that's at least a totally new area to me, because I
> > have never used more than one IOMMU even on bare metal (excluding the case
> > where I'm aware that e.g. a GPU can have its own IOMMU-like DMA translator).
>
> Even x86 systems are multi-iommu, one iommu per physical CPU socket.

I looked at a 2-node system on hand and indeed got two DMAR units:

[ 4.444788] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 4.459673] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df

Though device attachment does not look evenly spread between them: most of
the devices on this host are attached to dmar1, while only two devices are
attached to dmar0:

80:05.2 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D IIO RAS/Control Status/Global Errors (rev 01)
80:05.0 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Map/VTd_Misc/System Management (rev 01)
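
FWIW, the per-unit attachment can also be read back from sysfs: the Linux
IOMMU core exposes /sys/class/iommu/<unit>/devices/ as symlinks to the
attached devices.  A minimal sketch that walks it (the unit names are just
what dmesg reported above):

  #include <dirent.h>
  #include <stdio.h>

  /* Walk /sys/class/iommu/<unit>/devices, which the Linux IOMMU core
   * populates with symlinks to the devices attached to each unit. */
  int main(void)
  {
      const char *units[] = { "dmar0", "dmar1" };
      char path[128];

      for (int i = 0; i < 2; i++) {
          snprintf(path, sizeof(path), "/sys/class/iommu/%s/devices",
                   units[i]);
          DIR *d = opendir(path);
          if (!d) {
              continue;               /* unit not present on this host */
          }
          printf("%s:\n", units[i]);
          for (struct dirent *e; (e = readdir(d)); ) {
              if (e->d_name[0] != '.') {
                  printf("  %s\n", e->d_name);    /* e.g. 0000:80:05.0 */
              }
          }
          closedir(d);
      }
      return 0;
  }
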
>
> I'm not sure how they model this though - Kevin do you know? Do we get
> multiple iommu instances in Linux or is all the broadcasting of
> invalidates and sharing of tables hidden?
>
> > What's the system layout of your multi-vIOMMU world?  Is there still a
> > central vIOMMU, or can multiple vIOMMUs run fully in parallel, so that
> > e.g. we can have DEV1,DEV2 under vIOMMU1 and DEV3,DEV4 under vIOMMU2?
>
> Just like physical, each viommu is parallel and independent. Each has
> its own caches, ASIDs, DIDs/etc and thus invalidation domains.
>
> The separate caches are the motivating reason to do this, as something
> like vCMDQ is a direct command channel for invalidations to only the
> caches of a single IOMMU block.

From a cache invalidation POV, shouldn't the ideal granule be per-device,
like dev-iotlb on VT-d (no idea how ARM does it)?

But I assume those are two separate angles: dev-iotlb is currently still
emulated, at least in QEMU, while a hardware-accelerated invalidation queue
is definitely another thing.
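
To make the "separate invalidation domains" point concrete, here is a tiny
self-contained sketch (hypothetical types, not QEMU's real API): each vIOMMU
instance owns a private IOTLB, so a flush posted to one instance leaves the
other instance's entries intact.

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  /* Hypothetical model of a vIOMMU instance with a private IOTLB. */
  typedef struct IotlbEntry {
      uint64_t iova, pa;
      bool valid;
  } IotlbEntry;

  typedef struct VIommu {
      const char *name;
      IotlbEntry iotlb[64];
  } VIommu;

  /* Instance-scoped flush: only this instance's cache is touched. */
  static void viommu_flush(VIommu *v)
  {
      memset(v->iotlb, 0, sizeof(v->iotlb));
      printf("%s: flushed\n", v->name);
  }

  int main(void)
  {
      VIommu v[2] = { { .name = "viommu0" }, { .name = "viommu1" } };

      v[0].iotlb[0] = (IotlbEntry){ 0x1000, 0x80000, true };
      v[1].iotlb[0] = (IotlbEntry){ 0x1000, 0x90000, true };

      viommu_flush(&v[0]);
      /* viommu1's entry stays valid: a separate invalidation domain. */
      printf("viommu1 entry valid: %d\n", v[1].iotlb[0].valid);
      return 0;
  }
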
>
> > Is it a common hardware layout or nVidia specific?
>
> I think it is pretty normal, you have multiple copies of the IOMMU and
> its caches for physical reasons.
>
> The only choice is if the platform HW somehow routes invalidations to
> all IOMMUs or requires SW to route/replicate invalidates.
>
> ARM's IP seems to be designed toward the latter so I expect it is
> going to be common on ARM.
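
For the SW-routed case, I'd guess it mostly reduces to replicating each
invalidation across all units that may cache the translation; a minimal
sketch with hypothetical names (not any real driver interface):

  #include <inttypes.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Stand-in for one IOMMU instance's invalidation entry point. */
  typedef struct Iommu { const char *name; } Iommu;

  static void iommu_inval_iova(Iommu *m, uint64_t iova)
  {
      printf("%s: invalidate iova 0x%" PRIx64 "\n", m->name, iova);
  }

  /* Without HW broadcast, SW replicates the invalidate to every unit
   * that may hold the translation cached. */
  static void inval_all_units(Iommu *m, size_t n, uint64_t iova)
  {
      for (size_t i = 0; i < n; i++) {
          iommu_inval_iova(&m[i], iova);
      }
  }

  int main(void)
  {
      Iommu units[] = { { "smmu0" }, { "smmu1" } };
      inval_all_units(units, 2, 0x1000);
      return 0;
  }
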
Thanks for the information, Jason.

I see that Intel is already copied here (at least Yi and Kevin), so I assume
there is already some synchronization between this multi-vIOMMU work and the
recent work on the Intel side, which is definitely nice and avoids
conflicting efforts.  We should probably also copy Jason Wang and mst on any
formal proposal; I have copied them all here too.

--
Peter Xu

Thread overview: 10+ messages
2023-04-24 23:42 Multiple vIOMMU instance support in QEMU? Nicolin Chen
2023-05-18 3:22 ` Nicolin Chen
2023-05-18 9:06 ` Eric Auger
2023-05-18 14:16 ` Peter Xu
2023-05-18 14:56 ` Jason Gunthorpe
2023-05-18 19:45 ` Peter Xu [this message]
2023-05-18 20:19 ` Jason Gunthorpe
2023-05-19 0:38 ` Tian, Kevin
2023-05-18 22:56 ` Tian, Kevin
2023-05-18 17:39 ` Nicolin Chen