All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wright <chrisw@sous-sol.org>
To: Avi Kivity <avi@redhat.com>
Cc: Tom Lyon <pugs@lyon-about.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	chrisw@sous-sol.org, joro@8bytes.org, hjk@linutronix.de,
	gregkh@suse.de, aafabbri@cisco.com, scofeldm@cisco.com,
	alex.williamson@redhat.com
Subject: Re: [PATCH] VFIO driver: Non-privileged user level PCI drivers
Date: Tue, 1 Jun 2010 22:29:07 -0700	[thread overview]
Message-ID: <20100602052907.GT8301@sequoia.sous-sol.org> (raw)
In-Reply-To: <4C05C925.6080006@redhat.com>

* Avi Kivity (avi@redhat.com) wrote:
> On 06/02/2010 12:26 AM, Tom Lyon wrote:
> >
> >I'm not really opposed to multiple devices per domain, but let me point out how I
> >ended up here.  First, the driver has two ways of mapping pages, one based on the
> >iommu api and one based on the dma_map_sg api.  With the latter, the system
> >already allocates a domain per device and there's no way to control it. This was
> >presumably done to help isolation between drivers.  If there are multiple drivers
> >in the user level, do we not want the same isoation to apply to them?
> 
> In the case of kvm, we don't want isolation between devices, because
> that doesn't happen on real hardware.

Sure it does.  That's exactly what happens when there's an iommu
involved with bare metal.

> So if the guest programs
> devices to dma to each other, we want that to succeed.

And it will as long as ATS is enabled (this is a basic requirement
for PCIe peer-to-peer traffic to succeed with an iommu involved on
bare metal).

That's how things currently are, i.e. we put all devices belonging to a
single guest in the same domain.  However, it can be useful to put each
device belonging to a guest in a unique domain.  Especially as qemu
grows support for iommu emulation, and guest OSes begin to understand
how to use a hw iommu.

> >Also, domains are not a very scarce resource - my little core i5 has 256,
> >and the intel architecture goes to 64K.
> 
> But there is a 0.2% of mapped memory per domain cost for the page
> tables.  For the kvm use case, that could be significant since a
> guest may have large amounts of memory and large numbers of assigned
> devices.
> 
> >And then there's the fact that it is possible to have multiple disjoint iommus on a system,
> >so it may not even be possible to bring 2 devices under one domain.
> 
> That's indeed a deficiency.

Not sure it's a deficiency.  Typically to share page table mappings
across multiple iommu's you just have to do update/invalidate to each
hw iommu that is sharing the mapping.  Alternatively, you can use more
memory and build/maintain identical mappings (as Tom alludes to below).

> >Given all that, I am inclined to leave it alone until someone has a real problem.
> >Note that not sharing iommu domains doesn't mean you can't share device memory,
> >just that you have to do multiple mappings
> 
> I think we do have a real problem (though a mild one).
> 
> The only issue I see with deferring the solution is that the API
> becomes gnarly; both the kernel and userspace will have to support
> both APIs forever.  Perhaps we can implement the new API but defer
> the actual sharing until later, don't know how much work this saves.
> Or Alex/Chris can pitch in and help.

It really shouldn't be that complicated to create the API to allow for
flexible device <-> domain mappings, so I agree, makes sense to do it
right up front.

thanks,
-chris

  reply	other threads:[~2010-06-02  5:30 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-28 23:07 [PATCH] VFIO driver: Non-privileged user level PCI drivers Tom Lyon
2010-05-28 23:36 ` Randy Dunlap
2010-05-28 23:56 ` Randy Dunlap
2010-05-29 11:55 ` Arnd Bergmann
2010-05-29 12:16   ` Avi Kivity
2010-05-30 12:19 ` Michael S. Tsirkin
2010-05-30 12:27   ` Avi Kivity
2010-05-30 12:49     ` Michael S. Tsirkin
2010-05-30 13:01       ` Avi Kivity
2010-05-30 13:03         ` Michael S. Tsirkin
2010-05-30 13:13           ` Avi Kivity
2010-05-30 14:53             ` Michael S. Tsirkin
2010-05-31 11:50               ` Avi Kivity
2010-05-31 17:10                 ` Michael S. Tsirkin
2010-06-01  8:10                   ` Avi Kivity
2010-06-01  9:55                     ` Michael S. Tsirkin
2010-06-01 10:28                       ` Avi Kivity
2010-06-01 10:46                         ` Michael S. Tsirkin
2010-06-01 12:41                           ` Avi Kivity
2010-06-02  9:45                             ` Joerg Roedel
2010-06-02  9:49                               ` Avi Kivity
2010-06-02 10:04                                 ` Joerg Roedel
2010-06-02 10:09                                   ` Michael S. Tsirkin
2010-06-02 11:21                                   ` Avi Kivity
2010-06-02 16:53                                     ` Chris Wright
2010-06-06 13:44                                       ` Avi Kivity
2010-06-02 10:15                               ` Michael S. Tsirkin
2010-06-02 10:26                                 ` Joerg Roedel
2010-06-01 21:26                           ` Tom Lyon
2010-06-02  2:59                             ` Avi Kivity
2010-06-02  5:29                               ` Chris Wright [this message]
2010-06-02  5:40                                 ` Avi Kivity
2010-06-02  4:29                         ` Alex Williamson
2010-06-02  4:59                           ` Tom Lyon
2010-06-02  5:08                             ` Avi Kivity
2010-06-02  9:53                             ` Joerg Roedel
2010-06-02  9:42                       ` Joerg Roedel
2010-06-02  9:50                         ` Avi Kivity
2010-06-02  9:53                         ` Michael S. Tsirkin
2010-06-02 10:19                           ` Joerg Roedel
2010-06-02 10:21                             ` Michael S. Tsirkin
2010-06-02 10:35                               ` Joerg Roedel
2010-06-02 10:38                                 ` Michael S. Tsirkin
2010-06-02 11:12                                   ` Joerg Roedel
2010-06-02 11:21                                     ` Michael S. Tsirkin
2010-06-02 12:19                                       ` Joerg Roedel
2010-06-02 12:25                                         ` Avi Kivity
2010-06-02 12:50                                           ` Joerg Roedel
2010-06-02 13:06                                             ` Avi Kivity
2010-06-02 13:53                                               ` Joerg Roedel
2010-06-02 13:17                                             ` Michael S. Tsirkin
2010-06-02 14:01                                               ` Joerg Roedel
2010-06-02 12:34                                         ` Michael S. Tsirkin
2010-06-02 13:02                                           ` Joerg Roedel
2010-06-02 17:46                                         ` Chris Wright
2010-06-02 18:09                                           ` Tom Lyon
2010-06-02 19:46                                             ` Joerg Roedel
2010-06-03  6:23                                           ` Avi Kivity
2010-06-03 21:41                                             ` Tom Lyon
2010-06-06  9:54                                               ` Michael S. Tsirkin
2010-06-07 19:01                                                 ` Tom Lyon
2010-06-08 21:22                                                   ` Michael S. Tsirkin
2010-06-02 10:44                             ` Michael S. Tsirkin
2010-05-30 12:59 ` Avi Kivity
2010-05-31 17:17 ` Alan Cox
2010-06-01 21:29   ` Tom Lyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100602052907.GT8301@sequoia.sous-sol.org \
    --to=chrisw@sous-sol.org \
    --cc=aafabbri@cisco.com \
    --cc=alex.williamson@redhat.com \
    --cc=avi@redhat.com \
    --cc=gregkh@suse.de \
    --cc=hjk@linutronix.de \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pugs@lyon-about.com \
    --cc=scofeldm@cisco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.