kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joerg Roedel <joerg.roedel@amd.com>
To: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: joro@8bytes.org, muli@il.ibm.com, amit.shah@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	iommu@lists.linux-foundation.org, dwmw2@infradead.org,
	mingo@redhat.com
Subject: Re: [PATCH 9/9] x86/iommu: use dma_ops_list in get_dma_ops
Date: Mon, 29 Sep 2008 15:33:11 +0200	[thread overview]
Message-ID: <20080929133311.GK27928@amd.com> (raw)
In-Reply-To: <20080929221640X.fujita.tomonori@lab.ntt.co.jp>

On Mon, Sep 29, 2008 at 10:16:44PM +0900, FUJITA Tomonori wrote:
> On Mon, 29 Sep 2008 11:36:52 +0200
> Joerg Roedel <joro@8bytes.org> wrote:
> 
> > On Mon, Sep 29, 2008 at 12:30:44PM +0300, Muli Ben-Yehuda wrote:
> > > On Sun, Sep 28, 2008 at 09:13:33PM +0200, Joerg Roedel wrote:
> > > 
> > > > I think we should try to build a paravirtualized IOMMU for KVM
> > > > guests.  It should work this way: We reserve a configurable amount
> > > > of contiguous guest physical memory and map it dma contiguous using
> > > > some kind of hardware IOMMU. This is possible with all hardare
> > > > IOMMUs we have in the field by now, also Calgary and GART. The guest
> > > > does dma_coherent allocations from this memory directly and is done.
> > > > For map_single and map_sg 
> > > > the guest can do bounce buffering. We avoid nearly all pvdma hypercalls
> > > > with this approach, keep guest swapping working and solve also the
> > > > problems with device dma_masks and guest memory that is not contigous on
> > > > the host side.
> > > 
> > > I'm not sure I follow, but if I understand correctly with this
> > > approach the guest could only DMA into buffers that fall within the
> > > range you allocated for DMA and mapped. Isn't that a pretty nasty
> > > limitation?  The guest would need to bounce-bufer every frame that
> > > happened to not fall inside that range, with the resulting loss of
> > > performance.
> > 
> > The bounce buffering is needed for map_single/map_sg allocations. For
> > dma_alloc_coherent we can directly allocate from that range. The
> > performance loss of the bounce buffering may be lower than the
> > hypercalls we need as the alternative (we need hypercalls for map, unmap
> > and sync).
> 
> Nobody cares about the performance of dma_alloc_coherent. Only the
> performance of map_single/map_sg matters.
>
> I'm not sure how expensive the hypercalls are, but they are more
> expensive than bounce buffering coping lots of data for every I/Os?

I don't think that we can avoid bounce buffering into the guests at all
(with and without my idea of a paravirtualized IOMMU) when we want to
handle dma_masks and requests that cross guest physical pages properly.

With mapping/unmapping through hypercalls we add the world-switch
overhead to the copy-overhead. We can't avoid this when we have no
hardware support at all. But already with older IOMMUs like Calgary and
GART we can at least avoid the world-switch. And since, for example,
every 64 bit capable AMD processor has a GART we can make use of it.

Joerg

-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy


  reply	other threads:[~2008-09-29 13:33 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-22 18:21 [PATCH 0/9][RFC] stackable dma_ops for x86 Joerg Roedel
2008-09-22 18:21 ` [PATCH 1/9] x86/iommu: add necessary types for stackable dma_ops Joerg Roedel
2008-09-22 18:21 ` [PATCH 2/9] x86/iommu: add stackable dma_ops registration interface Joerg Roedel
2008-09-22 18:21 ` [PATCH 3/9] x86/iommu: change PCI-NOMMU to use dma_ops register interface Joerg Roedel
2008-09-22 18:21 ` [PATCH 4/9] x86/iommu: change SWIOTLB " Joerg Roedel
2008-09-22 18:21 ` [PATCH 5/9] x86/iommu: change GART " Joerg Roedel
2008-09-22 18:21 ` [PATCH 6/9] x86/iommu: change Calgary " Joerg Roedel
2008-09-23 14:37   ` Muli Ben-Yehuda
2008-09-30 11:10   ` Ingo Molnar
2008-09-30 11:58     ` Joerg Roedel
2008-09-30 13:30     ` Muli Ben-Yehuda
2008-09-30 15:38       ` Ingo Molnar
2008-09-30 18:33         ` Muli Ben-Yehuda
2008-09-22 18:21 ` [PATCH 7/9] x86/iommu: change AMD IOMMU " Joerg Roedel
2008-09-22 18:21 ` [PATCH 8/9] x86/iommu: change Intel " Joerg Roedel
2008-09-22 18:21 ` [PATCH 9/9] x86/iommu: use dma_ops_list in get_dma_ops Joerg Roedel
2008-09-26  7:56   ` Amit Shah
2008-09-26  8:59     ` Joerg Roedel
2008-09-26 10:49       ` Amit Shah
2008-09-26 12:32         ` Joerg Roedel
2008-09-27  0:13           ` Muli Ben-Yehuda
2008-09-28 19:13             ` Joerg Roedel
2008-09-29  9:30               ` Muli Ben-Yehuda
2008-09-29  9:36                 ` Joerg Roedel
2008-09-29 13:16                   ` FUJITA Tomonori
2008-09-29 13:33                     ` Joerg Roedel [this message]
2008-09-30 19:44                       ` Muli Ben-Yehuda
2008-10-01  7:19                         ` Joerg Roedel
2008-10-03  8:38                           ` Muli Ben-Yehuda
2008-09-26 11:00     ` Joerg Roedel
2008-09-28 14:21   ` FUJITA Tomonori
2008-09-28 18:44     ` Joerg Roedel
2008-09-29  9:25       ` Muli Ben-Yehuda
2008-09-29  9:29         ` Joerg Roedel
2008-09-22 18:36 ` [PATCH 0/9][RFC] stackable dma_ops for x86 Arjan van de Ven
2008-09-22 18:39   ` Joerg Roedel
2008-09-23  2:41     ` Jeremy Fitzhardinge
2008-09-23  2:50       ` Arjan van de Ven
2008-09-28 14:21 ` FUJITA Tomonori
2008-09-28 18:49   ` Joerg Roedel
2008-09-29 13:16     ` FUJITA Tomonori
2008-09-29 13:26       ` Joerg Roedel
2008-09-29 13:42         ` FUJITA Tomonori
2008-09-29 13:51           ` Joerg Roedel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080929133311.GK27928@amd.com \
    --to=joerg.roedel@amd.com \
    --cc=amit.shah@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=fujita.tomonori@lab.ntt.co.jp \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=muli@il.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).