From: Andrea Arcangeli <andrea@qumranet.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: Carsten Otte <cotte@de.ibm.com>,
Hollis Blanchard <hollisb@us.ibm.com>,
kvm-devel@lists.sourceforge.net, Avi Kivity <avi@qumranet.com>,
Ben-Ami Yassour <benami@il.ibm.com>,
"Zhang, Xiantao" <xiantao.zhang@intel.com>
Subject: Re: [PATCH] Handle vma regions with no backing page (v2)
Date: Wed, 30 Apr 2008 09:00:39 +0200
Message-ID: <20080430070039.GA12501@duo.random>
In-Reply-To: <4817AB73.20703@us.ibm.com>
On Tue, Apr 29, 2008 at 06:12:51PM -0500, Anthony Liguori wrote:
> IIUC PPC correctly, all IO pages have corresponding struct pages. This
> means that get_user_pages() would succeed and you can reference count them?
> In this case, we would never take the VM_PFNMAP path.
get_user_pages only works on vmas where only pfns with a struct page can
be mapped, but the existence of a struct page doesn't mean get_user_pages
will succeed. All mmio regions should be marked VM_IO, as reading from
them affects hardware somehow, and that prevents get_user_pages from
working on them regardless of whether a struct page exists.
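To make the rule above concrete, here is a minimal userspace sketch of the decision, not kernel code. The `toy_` names and the flag values are made up for illustration; only the logic (a VM_IO or VM_PFNMAP style flag blocks pinning even when a struct page exists) follows the paragraph above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag values for this sketch only; not copied from kernel headers. */
#define TOY_VM_IO     0x1u
#define TOY_VM_PFNMAP 0x2u

/* Hypothetical, stripped-down stand-in for a vma. */
struct toy_vma {
    unsigned int flags;       /* TOY_VM_* bits set on the mapping */
    bool has_struct_page;     /* do the mapped pfns have a struct page? */
};

/*
 * Returns true when a get_user_pages()-style pin would be allowed.
 * A struct page is necessary but not sufficient: an IO or pfnmap
 * flag on the vma rejects the request first.
 */
static bool toy_can_pin(const struct toy_vma *vma)
{
    assert(vma != NULL);

    if (vma->flags & (TOY_VM_IO | TOY_VM_PFNMAP))
        return false;

    return vma->has_struct_page;
}
```

With this model, ordinary RAM (no flags, struct page present) can be pinned, while an mmio vma marked with the IO flag cannot, whether or not a struct page backs it.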
> That's independent of this patchset. For non-aware guests, we'll have to
> pin all of physical memory up front and then create an IOMMU table from the
> pinned physical memory. For aware guests with a PV DMA window API, we'll
> be able to build that mapping on the fly (enforcing mlock allocation
> limits).
BTW, as far as a linux guest is concerned, if the PV DMA API mlock
ulimit triggers, the guest will crash. Nothing checks whether
pci_map_single returns null (the fix would be to throttle the I/O
until some other dma has completed, to split the dma into multiple
operations if it's a SG entry, and if it repeatedly fails, to fall back
to PIO or return an IO error if PIO isn't available). It can fail if
there's lots of weird pci hardware doing rdma at the same time (for
example see the iommu_arena_alloc retval in
arch/alpha/kernel/pci_iommu.c). In short we'll either need ulimit -l
unlimited or we'll have to define practical limits depending on the
guest driver code and the number of devices using passthrough.
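The fallback order described above (throttle, then split, then PIO or an IO error) could look roughly like the following userspace sketch. Everything here is hypothetical: `toy_window` models a pinned-memory budget such as an mlock limit, `toy_map` stands in for a mapping call that can fail, and no real driver or kernel API is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a limited DMA mapping budget. */
struct toy_window {
    size_t free_bytes;   /* bytes that can still be mapped */
    size_t inflight;     /* bytes mapped by DMA that has not completed */
};

enum toy_status { TOY_DONE_DMA, TOY_DONE_PIO, TOY_IO_ERROR };

/* Stand-in for a mapping call that may fail; false means failure. */
static bool toy_map(struct toy_window *w, size_t len)
{
    if (len > w->free_bytes)
        return false;
    w->free_bytes -= len;
    w->inflight += len;
    return true;
}

/* Stand-in for sleeping until outstanding DMA completes and is unmapped. */
static void toy_wait_for_completions(struct toy_window *w)
{
    w->free_bytes += w->inflight;
    w->inflight = 0;
}

/*
 * Submit len bytes. On a mapping failure: first throttle (wait for
 * in-flight DMA), then split the request down to min_chunk, and only
 * then fall back to PIO, or report an IO error if PIO is unavailable.
 */
static enum toy_status toy_submit(struct toy_window *w, size_t len,
                                  size_t min_chunk, int max_retries,
                                  bool pio_available)
{
    size_t chunk = len;
    int retries = 0;

    assert(min_chunk > 0);   /* guarantees the chunk never shrinks to 0 */

    while (len > 0) {
        if (chunk > len)
            chunk = len;
        if (toy_map(w, chunk)) {
            len -= chunk;
            retries = 0;
            continue;
        }
        if (w->inflight > 0 && retries < max_retries) {
            toy_wait_for_completions(w);   /* throttle */
            retries++;
            continue;
        }
        if (chunk / 2 >= min_chunk) {
            chunk /= 2;                    /* split into smaller operations */
            continue;
        }
        return pio_available ? TOY_DONE_PIO : TOY_IO_ERROR;
    }
    return TOY_DONE_DMA;
}
```

The point of the sketch is only that the mapping result is checked at every step; the retry count and minimum chunk size are exactly the kind of practical limits that would have to be defined per driver.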
I'll make the reserved-ram patch incremental with those patches, then
it should pick the right pfn coming from /dev/mem without my
page_count == 0 check, and then I only have to fix up the page pinning
(so likely it'll also be incremental with the kvm mmu notifier patch,
so I can hope to get something final and remove page pinning for good,
not only on mmio regions that don't have a struct page). I'm
currently having trouble booting the host with the blk-settings.c
change done in 2.6.25; I thought I had fixed that already... (I did when
loading the host kernel in kvm, but on real hardware it still fails for
another reason). And Andrew sent me a large email about mmu notifiers,
so before I return to the reserved-ram I have to answer him and upload
an updated mmu-notifier patch with certain cleanups he requested. So
go ahead ignoring the reserved-ram and mmu notifiers, I'll pick
whatever is available in or outside kvm.git when I'm ready. Thanks!