From: Gordan Bobic <gordan@bobich.net>
To: Ian Campbell <ian.campbell@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: Bug: Limitation of <=2GB RAM in domU persists with 4.3.0
Date: Fri, 26 Jul 2013 10:23:46 +0100
Message-ID: <f6491fbe3cad4b0cbf82edfa39d48ae9@mail.shatteredsilicon.net>
In-Reply-To: <1374798084.10269.2.camel@hastur.hellion.org.uk>
On Fri, 26 Jul 2013 01:21:24 +0100, Ian Campbell
<ian.campbell@citrix.com> wrote:
> On Thu, 2013-07-25 at 23:23 +0100, Gordan Bobic wrote:
>> Now, if I am understanding the basic nature of the problem correctly,
>> this _could_ be worked around by ensuring that vBAR = pBAR since in that
>> case there is no room for the mis-mapped memory overwrites to occur. Is
>> that correct?
>
> AIUI (which is not very well...) it's not so much vBAR=pBAR but making
> the guest e820 (memory map) have the same MMIO holes as the host so that
> there can't be any clash between v- or p-BAR and RAM in the guest.
Sure, I understand that - but unless I am overlooking something,
vBAR=pBAR implicitly ensures that.
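
To make the "clash" condition concrete, here is a minimal sketch of the
overlap check being discussed. It is illustrative only and is not Xen code:
the function names and all addresses are made up for the example, and ranges
are treated as half-open (start, end).

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start, end), end exclusive

GIB = 1 << 30
MIB = 1 << 20


def overlaps(a: Range, b: Range) -> bool:
    """True if the two half-open ranges share at least one address."""
    return a[0] < b[1] and b[0] < a[1]


def bars_clashing_with_ram(guest_ram: List[Range],
                           bars: List[Range]) -> List[Tuple[Range, Range]]:
    """Return every (bar, ram) pair where a BAR lands on guest RAM."""
    return [(bar, ram) for bar in bars for ram in guest_ram
            if overlaps(bar, ram)]


# Hypothetical numbers: one 256 MiB BAR that the host places at 3 GiB.
bar = (0xC0000000, 0xC0000000 + 256 * MIB)

# Guest map with no hole: RAM covers the address the BAR lives at.
flat_ram = [(0, 4 * GIB)]

# Guest map that mirrors a host MMIO hole from 3 GiB to 4 GiB,
# with the displaced RAM placed above 4 GiB.
holed_ram = [(0, 0xC0000000), (4 * GIB, 5 * GIB)]

clash_without_hole = bars_clashing_with_ram(flat_ram, [bar])
clash_with_hole = bars_clashing_with_ram(holed_ram, [bar])
```

With the hole mirrored from the host, the BAR no longer lands on anything
the guest treats as RAM, which is the property both approaches are after.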
The question, then, is what happens in the null translation instance.
Specifically, if the PCIe bridge/router is broken (and NF200 is, it
seems), it would imply that when the driver talks to the device, the
operation will get sent to the vBAR (=pBAR, i.e. straight to the
hardware). This then gets translated to the pBAR. But - with a
broken bridge, and vBAR=pBAR, the MMIO request hits the pBAR
directly from the guest. Does it then still get intercepted by
the hypervisor, translated (null operation), and re-transmitted?
If so, this would lead to the card receiving everything twice,
resulting either in things outright breaking or going half as
fast at best.
Now, all this could be a good thing or a bad thing, depending on
how exactly you spin it. If the bridge is broken and doesn't
route all the way back to the root bridge, this could actually be
a performance optimizing feature. If we set vBAR=pBAR and disable
any translation thereafter, this avoids the overhead of passing
everything to/from the root PCIe bridge, and we can just directly
DMA everything.
I'm sure there are security implications here, but since NF200
doesn't do PCIe ACS either, any concept of security goes out
the window pre-emptively.
So, my questions are:
1) If vBAR = pBAR, does the hypervisor still do any translation?
I presume it does because it expects the traffic to pass up
from the root bridge, to the hypervisor and then back, to
ensure security. If indeed it does do this, where could I
optionally disable it, and is there an easy to follow bit of
example code for how to plumb in a boot parameter option for
this?
2) Further, I'm finding myself motivated to write that
auto-set (as opposed to hard coded) vBAR=pBAR patch discussed
briefly a week or so ago: have an init script read the BAR
info from dom0 and put it in xenstore, plus a patch to
make the pBAR=vBAR reservations get built dynamically rather
than statically, based on this data. Now, I'm quite fluent in C,
but my familiarity with the Xen source code is nearly non-existent
(limited to studying an old unsupported patch every now and then
in order to make it apply to a more recent code release).
Can anyone help me out with a high level view WRT where
this would be best plumbed in (which files and the flow of
control between the affected files)?
The added bonus of this (if it can be made to work) is that
it might just make unmodified GeForce cards work, too,
which probably makes it worthwhile on its own.
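
As a rough sketch of the dom0 side of that idea (reading the BAR info and
preparing it for xenstore), something along these lines might do. The
assumptions are: Linux sysfs exposes each device's BARs in
/sys/bus/pci/devices/<BDF>/resource as "start end flags" hex triples, with
the first six lines being BAR0..BAR5; the xenstore key layout under
/local/pbar is purely hypothetical; and the commands are only built as
strings here, not executed.

```python
from pathlib import Path
from typing import List, NamedTuple

IORESOURCE_MEM = 0x00000200  # memory resource flag from Linux ioport.h


class Bar(NamedTuple):
    index: int
    start: int
    end: int    # inclusive, as reported by sysfs
    flags: int


def parse_resource(text: str) -> List[Bar]:
    """Parse the contents of a sysfs 'resource' file into memory BARs."""
    bars = []
    # Only the first six lines are BARs; later lines are ROM and others.
    for index, line in enumerate(text.splitlines()[:6]):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(f, 16) for f in fields)
        if start == 0 and end == 0:
            continue  # unused BAR slot
        if flags & IORESOURCE_MEM:
            bars.append(Bar(index, start, end, flags))
    return bars


def read_bars(bdf: str) -> List[Bar]:
    """Read the memory BARs of one device, e.g. bdf='0000:01:00.0'."""
    path = Path("/sys/bus/pci/devices") / bdf / "resource"
    return parse_resource(path.read_text())


def xenstore_commands(bdf: str, bars: List[Bar],
                      prefix: str = "/local/pbar") -> List[str]:
    """Build xenstore-write commands; the key layout is hypothetical."""
    # ':' and '.' are replaced to keep the key to conservative characters.
    key = bdf.replace(":", "_").replace(".", "_")
    return [
        f"xenstore-write {prefix}/{key}/bar{b.index} {b.start:#x}-{b.end:#x}"
        for b in bars
    ]
```

The toolstack side would then have to read those keys back and build the
reservations from them; where that belongs in the Xen tree is exactly the
open question above.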
>> I guess I could test this easily enough by applying the vBAR = pBAR
>> hack.
>
> Does the e820_host=1 option help? That might be PV only though, I can't
> remember...
Thanks for pointing this one out, I just found this post in the
archives:
http://lists.xen.org/archives/html/xen-users/2012-08/msg00150.html
With a broken PCIe router, would I also need iommu=soft?
Gordan