From: George Dunlap <george.dunlap@eu.citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Mukesh Rathor <mukesh.rathor@oracle.com>
Cc: Keir Fraser <keir@xen.org>,
Ian Campbell <ian.campbell@citrix.com>,
Jan Beulich <beulich@suse.com>, Tim Deegan <tim@xen.org>,
xen-devel@lists.xen.org, Ian Jackson <ian.jackson@citrix.com>
Subject: Re: [PATCH v14 00/20] Introduce PVH domU support
Date: Mon, 4 Nov 2013 17:23:07 +0000
Message-ID: <5277D7FB.30208@eu.citrix.com>
In-Reply-To: <20131104165905.GA6979@phenom.dumpdata.com>
On 04/11/13 16:59, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 04, 2013 at 12:14:49PM +0000, George Dunlap wrote:
>> Updates:
>> - Fixed bugs in v14:
>> Zombie domains, FreeBSD crash, Crash at 4GiB, HVM crash
>> (Thank you to Roger Pau Mone for fixes to the last 3)
>> - Completely eliminated PV emulation codepath
>
> Odd, you dropped Mukesh's email from the patch series - so he can't
> jump on answering questions right away.
The mail I received has Mukesh cc'd in all the patches...
>
>> == RFC ==
>>
>> We had talked about accepting the patch series as-is once I had the
>> known bugs fixed; but I couldn't help making an attempt at using the
>> HVM IO emulation codepaths so that we could completely eliminate
>> having to use the PV emulation code, in turn eliminating some of the
>> uglier "support" patches required to make the PV emulation code
>> capable of running on a PVH guest. The idea for "admin" pio ranges
>> would be that we would use the vmx hardware to allow the guest direct
>> access, rather than the "re-execute with guest GPRs" trick that PV
>> uses. (This functionality is not implemented by this patch series, so
>> we would need to make sure it was sorted for the dom0 series.)
>>
>> The result looks somewhat cleaner to me. On the other hand, because
>> string in & out instructions use the full emulation code, it means
>> opening up an extra 6k lines of code to PVH guests, including all the
>> complexity of the ioreq path. (It doesn't actually send ioreqs, but
>> since it shares much of the path, it shares much of the complexity.)
>> Additionally, I'm not sure I've done it entirely correctly: the guest
>> boots and the io instructions it executes seem to be handled
>> correctly, but it may not be using the corner cases.
> The case I think Mukesh was hitting was the 'speaker_io' path. But
> perhaps I am misremembering it?
Well, looking at the trace, it looks like the PVH kernel he gave me is
actually attempting to enumerate the PCI space (writing a large range of
values to cf8 then reading cfc). A full set of accesses is below:
vcpu 0
IO address summary:
   21:[w]      1  0.00s  0.00%  5387 cyc { 5387| 5387| 5387}
   70:[w]      8  0.00s  0.00%  1434 cyc {  916| 1005| 3651}
   71:[r]      8  0.00s  0.00%  1803 cyc { 1017| 1496| 5100}
   a1:[w]      1  0.00s  0.00%  1357 cyc { 1357| 1357| 1357}
  cf8:[r]      3  0.00s  0.00%  1202 cyc { 1088| 1150| 1369}
  cf8:[w]  16850  0.01s  0.00%   966 cyc {  896|  937| 1073}
  cfa:[w]      1  0.00s  0.00%   932 cyc {  932|  932|  932}
  cfb:[w]      2  0.00s  0.00%  2517 cyc { 2001| 3033| 3033}
  cfc:[r]  16560  0.01s  0.00%  1174 cyc { 1118| 1150| 1227}
  cfe:[r]    288  0.00s  0.00%  1380 cyc { 1032| 1431| 1499}
vcpu 1
IO address summary:
   60:[r]     16  0.00s  0.00%  1141 cyc { 1011| 1014| 2093}
   64:[r]  18276  0.01s  0.01%  1579 cyc { 1408| 1443| 2629}
vcpu 2
IO address summary:
   70:[w]     33  0.00s  0.00%  1192 cyc {  855|  920| 2306}
   71:[r]     31  0.00s  0.00%  1177 cyc {  988| 1032| 1567}
   71:[w]      2  0.00s  0.00%  1079 cyc { 1014| 1144| 1144}
  2e9:[r]      3  0.00s  0.00%  1697 cyc { 1002| 1011| 3080}
  2e9:[w]      3  0.00s  0.00%   998 cyc {  902|  952| 1141}
  2f9:[r]      3  0.00s  0.00%  1725 cyc {  996| 1020| 3160}
  2f9:[w]      3  0.00s  0.00%   990 cyc {  905|  935| 1130}
  3e9:[r]      3  0.00s  0.00%  1595 cyc { 1011| 1026| 2749}
  3e9:[w]      3  0.00s  0.00%  1012 cyc {  920|  976| 1142}
  3f9:[r]      3  0.00s  0.00%  2480 cyc {  988| 1079| 5375}
  3f9:[w]      3  0.00s  0.00%  1064 cyc {  913| 1035| 1245}
(No i/o from vcpu 3.)
Presumably some of these are just "the BIOS may be lying, check anyway"
probes, which should be harmless for domUs.
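
For anyone not familiar with the cf8/cfc pattern: it is the standard PCI
"configuration mechanism #1", where the guest writes a 32-bit address to
port 0xcf8 and then accesses the selected register through the data window
at 0xcfc-0xcff. A rough sketch of the address encoding follows. The
function names are made up for illustration and are not taken from Xen or
Linux; this only shows the bit layout, not what the kernel in the trace
actually does.

```python
# Illustrative sketch of PCI configuration mechanism #1 addressing.
# Function names are hypothetical; only the bit layout is standard.

CONFIG_ADDRESS = 0xCF8  # 32-bit address port
CONFIG_DATA = 0xCFC     # base of the 4-byte data window


def pci_config_address(bus, device, function, register):
    """Return the 32-bit value to write to port 0xcf8."""
    if not (0 <= bus < 256 and 0 <= device < 32
            and 0 <= function < 8 and 0 <= register < 256):
        raise ValueError("bus/device/function/register out of range")
    return ((1 << 31)              # enable bit
            | (bus << 16)          # bits 23-16: bus number
            | (device << 11)       # bits 15-11: device number
            | (function << 8)      # bits 10-8: function number
            | (register & 0xFC))   # bits 7-2: dword-aligned register


def pci_data_port(register):
    """Return the data port used for a sub-dword access to 'register'."""
    return CONFIG_DATA + (register & 3)


if __name__ == "__main__":
    # Vendor ID of bus 0, device 0, function 0 lives at register 0.
    print(hex(pci_config_address(0, 0, 0, 0)))
    # A 16-bit access at register offset 2 goes through port 0xcfe.
    print(hex(pci_data_port(2)))
```

A 16-bit access at an offset of 2 within the selected dword goes through
port 0xcfe, which would be consistent with the cfe reads in the trace
above, although I have not checked which registers the kernel was reading.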
>
>> This also means no support for "legacy" forced invalid ops -- only native
>> cpuid is supported in this series.
> OK.
(FWIW, support for legacy forced invalid ops was requested by Tim.)
>> I have the fixes in another series, if people think it would be better
>> to check in exactly what we had with bug fixes ASAP.
>>
>> Other "open issues" on the design (which need not stop the series
>> going in) include:
>>
>> - Whether a completely separate mode is necessary, or whether
>> just having HVM mode with some flags to disable / change certain
>> functionality would be better
>>
>> - Interface-wise: Right now PVH is special-cased for bringing up
>> CPUs. Is this what we want to do going forward, or would it be better
>> to try to make it more like PV (which was tried before and is hard), or more
>> like HVM (which would involve having emulated APICs, &c &c).
> How is it hard? From the Linux standpoint it is just a hypercall?
This is my understanding of a discussion that happened between Tim and
Mukesh just as I was joining the conversation. My understanding was
that the issue had to do with pre-loading segments and DTs, which for PV
guests is easy because Xen controls the tables themselves, but is harder
to do in a reasonable way for HVM guests because the guest controls the
tables. Mukesh had initially implemented it the full PV way (or mostly
PV), but Tim was concerned about some kind of potential consistency
issue. But I didn't read the discussion very carefully, as I was just
trying to get my head around the series as a whole at that time.
The suggestion to just use an HVM-style method was made at the XenSummit
by Glauber Costa. Glauber is a bit more of a KVM guy, so tends to lean
towards "just behave like the real hardware". Nonetheless, I think his
concern about adding an extra interface is a valid one, and worth
keeping in mind.
-George