From: Juergen Gross <jgross@suse.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: andrew.cooper3@citrix.com, xen-devel@lists.xenproject.org,
Dario Faggioli <dfaggioli@suse.com>
Subject: Re: [PATCH v3 00/17] Alternative Meltdown mitigation
Date: Tue, 13 Feb 2018 15:29:39 +0100 [thread overview]
Message-ID: <14b3325f-b13f-cfc7-444d-8368f56d5884@suse.com> (raw)
In-Reply-To: <5A83014E02000078001A7619@suse.com>
On 13/02/18 15:16, Jan Beulich wrote:
>>>> On 13.02.18 at 12:36, <jgross@suse.com> wrote:
>> On 12/02/18 18:54, Dario Faggioli wrote:
>>> On Fri, 2018-02-09 at 15:01 +0100, Juergen Gross wrote:
>>>> This series is available via github:
>>>>
>>>> https://github.com/jgross1/xen.git xpti
>>>>
>>>> Dario wants to do some performance tests for this series to compare
>>>> performance with Jan's series with all optimizations posted.
>>>>
>>> And some of this is indeed ready.
>>>
>>> So, this is again on my testbox, with 16 pCPUs and 12GB of RAM, and I
>>> used a guest with 16 vCPUs and 10GB of RAM.
>>>
>>> I benchmarked Jan's patch *plus* all the optimizations and overhead
>>> mitigation patches he posted on xen-devel (the ones that are already in
>>> staging, and also the ones that are not yet there). That's "XPTI-Light"
>>> in the table and in the graphs. Booting this with 'xpti=false' is
>>> considered the baseline, while booting with 'xpti=true' is the actual
>>> thing we want to measure. :-)
>>>
>>> Then I ran the same benchmarks on Juergen's branch above, enabled at
>>> boot. That's "XPYI" in the table and graphs (yes, I know, sorry for the
>>> typo!).
>>>
>>> http://openbenchmarking.org/result/1802125-DARI-180211144
>>>
>> http://openbenchmarking.org/result/1802125-DARI-180211144&obr_hgv=XPTI-Light+x
>> pti%3Dfalse&obr_nor=y&obr_hgv=XPTI-Light+xpti%3Dfalse
>>
>> ...
>>
>>> Or, actually, that's not it! :-O In fact, right while I was writing
>>> this report, it came out on IRC that something can be done, on
>>> Juergen's XPTI series, to mitigate the performance impact a bit.
>>>
>>> Juergen sent me a patch already, and I'm re-running the benchmarks with
>>> that applied. I'll let know how the results ends up looking like.
>>
>> It turned out the results are not basically different. So the general
>> problem with context switches is still there (which I expected, BTW).
>>
>> So I guess the really bad results with benchmarks triggering a lot of
>> vcpu scheduling show that my approach isn't going to fly, as the most
>> probable cause for the slow context switches are the introduced
>> serializing instructions (LTR, WRMSRs) which can't be avoided when we
>> want to use per-vcpu stacks.
>>
>> OTOH the results of the other benchmarks showing some advantage over
>> Jan's solution indicate there is indeed an aspect which can be improved.
>>
>> Instead of preferring one approach over the other I have thought about
>> a way to use the best parts of each solution in a combined variant. In
>> case nobody is feeling strong to pursue my current approach further I'd
>> like to suggest the following scheme:
>>
>> - Whenever a L4 page table of the guest is in use on one physical cpu
>> only use the L4 shadow cache of my series in order to avoid having to
>> copy the L4 contents each time the hypervisor is left.
>>
>> - As soon as a L4 page table is being activated on a second cpu fall
>> back to use the per-cpu page table on that cpu (the cpu already using
>> the L4 page table can continue doing so).
>
> Would the first of these CPUs continue to run on the shadow L4 in
> that case? If so, would there be no synchronization issues? If not,
> how do you envision "telling" it to move to the per-CPU L4 (which,
> afaict, includes knowing which vCPU / pCPU that is)?
I thought to let the CPU running on the shadow L4. This L4 already is
configured for the CPU it is being used on, so we just have to avoid
to activate it on a second CPU.
I don't see synchronization issues as all guest L4 modifications would
be mirrored in the shadow, as done in my series already.
>> - Before activation of a L4 shadow page table it is modified to map the
>> per-cpu data needed in guest mode for the local cpu only.
>
> I had been considering to do this in XPTI light for other purposes
> too (for example it might be possible to short circuit the guest
> system call path to get away without multiple page table switches).
> We really first need to settle on how much we feel is safe to expose
> while the guest is running. So far I've been under the impression
> that people actually think we should further reduce exposed pieces
> of code/data, rather than widen the "window".
I would like to have some prepared L3 page tables for each cpu meant to
be hooked into the correct shadow L4 slots. The shadow L4 should map as
few hypervisor parts as possible (again like in my current series).
>> - Use INVPCID instead of %cr4 PGE toggling to speed up purging global
>> TLB entries (depending on the availability of the feature, of course).
>
> That's something we should do independent of what XPTI model
> we'd like to retain long term.
Right. That was just for completeness.
>> - Use the PCID feature for being able to avoid purging TLB entries which
>> might be needed later (depending on hardware again).
>
> Which first of all raises the question: Does PCID (other than the U
> bit) prevent use of TLB entries in the wrong context? IOW is the
> PCID check done early (during TLB lookup) rather than late (during
> insn retirement)?
We can test this easily. As Linux kernel is already using this mechanism
for Meltdown mitigation I assume it is save, or the Linux kernel way to
avoid Meltdown attacks wouldn't work.
Juergen
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
prev parent reply other threads:[~2018-02-13 14:29 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-09 14:01 [PATCH v3 00/17] Alternative Meltdown mitigation Juergen Gross
2018-02-09 14:01 ` [PATCH v3 01/17] x86: don't use hypervisor stack size for dumping guest stacks Juergen Gross
2018-02-09 14:01 ` [PATCH v3 02/17] x86: do a revert of e871e80c38547d9faefc6604532ba3e985e65873 Juergen Gross
2018-02-13 10:14 ` Jan Beulich
2018-02-09 14:01 ` [PATCH v3 03/17] x86: revert 5784de3e2067ed73efc2fe42e62831e8ae7f46c4 Juergen Gross
2018-02-09 14:01 ` [PATCH v3 04/17] x86: don't access saved user regs via rsp in trap handlers Juergen Gross
2018-02-09 14:01 ` [PATCH v3 05/17] x86: add a xpti command line parameter Juergen Gross
2018-02-09 14:01 ` [PATCH v3 06/17] x86: allow per-domain mappings without NX bit or with specific mfn Juergen Gross
2018-02-09 14:01 ` [PATCH v3 07/17] xen/x86: split _set_tssldt_desc() into ldt and tss specific functions Juergen Gross
2018-02-09 14:01 ` [PATCH v3 08/17] x86: add support for spectre mitigation with local thunk Juergen Gross
2018-02-09 14:01 ` [PATCH v3 09/17] x86: create syscall stub for per-domain mapping Juergen Gross
2018-02-09 14:01 ` [PATCH v3 10/17] x86: allocate per-vcpu stacks for interrupt entries Juergen Gross
2018-02-09 14:01 ` [PATCH v3 11/17] x86: modify interrupt handlers to support stack switching Juergen Gross
2018-02-09 14:01 ` [PATCH v3 12/17] x86: activate per-vcpu stacks in case of xpti Juergen Gross
2018-02-09 14:01 ` [PATCH v3 13/17] x86: allocate hypervisor L4 page table for XPTI Juergen Gross
2018-02-09 14:01 ` [PATCH v3 14/17] xen: add domain pointer to fill_ro_mpt() and zap_ro_mpt() functions Juergen Gross
2018-02-09 14:01 ` [PATCH v3 15/17] x86: fill XPTI shadow pages and keep them in sync with guest L4 Juergen Gross
2018-02-09 14:01 ` [PATCH v3 16/17] x86: do page table switching when entering/leaving hypervisor Juergen Gross
2018-02-09 14:01 ` [PATCH v3 17/17] x86: hide most hypervisor mappings in XPTI shadow page tables Juergen Gross
2018-02-12 17:54 ` [PATCH v3 00/17] Alternative Meltdown mitigation Dario Faggioli
2018-02-13 11:36 ` Juergen Gross
2018-02-13 14:16 ` Jan Beulich
[not found] ` <5A83014E02000078001A7619@suse.com>
2018-02-13 14:29 ` Juergen Gross [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=14b3325f-b13f-cfc7-444d-8368f56d5884@suse.com \
--to=jgross@suse.com \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=dfaggioli@suse.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).