From: "Egger, Christoph" <chegger@amazon.de>
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>,
Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Keir Fraser <keir@xen.org>, "Dong, Eddie" <eddie.dong@intel.com>,
"Nakajima, Jun" <jun.nakajima@intel.com>,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 5/5] x86/HVM: cache emulated instruction for retry processing
Date: Tue, 7 Jan 2014 10:43:43 +0100 [thread overview]
Message-ID: <52CBCC4F.8080500@amazon.de> (raw)
In-Reply-To: <A9667DDFB95DB7438FA9D7D576C3D87E0A9A34C2@SHSMSX104.ccr.corp.intel.com>
On 07.01.14 09:54, Zhang, Yang Z wrote:
> Jan Beulich wrote on 2014-01-07:
>>>>> On 24.12.13 at 12:29, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
>>> Jan Beulich wrote on 2013-12-18:
>>>>>>> On 18.12.13 at 10:40, "Zhang, Yang Z" <yang.z.zhang@intel.com>
>> wrote:
>>>>> Jan Beulich wrote on 2013-12-18:
>>>>>>>>> On 18.12.13 at 09:36, "Zhang, Yang Z" <yang.z.zhang@intel.com>
>>>> wrote:
>>>>>>> Jan Beulich wrote on 2013-09-30:
>>>>>>>> Rather than re-reading the instruction bytes upon retry
>>>>>>>> processing, stash away and re-use tha what we already read.
>>>>>>>> That way we can be certain that the retry won't do something
>>>>>>>> different from what requested the retry, getting once again
>>>>>>>> closer to real hardware behavior (where what we use retries for
>>>>>>>> is simply a bus operation, not involving redundant decoding of instructions).
>>>>>>>>
>>>>>>>
>>>>>>> This patch doesn't consider the nested case.
>>>>>>> For example, if the buffer saved the L2's instruction, then
>>>>>>> vmexit to
>>>>>>> L1 and
>>>>>>> L1 may use the wrong instruction.
>>>>>>
>>>>>> I'm having difficulty seeing how the two could get intermixed:
>>>>>> There should be, at any given point in time, at most one
>>>>>> instruction being emulated. Can you please give a more elaborate
>>>>>> explanation of the situation where you see a (theoretical?
>>>>>> practical?)
>>> problem?
>>>>>
>>>>> I saw this issue when booting L1 hyper-v. I added some debug info
>>>>> and saw the strange phenomenon:
>>>>>
>>>>> (XEN) write to buffer: eip 0xfffff8800430bc80, size 16,
>>>>> content:f7420c1f608488b 44000000011442c7
>>>>> (XEN) read from buffer: eip 0xfffff800002f6138, size 16,
>>>>> content:f7420c1f608488b 44000000011442c7
>>>>>
>>>>> From the log, we can see different eip but using the same buffer.
>>>>> Since I don't know how hyper-v working, so I cannot give more
>>>>> information why this happens. And I only saw it with L1 hyper-v
>>>>> (Xen on Xen and KVM on Xen don't have this issue) .
>>>>
>>>> But in order to validate the fix is (a) needed and (b) correct, a
>>>> proper understanding of the issue is necessary as the very first step.
>>>> Which doesn't require knowing internals of Hyper-V, all you need is
>>>> tracking of emulator state changes in Xen (which I realize is said
>>>> easier than done, since you want to make sure you don't generate
>>>> huge amounts of output before actually hitting the issue, making it
>>>> close to
>>> impossible to analyze).
>>>
>>> Ok. It should be an old issue and just exposed by your patch only:
>>> Sometimes, L0 need to decode the L2's instruction to handle IO
>>> access directly. For example, if L1 pass through (not VT-d) a device to L2 directly.
>>> And L0 may get X86EMUL_RETRY when handling this IO request. At same
>>> time, if there is a virtual vmexit pending (for example, an
>>> interrupt pending to inject to L1) and hypervisor will switch the
>>> VCPU context from L2 to L1. Now we already in L1's context, but
>>> since we got a X86EMUL_RETRY just now and this means hyprevisor will
>>> retry to handle the IO request later and unfortunately, the retry will happen in L1's context.
>>> Without your patch, L0 just emulate an L1's instruction and
>>> everything goes on. With your patch, L0 will get the instruction
>>> from the buffer which is belonging to L2, and problem arises.
>>>
>>> So the fixing is that if there is a pending IO request, no virtual
>>> vmexit/vmentry is allowed which following hardware's behavior.
>>>
>>> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
>>> b/xen/arch/x86/hvm/vmx/vvmx.c index 41db52b..c5446a9 100644
>>> --- a/xen/arch/x86/hvm/vmx/vvmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
>>> @@ -1394,7 +1394,10 @@ void nvmx_switch_guest(void)
>>> struct vcpu *v = current;
>>> struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
>>> struct cpu_user_regs *regs = guest_cpu_user_regs();
>>> + ioreq_t *p = get_ioreq(v);
>>>
>>> + if ( p->state != STATE_IOREQ_NONE )
>>> + return;
>>> /*
>>> * a softirq may interrupt us between a virtual vmentry is
>>> * just handled and the true vmentry. If during this window,
>>
>> This change looks much more sensible; question is whether simply
>> ignoring the switch request is acceptable (and I don't know the nested
>> HVM code well enough to tell). Furthermore I wonder whether that's really a VMX-only issue.
>
> From hardware's point, an IO operation is handled atomically. So the problem
> will never happen. But in Xen, an IO operation is divided into several
steps.
> Without nested virtualization, the VCPU is paused or looped until the
IO emulation
> is finished. But in nested environment, the VCPU will continue running
even the
> IO emulation is not finished. So my patch will check this and allow
the VCPU to
> continue running only there is no pending IO request. This is matching
hardware's behavior.
>
> I guess SVM also has this problem. But since I don't know nested SVM well,
> so perhaps Christoph can help to have a double check.
For SVM this issue was fixed with commit
d740d811925385c09553cbe6dee8e77c1d43b198
And there is a followup cleanup commit
ac97fa6a21ccd395cca43890bbd0bf32e3255ebb
The change in nestedsvm.c in commit d740d811925385c09553cbe6dee8e77c1d43b198
is actually not SVM specific.
Move that into nhvm_interrupt_blocked()
in hvm.c right before
return hvm_funcs.nhvm_intr_blocked(v);
and the fix applies for both SVM and VMX.
Christoph
next prev parent reply other threads:[~2014-01-07 9:44 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-30 12:51 [PATCH 0/5] x86/HVM: XSA-63 follow-ups Jan Beulich
2013-09-30 12:57 ` [PATCH 1/5] x86/HVM: properly handle backward string instruction emulation Jan Beulich
2013-10-08 16:36 ` Andrew Cooper
2013-09-30 12:57 ` [PATCH 2/5] x86/HVM: fix direct PCI port I/O emulation retry and error handling Jan Beulich
2013-10-08 18:13 ` Andrew Cooper
2013-09-30 12:58 ` [PATCH 3/5] x86/HVM: don't ignore hvm_copy_to_guest_phys() errors during I/O intercept Jan Beulich
2013-10-08 18:20 ` Andrew Cooper
2013-10-09 7:48 ` Jan Beulich
2013-09-30 12:58 ` [PATCH 4/5] x86/HVM: properly deal with hvm_copy_*_guest_phys() errors Jan Beulich
2013-09-30 13:07 ` Andrew Cooper
2013-09-30 12:59 ` [PATCH 5/5] x86/HVM: cache emulated instruction for retry processing Jan Beulich
2013-10-10 11:35 ` Andrew Cooper
2013-12-18 8:36 ` Zhang, Yang Z
2013-12-18 8:48 ` Jan Beulich
2013-12-18 9:40 ` Zhang, Yang Z
2013-12-18 10:53 ` Jan Beulich
2013-12-24 11:29 ` Zhang, Yang Z
2014-01-07 8:28 ` Jan Beulich
2014-01-07 8:54 ` Zhang, Yang Z
2014-01-07 9:43 ` Egger, Christoph [this message]
2014-01-08 5:50 ` Zhang, Yang Z
2014-01-09 12:19 ` Egger, Christoph
2014-01-16 4:42 ` Zhang, Yang Z
2014-01-16 8:23 ` Jan Beulich
2014-01-17 2:53 ` Zhang, Yang Z
2013-10-08 15:10 ` Ping: [PATCH 0/5] x86/HVM: XSA-63 follow-ups Jan Beulich
2013-10-14 7:29 ` Keir Fraser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52CBCC4F.8080500@amazon.de \
--to=chegger@amazon.de \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=eddie.dong@intel.com \
--cc=jun.nakajima@intel.com \
--cc=keir@xen.org \
--cc=xen-devel@lists.xenproject.org \
--cc=yang.z.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.