From: Paolo Bonzini <pbonzini@redhat.com>
To: Gleb Natapov <gleb@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
mtosatti@redhat.com, jan.kiszka@siemens.com
Subject: Re: [PATCH] x86: kvm: reset the bootstrap processor when it gets an INIT
Date: Mon, 11 Mar 2013 18:39:44 +0100
Message-ID: <513E16E0.2050703@redhat.com>
In-Reply-To: <20130311172034.GR31619@redhat.com>
On 11/03/2013 18:20, Gleb Natapov wrote:
> On Mon, Mar 11, 2013 at 03:28:03PM +0100, Paolo Bonzini wrote:
>> On 11/03/2013 14:54, Gleb Natapov wrote:
>>>> Setting the mp_state to INIT_RECEIVED is that interface, and it already
>>>> works, for APs at least. This patch extends it to work for the BSP as well.
>>>
>>> It does not for AP either. If AP has vmx on, mp_state should not be set to
>>> INIT_RECEIVED. mp_state is a state, as you can see from its name, and we
>>> already had a discussion on the generic device API about importance of
>>> separating sending commands from setting state. There is a difference
>>> between setting mp_state during migration and signaling INIT#.
>>
>> What does migration have to do with this?
>
> get|set_mpstate is used by migration. Actually this is primary reason
> for this interface existence.
Does it have to be the only one?
>>>> In the corresponding userspace patch, I don't need to touch the CPU
>>>> state at all. I can just signal the kernel. If I touch the CPU, I'll
>>>> break the nested case, no matter how it is implemented. So far, the
>>>> userspace did not have to worry about nested, and that's something that
>>>> should be kept that way.
>>> We are discussing two different things here. I'll try to separate them.
>>> 1. BSP is broken WRT #INIT
>>> 2. nested is broken WRT #INIT
>>>
>>> You are fixing 1 with your patches, for that I proposed much easier
>>> solution (at least from kernel point of view): if BSP reset it in
>>> userspace and make it runnable. Nested virt is still broken, but this is
>>> not what you are fixing.
>>
>> It's not what I'm fixing, but I don't want to make the fix for nested
>
> What are you fixing then?
Nested virt is not what I am fixing, but I'm trying to keep an eye on
that (and the other INIT race) while doing these patches.
>> virt unnecessarily more complicated. Nested virt needs to know about
>> INIT and SIPI; redefining the meaning of INIT_RECEIVED and SIPI_RECEIVED
>> makes it more complicated to reflect these events to L1.
>>
>>> For 2 much more involved fix is needed. Jan fixes it and it will require
>>> signaling INIT# from userspace by other means than mp_state because
>>> signaling INIT# does not automatically mean that mp_state changes to
>>> INIT_RECEIVED.
>>
>> In your interpretation of INIT_RECEIVED, no. In mine, yes...
>
> Your code shows different. With your patch setting mp_state to
> INIT_RECEIVED makes vcpu non tunable. This is incorrect if INIT_RECEIVED
> is "INIT# is triggered" interface.
What do you mean by "non tunable"? In non-nested mode, the VCPU will
reset immediately, as soon as it is re-entered. In nested mode, the
VCPU will eat the INIT_RECEIVED and turn it into a vmexit.
At least according to AMD's docs, the VMM has to reassert INIT if it
wants the processor to actually process it [15.20.8 INIT support].
Intel's docs do not say so explicitly, but they do not say the opposite
either. It seems to be the only interpretation that makes sense.
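
To make the two cases above concrete, here is a minimal sketch of the
behaviour being argued for. It is only a toy model, not kernel code: the
enum, the struct and model_deliver_init() are hypothetical names invented
for illustration, and it assumes that a BSP becomes runnable after reset
while an AP waits for SIPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; they mirror the idea of mp_state, not the kernel's
 * actual KVM_MP_STATE_* constants. */
enum model_mp_state {
    MODEL_RUNNABLE,
    MODEL_HALTED,
    MODEL_WAIT_FOR_SIPI,
};

struct vcpu_model {
    enum model_mp_state state;
    bool is_bsp;        /* bootstrap processor? */
    bool guest_mode;    /* true while running a nested (L2) guest */
    bool was_reset;     /* set when INIT caused a VCPU reset */
    bool nested_vmexit; /* set when INIT was reflected to L1 */
};

/* Model of "INIT# is signalled to this VCPU". */
static void model_deliver_init(struct vcpu_model *v)
{
    assert(v != NULL);

    if (v->guest_mode) {
        /* Nested: the INIT is consumed and turned into a vmexit to L1.
         * The VCPU is not reset and goes back to running. */
        v->nested_vmexit = true;
        v->guest_mode = false;
        v->state = MODEL_RUNNABLE;
        return;
    }

    /* Non-nested: reset as soon as the VCPU is re-entered.
     * Assumption: the BSP restarts immediately, an AP waits for SIPI. */
    v->was_reset = true;
    v->state = v->is_bsp ? MODEL_RUNNABLE : MODEL_WAIT_FOR_SIPI;
}
```

In both branches the input is the same event (INIT was signalled); only
the resulting state differs.
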
>>>> If we move away from the INIT_RECEIVED and SIPI_RECEIVED states for
>>>> in-kernel APIC -> VCPU communication, then the KVM_SET_MP_STATE ioctl
>>>> will have to convert them to the right bits in the requests field or in
>>>> the APIC state. But I'm starting to see less benefit from moving away
>>>> from mp_state.
>>>>
>>> We are not moving away from mp_state, we are moving away from using
>>> mp_state for signaling
>>
>> That's what I meant; sorry for the unclear abbreviation.
>
> Then we disagree.
We do. Let's see _where_ exactly we disagree.
>>> because with nested virt INIT does not always
>>> change mp_state
>>
>> Why not?
>
> Because mp_state is the current state the vcpu is in. It can be
> uninitialized, runnable, halted or wait for sipi. SDM says that
> if nested virt is enabled vcpu does not enter wait for sipi state
> on INIT#.
Yes, but it still has to do something (a vmexit) and go back to RUNNING.
So it needs signaling from userspace to the kernel.
>> Which is why it's good to have the reset done in kernel space,
>> not in user space.
>
> Without nested virt it does not really matter, and if it does not
> really matter you do not add code to the kernel just because it is good.
> With nested virt INIT# processing needs to go to the kernel. In some
> cases INIT will cause reset, but you do not "do reset in kernel space",
> you do "INIT# handling in kernel space".
We agree on this. What I add is: let's define the API so that it is
nested-friendly. This means having a signaling mechanism for userspace.
I think you do not want mp_state to be this signaling mechanism. Why
not? Can an existing ioctl be the alternative or do we need to invent a
new one?
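
For reference, a minimal sketch of what "mp_state as the signaling
mechanism" would look like from userspace, using the existing
KVM_SET_MP_STATE ioctl. This only illustrates the proposal under
discussion, not an agreed interface; signal_init_via_mp_state() is a
hypothetical helper name, and vcpu_fd is assumed to be a VCPU file
descriptor obtained from KVM_CREATE_VCPU.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: ask the kernel to treat this VCPU as having
 * received INIT by setting its mp_state to INIT_RECEIVED.
 * Returns 0 on success or a negative errno value on failure. */
static int signal_init_via_mp_state(int vcpu_fd)
{
    struct kvm_mp_state mp = { .mp_state = KVM_MP_STATE_INIT_RECEIVED };

    if (ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp) < 0)
        return -errno;
    return 0;
}
```

With this approach userspace never touches the CPU register state; what
the kernel then does (reset, or a vmexit to L1) is left to the kernel.
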
Paolo