linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Gleb Natapov <gleb@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	mtosatti@redhat.com, jan.kiszka@siemens.com
Subject: Re: [PATCH] x86: kvm: reset the bootstrap processor when it gets an INIT
Date: Mon, 11 Mar 2013 18:39:44 +0100	[thread overview]
Message-ID: <513E16E0.2050703@redhat.com> (raw)
In-Reply-To: <20130311172034.GR31619@redhat.com>

Il 11/03/2013 18:20, Gleb Natapov ha scritto:
> On Mon, Mar 11, 2013 at 03:28:03PM +0100, Paolo Bonzini wrote:
>> Il 11/03/2013 14:54, Gleb Natapov ha scritto:
>>>> Setting the mp_state to INIT_RECEIVED is that interface, and it already
>>>> works, for APs at least.  This patch extends it to work for the BSP as well.
>>>
>>> It does not for AP either. If AP has vmx on mp_sate should not be set to
>>> INIT_RECEIVED. mp_sate is a state as you can see from its name and we
>>> already had a discussion on the generic device API about importance of
>>> separating sending commands from setting state. There is a difference
>>> between setting mp_state during migration and signaling INIT#.
>>
>> What does migration have to do with this?
>
> get|set_mpstate is used by migration. Actually this is primary reason
> for this interface existence.

Does it have to be the only one?

>>>> In the corresponding userspace patch, I don't need to touch the CPU
>>>> state at all.  I can just signal the kernel.  If I touch the CPU, I'll
>>>> break the nested case, no matter how it is implemented.  So far, the
>>>> userspace did not have to worry about nested, and that's something that
>>>> should be kept that way.
>>> We are discussing two different things here. I'll try to separate them.
>>> 1. BSP is broken WRT #INIT
>>> 2. nested is broken WRT #INIT
>>>
>>> You are fixing 1 with your patches, for that I proposed much easier
>>> solution (at last from kernel point of view): if BSP reset it in
>>> userspace and make it runnable. Nested virt is still broken, but this is
>>> not what you are fixing.
>>
>> It's not what I'm fixing, but I don't want to make the fix for nested
> 
> What are you fixing then?

Nested virt is not what I am fixing, but I'm trying to keep an eye on
that (and the other INIT race) while doing these patches.

>> virt unnecessarily more complicated.  Nested virt needs to know about
>> INIT and SIPI; redefining the meaning of INIT_RECEIVED and SIPI_RECEIVED
>> makes it more complicated to reflect these events to L1.
>>
>>> For 2 much more involved fix is needed. Jan fixes it and it will require
>>> signaling INIT# from userspace by other means than mp_sate because
>>> signaling INIT# does not automatically means that mp_sate changes to
>>> INIT_RECEIVED.
>>
>> In your interpretation of INIT_RECEIVED, no.  In mine, yes...
>
> Your code shows different. With your patch setting mp_state to
> INIT_RECEIVED makes vcpu non tunable. This is incorrect if INIT_RECEIVED
> is "INIT# is triggered" interface.

What do you mean by "non tunable"?  In non-nested mode, the VCPU will
reset immediately, as soon as it is re-entered.  In nested mode, the
VCPU will eat the INIT_RECEIVED and turn it into a vmexit.

At least according to AMD's docs, the VMM has to reassert INIT if it
wants the processor to actually process it [15.20.8 INIT support].
Intel's does not say it explicitly, but it doesn't say the opposite
either.  It seems to be the only that makes sense.

>>>> If we move away from the INIT_RECEIVED and SIPI_RECEIVED states for
>>>> in-kernel APIC -> VCPU communication, then the KVM_SET_MP_STATE ioctl
>>>> will have to convert them to the right bits in the requests field or in
>>>> the APIC state.  But I'm starting to see less benefit from moving away
>>>> from mp_state.
>>>>
>>> We are not moving away from mp_state, we are moving away from using
>>> mp_state for signaling
>>
>> That's what I meant; sorry for the unclear abbreviation.
> 
> Then we disagree.

We do.  Let's see _where_ exactly we disagree.

>>> because with nested virt INIT does not always
>>> change mp_state
>>
>> Why not?
> 
> Because mp_state is the current state the vcpu is in. It can be
> uninitialized, runnable, halted or wait for sipi. SDM says that
> if nested virt is enabled vcpu does not enter wait for sipi state
> on INIT#.

Yes, but it still has to do something (a vmexit) and go back to RUNNING.
 So it needs signaling from userspace to the kernel.

>>          Which is why it's good to have the reset done in kernel space,
>> not in user space.
>
> Without nested virt it does not really matter and if it is does not
> really matter you do not add code to the kernel just because it is good.
> With nested virt INIT# processing needs to go to the kernel. In some
> cases INIT will cause reset, but you do not "do reset in kernel space",
> you do "INIT# handling in kernel space".

We agree on this.  What I add is: let's define the API so that it is
nested-friendly.  This means having a signaling mechanism for userspace.
 I think you do not want mp_state to be this signaling mechanism.  Why
not?  Can an existing ioctl be the alternative or do we need to invent a
new one?

Paolo


  reply	other threads:[~2013-03-11 17:39 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-09  6:48 [PATCH] x86: kvm: reset the bootstrap processor when it gets an INIT Paolo Bonzini
2013-03-10 11:46 ` Gleb Natapov
2013-03-10 14:53   ` Paolo Bonzini
2013-03-10 15:35     ` Gleb Natapov
2013-03-10 17:19       ` Paolo Bonzini
2013-03-10 18:10         ` Gleb Natapov
2013-03-11 10:14           ` Paolo Bonzini
2013-03-11 10:28             ` Gleb Natapov
2013-03-11 11:25               ` Paolo Bonzini
2013-03-11 11:51                 ` Gleb Natapov
2013-03-11 13:31                   ` Paolo Bonzini
2013-03-11 13:54                     ` Gleb Natapov
2013-03-11 14:01                       ` Jan Kiszka
2013-03-11 14:05                         ` Gleb Natapov
2013-03-11 14:06                           ` Jan Kiszka
2013-03-11 14:09                             ` Gleb Natapov
2013-03-11 14:10                               ` Jan Kiszka
2013-03-11 14:12                                 ` Gleb Natapov
2013-03-11 14:19                                   ` Jan Kiszka
2013-03-11 14:23                           ` Paolo Bonzini
2013-03-11 15:36                             ` Jan Kiszka
2013-03-11 17:23                               ` Gleb Natapov
2013-03-11 17:34                                 ` Jan Kiszka
2013-03-11 17:38                                   ` Jan Kiszka
2013-03-11 17:41                                   ` Gleb Natapov
2013-03-11 18:05                                     ` Jan Kiszka
2013-03-11 18:13                                       ` Gleb Natapov
2013-03-11 18:27                                         ` Jan Kiszka
2013-03-11 18:39                                           ` Gleb Natapov
2013-03-11 18:47                                             ` Jan Kiszka
2013-03-11 18:51                                               ` Gleb Natapov
2013-03-11 19:01                                                 ` Jan Kiszka
2013-03-11 19:30                                                   ` Gleb Natapov
2013-03-12  9:25                                                     ` Jan Kiszka
2013-03-12 11:28                                                       ` Gleb Natapov
2013-03-11 14:28                       ` Paolo Bonzini
2013-03-11 17:20                         ` Gleb Natapov
2013-03-11 17:39                           ` Paolo Bonzini [this message]
2013-03-11 18:04                             ` Gleb Natapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=513E16E0.2050703@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=gleb@redhat.com \
    --cc=jan.kiszka@siemens.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).