[Qemu-devel] Re: [PATCH 08/13] kvm: x86: Inject pending MCE events on state writeback

From: Jan Kiszka <jan.kiszka@siemens.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Avi Kivity <avi@redhat.com>, Huang Ying <ying.huang@intel.com>,
	Jin Dongming <jin.dongming@np.css.fujitsu.com>
Subject: [Qemu-devel] Re: [PATCH 08/13] kvm: x86: Inject pending MCE events on state writeback
Date: Thu, 17 Feb 2011 18:06:19 +0100
Message-ID: <4D5D558B.10406@siemens.com>
In-Reply-To: <20110217163534.GB10918@amt.cnet>

On 2011-02-17 17:35, Marcelo Tosatti wrote:
> On Tue, Feb 15, 2011 at 09:23:32AM +0100, Jan Kiszka wrote:
>> The current way of injecting MCE events without updating of and
>> synchronizing with the CPUState is broken and causes spurious
>> corruptions of the MCE-related parts of the CPUState.
> 
> Can you explain how? The current problem with MCE is that it bypasses
> writeback code, but corruption has nothing to do with that.

It's precisely the same scenario as with the old debug exception
re-injection: If we update the pending exception state via
KVM_SET_VCPU_EVENTS, we must not inject it via any other path. Otherwise
we end up with overwritten/lost events - which is extremely critical for
these rarely taken code paths.

Just like parts of KVM_SET_GUEST_DEBUG, KVM_X86_SET_MCE pre-dates
KVM_SET_VCPU_EVENTS, which obsoleted all other exception injection
mechanisms.
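
To illustrate the lost-event scenario, here is a minimal toy model, not
QEMU or kernel code. It assumes a single pending-exception slot per vCPU;
all toy_* names are hypothetical stand-ins for the two injection paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (assumption): one pending-exception slot per vCPU. */
struct toy_vcpu {
    bool pending;   /* is an exception queued? */
    int  nr;        /* vector number of the queued exception */
};

/* Stand-in for a dedicated injection path such as KVM_X86_SET_MCE. */
static void toy_inject_direct(struct toy_vcpu *kernel, int nr)
{
    assert(kernel != NULL);
    kernel->pending = true;
    kernel->nr = nr;
}

/* Stand-in for KVM_SET_VCPU_EVENTS: writes back userspace's whole view. */
static void toy_set_vcpu_events(struct toy_vcpu *kernel,
                                const struct toy_vcpu *user)
{
    assert(kernel != NULL && user != NULL);
    *kernel = *user;
}

/* Returns true if the injected #MC is still pending after a writeback. */
static bool toy_event_survives(bool user_view_updated)
{
    struct toy_vcpu kernel = { false, -1 };
    struct toy_vcpu user   = { false, -1 };  /* userspace copy of the state */

    toy_inject_direct(&kernel, 18);          /* 18 is the #MC vector */
    if (user_view_updated) {
        /* The event was also recorded in the synchronized state. */
        user.pending = true;
        user.nr = 18;
    }
    toy_set_vcpu_events(&kernel, &user);     /* later state writeback */
    return kernel.pending && kernel.nr == 18;
}
```

In this model, toy_event_survives(false) shows the event being overwritten
by the stale userspace view, while toy_event_survives(true) keeps it.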

> 
>> As a first step towards a fix, enhance the state writeback code with
>> support for injecting events that are pending in the CPUState. A pending
>> exception will then be signaled via cpu_interrupt(CPU_INTERRUPT_MCE).
>> And, just like for TCG, we need to leave the halt state when
>> CPU_INTERRUPT_MCE is pending (left broken for the to-be-removed old KVM
>> code).
>>
>> This will also allow unifying the TCG and KVM injection code.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> CC: Huang Ying <ying.huang@intel.com>
>> CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
>> CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
>> ---
>>  target-i386/kvm.c |   75 +++++++++++++++++++++++++++++++++++++++++++++++++---
>>  1 files changed, 70 insertions(+), 5 deletions(-)
>>
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index f909661..46f45db 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -467,6 +467,44 @@ void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t status,
>>  #endif /* !KVM_CAP_MCE*/
>>  }
>>  
>> +static int kvm_inject_mce_oldstyle(CPUState *env)
>> +{
>> +#ifdef KVM_CAP_MCE
>> +    if (kvm_has_vcpu_events()) {
>> +        return 0;
>> +    }
>> +    if (env->interrupt_request & CPU_INTERRUPT_MCE) {
>> +        unsigned int bank, bank_num = env->mcg_cap & 0xff;
>> +        struct kvm_x86_mce mce;
>> +
>> +        /* We must not raise CPU_INTERRUPT_MCE if it's not supported. */
>> +        assert(env->mcg_cap);
>> +
>> +        env->interrupt_request &= ~CPU_INTERRUPT_MCE;
>> +
>> +        /*
>> +         * There must be at least one bank in use if CPU_INTERRUPT_MCE was set.
>> +         * Find it and use its values for the event injection.
>> +         */
>> +        for (bank = 0; bank < bank_num; bank++) {
>> +            if (env->mce_banks[bank * 4 + 1] & MCI_STATUS_VAL) {
>> +                break;
>> +            }
>> +        }
>> +        assert(bank < bank_num);
>> +
>> +        mce.bank = bank;
>> +        mce.status = env->mce_banks[bank * 4 + 1];
>> +        mce.mcg_status = env->mcg_status;
>> +        mce.addr = env->mce_banks[bank * 4 + 2];
>> +        mce.misc = env->mce_banks[bank * 4 + 3];
>> +
>> +        return kvm_vcpu_ioctl(env, KVM_X86_SET_MCE, &mce);
>> +    }
>> +#endif /* KVM_CAP_MCE */
>> +    return 0;
>> +}
>> +
>>  static void cpu_update_state(void *opaque, int running, int reason)
>>  {
>>      CPUState *env = opaque;
>> @@ -1375,10 +1413,25 @@ static int kvm_put_vcpu_events(CPUState *env, int level)
>>          return 0;
>>      }
>>  
>> -    events.exception.injected = (env->exception_injected >= 0);
>> -    events.exception.nr = env->exception_injected;
>> -    events.exception.has_error_code = env->has_error_code;
>> -    events.exception.error_code = env->error_code;
>> +    if (env->interrupt_request & CPU_INTERRUPT_MCE) {
>> +        /* We must not raise CPU_INTERRUPT_MCE if it's not supported. */
>> +        assert(env->mcg_cap);
>> +
>> +        env->interrupt_request &= ~CPU_INTERRUPT_MCE;
>> +        if (env->exception_injected == EXCP08_DBLE) {
>> +            /* this means triple fault */
>> +            qemu_system_reset_request();
>> +            env->exit_request = 1;
>> +        }
>> +        events.exception.injected = 1;
>> +        events.exception.nr = EXCP12_MCHK;
>> +        events.exception.has_error_code = 0;
>> +    } else {
>> +        events.exception.injected = (env->exception_injected >= 0);
>> +        events.exception.nr = env->exception_injected;
>> +        events.exception.has_error_code = env->has_error_code;
>> +        events.exception.error_code = env->error_code;
>> +    }
> 
> IMO it is important to maintain a scope for kvm_put_vcpu_events /
> kvm_get_vcpu_events: they synchronize state to/from the kernel. Not more
> than that. Whatever you're trying to do here should be higher in the
> vcpu loop code.

We pick up CPU_INTERRUPT_MCE and translate it into the right exception
that put_vcpu_events is about to sync to the kernel. Which of those steps
should be done earlier? Calculating env->exception_injected?
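
For reference, the translation step on its own can be sketched as a pure
helper, separate from the actual sync. This is a simplified illustration
under assumptions: the toy_* types and the flag value are hypothetical,
and the triple-fault check from the patch is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INTERRUPT_MCE 0x1u  /* hypothetical flag value */
#define TOY_EXCP_MCHK     18    /* #MC vector */

/* Hypothetical, reduced stand-in for the MCE-relevant CPUState fields. */
struct toy_env {
    uint32_t interrupt_request;
    int      exception_injected;  /* -1 if none */
    int      has_error_code;
    uint32_t error_code;
};

/* Hypothetical stand-in for the exception part of the vcpu events. */
struct toy_exception {
    uint8_t  injected;
    uint8_t  nr;
    uint8_t  has_error_code;
    uint32_t error_code;
};

/* Decide which exception the next writeback should hand to the kernel. */
static struct toy_exception toy_pick_exception(struct toy_env *env)
{
    struct toy_exception ex = { 0, 0, 0, 0 };

    assert(env != NULL);
    if (env->interrupt_request & TOY_INTERRUPT_MCE) {
        /* Consume the request and turn it into a pending #MC. */
        env->interrupt_request &= ~TOY_INTERRUPT_MCE;
        ex.injected = 1;
        ex.nr = TOY_EXCP_MCHK;
        ex.has_error_code = 0;
    } else {
        /* No MCE pending: pass the regular exception state through. */
        ex.injected = (env->exception_injected >= 0);
        ex.nr = (uint8_t)env->exception_injected;
        ex.has_error_code = (uint8_t)env->has_error_code;
        ex.error_code = env->error_code;
    }
    return ex;
}
```

Whether such a helper would run inside put_vcpu_events or earlier in the
vcpu loop is exactly the open question; the sketch only separates the two
steps.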

> 
>>      events.interrupt.injected = (env->interrupt_injected >= 0);
>>      events.interrupt.nr = env->interrupt_injected;
>> @@ -1539,6 +1592,11 @@ int kvm_arch_put_registers(CPUState *env, int level)
>>      if (ret < 0) {
>>          return ret;
>>      }
>> +    /* must be before kvm_put_msrs */
>> +    ret = kvm_inject_mce_oldstyle(env);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>>      ret = kvm_put_msrs(env, level);
>>      if (ret < 0) {
>>          return ret;
>> @@ -1678,10 +1736,17 @@ void kvm_arch_post_run(CPUState *env, struct kvm_run *run)
>>  int kvm_arch_process_irqchip_events(CPUState *env)
>>  {
>>      if (kvm_irqchip_in_kernel()) {
>> +        if (env->interrupt_request & CPU_INTERRUPT_MCE) {
>> +            kvm_cpu_synchronize_state(env);
>> +            if (env->mp_state == KVM_MP_STATE_HALTED) {
>> +                env->mp_state = KVM_MP_STATE_RUNNABLE;
>> +            }
>> +        }
> 
> Should not manipulate mp_state of a running vcpu (should only do that
> for migration when vcpu is stopped), since it's managed by the kernel,
> for irqchip case.

Not for asynchronously injected MCEs. The target CPU would simply
oversleep them. MCEs are not in the scope of the in-kernel irqchip.
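
A toy model of the wake-up rule, again not QEMU code: the toy_* names and
the flag value are hypothetical, and the model assumes that nothing else
would wake the halted vCPU for an MCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_IRQ_MCE 0x10u  /* hypothetical flag value */

enum toy_mp_state { TOY_MP_RUNNABLE, TOY_MP_HALTED };

struct toy_halt_vcpu {
    unsigned int      interrupt_request;
    enum toy_mp_state mp_state;
};

/*
 * Leave the halt state if an asynchronously injected MCE is pending.
 * Returns true if the vCPU should (re)enter the guest.
 */
static bool toy_process_async_events(struct toy_halt_vcpu *v)
{
    assert(v != NULL);
    if ((v->interrupt_request & TOY_IRQ_MCE) &&
        v->mp_state == TOY_MP_HALTED) {
        v->mp_state = TOY_MP_RUNNABLE;
    }
    return v->mp_state == TOY_MP_RUNNABLE;
}
```

Without the state change, a halted vCPU in this model stays halted while
the MCE request remains set.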

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
