qemu-devel.nongnu.org archive mirror
From: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
To: Thomas Huth <thuth@redhat.com>
Cc: benh@au1.ibm.com, aik@ozlabs.ru, agraf@suse.de,
	qemu-devel@nongnu.org, paulus@samba.org, qemu-ppc@nongnu.org,
	sam.bobroff@au1.ibm.com, david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
Date: Fri, 13 Nov 2015 00:19:37 +0530
Message-ID: <5644DF41.7060605@linux.vnet.ibm.com>
In-Reply-To: <56445E7B.5010904@redhat.com>



On Thursday 12 November 2015 03:10 PM, Thomas Huth wrote:
> On 12/11/15 09:09, Thomas Huth wrote:
>> On 11/11/15 18:16, Aravinda Prasad wrote:
>>> Memory errors such as bit flips that cannot be corrected
>>> by hardware are passed on to the kernel for handling.
>>> If the memory address in error belongs to a guest, then
>>> the guest kernel is responsible for taking suitable action.
>>> Patch [1] enhances KVM to exit the guest with the exit reason
>>> set to KVM_EXIT_NMI in such cases.
>>>
>>> This patch handles the KVM_EXIT_NMI exit. If the guest OS
>>> has registered a machine check handling routine by
>>> calling "ibm,nmi-register", then the handler builds
>>> the error log and invokes the registered handler; otherwise
>>> it invokes the handler at 0x200.
>>>
>>> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
>>>
>>> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
>>> ---
>>>  target-ppc/kvm.c     |   69 +++++++++++++++++++++++++++++++++++++++++++
>>>  target-ppc/kvm_ppc.h |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 150 insertions(+)
>>>
>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>> index 110436d..e2e5170 100644
>>> --- a/target-ppc/kvm.c
>>> +++ b/target-ppc/kvm.c
>>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>>>          ret = 0;
>>>          break;
>>>  
>>> +    case KVM_EXIT_NMI:
>>> +        DPRINTF("handle NMI exception\n");
>>> +        ret = kvm_handle_nmi(cpu);
>>> +        break;
>>> +
>>>      default:
>>>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>>          ret = -1;
>>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>>  {
>>>      return data & 0xffff;
>>>  }
>>> +
>>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>>> +{
>>> +    struct rtas_mc_log mc_log;
>>> +    CPUPPCState *env = &cpu->env;
>>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>> +    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>>> +
>>> +    cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>>> +
>>> +    /* Properly set bits in MSR before we invoke the handler */
>>> +    env->msr = 0;
>>> +
>>> +    if (!(*pcc->interrupts_big_endian)(cpu)) {
>>> +        env->msr |= (1ULL << MSR_LE);
>>> +    }
>>> +
>>> +#ifdef TARGET_PPC64
>>> +    env->msr |= (1ULL << MSR_SF);
>>> +#endif
>>> +
>>> +    if (!spapr->guest_machine_check_addr) {
>>> +        /*
>>> +         * If OS has not registered with "ibm,nmi-register"
>>> +         * jump to 0x200
>>> +         */
>>
>> Shouldn't you also check MSR_ME here first and enter checkstop when
>> machine checks are disabled?
>> Also I think you have to set up some more registers for machine check
>> interrupts, like SRR0 and SRR1?
>>
>>> +        env->nip = 0x200;
>>> +        return 0;
>>> +    }
>>> +
>>> +    qemu_mutex_lock(&spapr->mc_in_progress);
>>
>> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
>> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
>> kvm_cpu_exec()),
> 
> In case you're looking for the calls, I just noticed that the
> qemu_mutex_lock_iothread() calls have recently been pushed into
> kvm_arch_handle_exit() itself.

OK, thanks for pointing that out.

> 
>> so if you would ever get one thread waiting for this
>> mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
>> because the other code would wait forever to get the BQL ==> Deadlock.
>>
>> I think if you want to be able to handle multiple NMIs at once, you
>> likely need something like an error log per CPU instead. And if an NMI
>> happens on one CPU while there is already an NMI handler running on the
>> very same CPU, you could likely simply track this with a boolean variable
>> and put the CPU into checkstop if this happens?
> 
> Ok, I now had a look into the LoPAPR spec, and if I've got that right,
> you really have to serialize the NMIs in case they happen at multiple
> CPUs at the same time. So I guess the best thing you can do here is
> something like:
> 
>    while (spapr->mc_in_progress) {
>        /*
>         * There is already another NMI in progress, thus we need
>         * to yield here to wait until it has been finished
>         */
>        qemu_mutex_unlock_iothread();
>        usleep(10);
>        qemu_mutex_lock_iothread();
>    }
>    spapr->mc_in_progress = true;

The above piece of code should help. I will modify the patch accordingly
in the next revision.
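
The yield loop quoted above can be tried outside QEMU. The following is only a sketch under stated assumptions: a plain pthread mutex (big_lock) stands in for the BQL, mc_in_progress stands in for spapr->mc_in_progress, and nmi_enter(), nmi_interlock(), vcpu_thread() and run_demo() are hypothetical names that do not exist in the patch. It shows that a flag which is only tested and set while the big lock is held lets one vCPU at a time into the NMI window, while waiters drop the lock so that the interlock path can still run.

```c
#define _DEFAULT_SOURCE   /* for usleep() under strict -std= modes */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-ins: big_lock plays the BQL, mc_in_progress
 * plays spapr->mc_in_progress. All state below is protected by big_lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static bool mc_in_progress;
static int active_handlers;  /* vCPUs currently between NMI entry and interlock */
static int max_active;       /* highest value active_handlers ever reached */
static int handled;          /* number of completed NMIs */

/* Called with big_lock held, like the NMI exit handler under the BQL. */
static void nmi_enter(void)
{
    while (mc_in_progress) {
        /* Another NMI is in progress: drop the big lock so that the
         * interlock path can take it, then retry. */
        pthread_mutex_unlock(&big_lock);
        usleep(10);
        pthread_mutex_lock(&big_lock);
    }
    mc_in_progress = true;
    active_handlers++;
    if (active_handlers > max_active) {
        max_active = active_handlers;
    }
}

/* Called with big_lock held, like the "ibm,nmi-interlock" RTAS call. */
static void nmi_interlock(void)
{
    active_handlers--;
    handled++;
    mc_in_progress = false;
}

static void *vcpu_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&big_lock);
    nmi_enter();
    pthread_mutex_unlock(&big_lock);

    usleep(1000);  /* the guest handler runs without the big lock */

    pthread_mutex_lock(&big_lock);
    nmi_interlock();
    pthread_mutex_unlock(&big_lock);
    return NULL;
}

/* Runs n simulated vCPUs that all take an NMI at once and returns the
 * maximum number of handlers that were ever active at the same time. */
static int run_demo(int n)
{
    pthread_t threads[16];
    int i;

    assert(n > 0 && n <= 16);
    mc_in_progress = false;
    active_handlers = max_active = handled = 0;
    for (i = 0; i < n; i++) {
        pthread_create(&threads[i], NULL, vcpu_thread, NULL);
    }
    for (i = 0; i < n; i++) {
        pthread_join(threads[i], NULL);
    }
    return max_active;
}
```

Calling run_demo(4) from a small main() and checking that it returns 1 confirms the serialization. Replacing the flag with a second mutex that is held across the guest handler would reintroduce the deadlock described earlier in the thread, because the waiter would sleep on that mutex while still holding big_lock.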

> 
> Also LoPAPR talks about 'subsequent processors report "fatal error
> previously reported"', so maybe the other processors should report that
> condition in this case?

I feel the guest kernel is responsible for that. Or does that mean that QEMU
should report the same error that the first processor encountered to the
subsequent processors? In that case, what if the error encountered by the
first processor was recovered?

> And of course you've also got to check that the same CPU is not getting
> multiple NMIs before the interlock function has been called again.

I think it is good to check that. However, shouldn't the guest enable ME
until it calls the interlock function?
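
For reference, the two checks raised in this thread (MSR_ME cleared, and a second NMI on the same CPU before the interlock call) could be combined roughly as below. This is a sketch, not code from the patch: struct vcpu_nmi_state, nmi_decide() and nmi_interlock_done() are hypothetical names, and MSR_ME is assumed to be bit 12 as in QEMU's PowerPC definitions. Whether both checks are really needed is the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_ME 12  /* assumed bit position of the machine check enable bit */

enum nmi_action { NMI_DELIVER, NMI_CHECKSTOP };

/* Hypothetical per-CPU bookkeeping; not a structure from the patch. */
struct vcpu_nmi_state {
    uint64_t msr;
    bool nmi_pending_interlock;  /* NMI delivered, interlock not yet called */
};

/* Decide what to do with a machine check arriving on this CPU. */
static enum nmi_action nmi_decide(struct vcpu_nmi_state *s)
{
    if (!(s->msr & (1ULL << MSR_ME))) {
        return NMI_CHECKSTOP;  /* machine checks are disabled */
    }
    if (s->nmi_pending_interlock) {
        return NMI_CHECKSTOP;  /* previous NMI not yet acknowledged */
    }
    s->nmi_pending_interlock = true;
    return NMI_DELIVER;
}

/* To be called when the guest issues the interlock call for this CPU. */
static void nmi_interlock_done(struct vcpu_nmi_state *s)
{
    s->nmi_pending_interlock = false;
}
```

With ME set, the first call to nmi_decide() delivers the NMI, a second call before nmi_interlock_done() reports checkstop, and delivery works again after the interlock.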

Regards,
Aravinda

> 
>  Thomas
> 

-- 
Regards,
Aravinda
