qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Thomas Huth <thuth@redhat.com>
Cc: benh@au1.ibm.com, aik@ozlabs.ru, agraf@suse.de,
	qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
	sam.bobroff@au1.ibm.com,
	Aravinda Prasad <aravinda@linux.vnet.ibm.com>,
	paulus@samba.org
Subject: Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
Date: Fri, 13 Nov 2015 12:57:15 +1100	[thread overview]
Message-ID: <20151113015715.GJ4886@voom.redhat.com> (raw)
In-Reply-To: <56445E7B.5010904@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3671 bytes --]

On Thu, Nov 12, 2015 at 10:40:11AM +0100, Thomas Huth wrote:
> On 12/11/15 09:09, Thomas Huth wrote:
> > On 11/11/15 18:16, Aravinda Prasad wrote:
> >> Memory error such as bit flips that cannot be corrected
> >> by hardware are passed on to the kernel for handling.
> >> If the memory address in error belongs to guest then
> >> guest kernel is responsible for taking suitable action.
> >> Patch [1] enhances KVM to exit guest with exit reason
> >> set to KVM_EXIT_NMI in such cases.
> >>
> >> This patch handles KVM_EXIT_NMI exit. If the guest OS
> >> has registered the machine check handling routine by
> >> calling "ibm,nmi-register", then the handler builds
> >> the error log and invokes the registered handler else
> >> invokes the handler at 0x200.
> >>
> >> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> >>
> >> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
[snip]
> >> +        env->nip = 0x200;
> >> +        return 0;
> >> +    }
> >> +
> >> +    qemu_mutex_lock(&spapr->mc_in_progress);
> > 
> > Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> > is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> > kvm_cpu_exec()),
> 
> In case you're looking for the calls, I just noticed that the
> qemu_mutex_lock_iothread() have recently been pushed into
> kvm_arch_handle_exit() itself.
> 
> > so if you would ever get one thread waiting for this
> > mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
> > because the other code would wait forever to get the BQL ==> Deadlock.
> > 
> > I think if you want to be able to handle multiple NMIs at once, you
> > likely need something like an error log per CPU instead. And if an NMI
> > happens one CPU while there is already a NMI handler running on the very
> > same CPU, you could likely simply track this with an boolean variable
> > and put the CPU into checkstop if this happens?
> 
> Ok, I now had a look into the LoPAPR spec, and if I've got that right,
> you really have to serialize the NMIs in case they happen at multiple
> CPUs at the same time. So I guess the best thing you can do here is
> something like:
> 
>    while (spapr->mc_in_progress) {
>        /*
>         * There is already another NMI in progress, thus we need
>         * to yield here to wait until it has been finsihed
>         */
>        qemu_mutex_unlock_iothread();
>        usleep(10);
>        qemu_mutex_lock_iothread();
>    }
>    spapr->mc_in_progress = true;
> 
> Also LoPAPR talks about 'subsequent processors report "fatal error
> previously reported"', so maybe the other processors should report that
> condition in this case?
> And of course you've also got to check that the same CPU is not getting
> multiple NMIs before the interlock function has been called again.

You should be able to avoid the nasty usleep by using a pthread
condition variable.  So here you'd have

    while (spapr->mc_in_progress) {
        pthread_cond_wait(&mc_delivery_cond, &qemu_iothread_lock);
    }
    spapr->mc_in_progress = true;

Or.. there may be qemu wrappers around the pthread functions you
should be using.  Once delivery of a single MC is complete, you'd use
pthread_cond_signal() to wake up any additional ones.

pthread_cond_wait automatically drops the specified mutex internally,
so access to mc_in_progress itself is still protected by the iothread
mutex.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  parent reply	other threads:[~2015-11-13  4:04 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-11 17:15 [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests Aravinda Prasad
2015-11-11 17:15 ` [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob Aravinda Prasad
2015-11-12  3:40   ` David Gibson
2015-11-12  8:26   ` Thomas Huth
2015-11-12 11:53     ` David Gibson
2015-11-12 18:59     ` Aravinda Prasad
2015-11-11 17:15 ` [Qemu-devel] [PATCH 2/4] spapr: Register and handle HCALL to receive updated RTAS region Aravinda Prasad
2015-11-12  3:42   ` David Gibson
2015-11-12  5:28     ` [Qemu-devel] [Qemu-ppc] " Nikunj A Dadhania
2015-11-12  7:23       ` David Gibson
2015-11-11 17:15 ` [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls Aravinda Prasad
2015-11-12  4:02   ` David Gibson
2015-11-12 18:04     ` Aravinda Prasad
2015-11-12  9:23   ` Thomas Huth
2015-11-12 18:52     ` Aravinda Prasad
2015-11-11 17:16 ` [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit Aravinda Prasad
2015-11-12  4:29   ` David Gibson
2015-11-12  5:20     ` Aravinda Prasad
2015-11-12  8:09   ` Thomas Huth
2015-11-12  9:40     ` Thomas Huth
2015-11-12 18:49       ` Aravinda Prasad
2015-11-16  7:52         ` Thomas Huth
2015-11-16 10:07           ` Aravinda Prasad
2015-11-16 10:41             ` Thomas Huth
2015-11-16 11:57               ` Aravinda Prasad
2015-11-13  1:57       ` David Gibson [this message]
2015-11-13  7:03         ` Thomas Huth
2015-11-16  5:45           ` David Gibson
2015-11-12 18:23     ` Aravinda Prasad
2015-11-13  1:58       ` David Gibson
2015-11-13  4:53         ` Aravinda Prasad
2015-11-13  5:57           ` David Gibson
2015-11-13  6:27             ` Aravinda Prasad
2015-11-19  1:56       ` Alexey Kardashevskiy
2015-11-19 16:02         ` Aravinda Prasad
2015-11-16  3:50     ` Paul Mackerras
2015-11-16  9:01       ` Thomas Huth
2015-11-16 11:29         ` Aravinda Prasad
2015-11-16 21:46         ` Paul Mackerras
2015-11-12  4:30 ` [Qemu-devel] [PATCH 0/4] target-ppc/spapr: Add FWNMI support in QEMU for PowerKVM guests David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151113015715.GJ4886@voom.redhat.com \
    --to=david@gibson.dropbear.id.au \
    --cc=agraf@suse.de \
    --cc=aik@ozlabs.ru \
    --cc=aravinda@linux.vnet.ibm.com \
    --cc=benh@au1.ibm.com \
    --cc=paulus@samba.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=sam.bobroff@au1.ibm.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).