From: Avi Kivity <avi@redhat.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: kvm@vger.kernel.org, Alexander Graf <agraf@suse.de>,
	Paul Mackerras <paulus@au1.ibm.com>,
	kvm-ppc@vger.kernel.org
Subject: Re: Reset problem vs. MMIO emulation, hypercalls, etc...
Date: Thu, 02 Aug 2012 12:35:27 +0000
Message-ID: <501A740F.2000000@redhat.com>
In-Reply-To: <1343791031.16975.41.camel@pasglop>

On 08/01/2012 06:17 AM, Benjamin Herrenschmidt wrote:
> Hi Avi !
> 
> We identified a problem on powerpc which seems to actually be a generic
> issue, and Alex suggested we propose a generic fix. I want to make sure
> we are on the right track first before proposing an actual patch as we
> would like the patch to go in ASAP (i.e. not waiting for the next merge
> window) as it will fix an actual nasty bug with reset in KVM.
> 
> So the basic issue has to do with doing a machine reset as a result of a
> hypervisor call, but the same problem should happen with MMIO/PIO
> emulation.
> 
> After we do an exit as a result of such an operation, at the next
> KVM_RUN, KVM will fetch the "results" of the operation (in the hypercall
> case that's a bunch of register values, in the MMIO read emulation case
> it's a single register value usually, x86 might have more subtle cases)
> and we update the VCPU state (i.e. registers) with that data.
> 
> However, what happens is that if a reset happens in between, we end up
> clobbering the reset state.
> 
> I.e. what happens in qemu is roughly:
> 
>  - The hcall or MMIO that triggers the reset happens, goes to qemu,
> which eventually calls qemu_system_reset_request()
> 
>  - This sets the global reset pending flag and wakes up the main loop.
> It also does a stop of the current vcpu, so we do not return to the
> kernel at this stage.
> 
>  - The main loop gets the flag, starts the reset process, which begins
> with stopping all the VCPUs.
> 
>  - The reset handlers are called, which includes resetting the CPU
> state, which in our case (powerpc) results in a KVM_SET_REGS ioctl to
> establish a new fresh state for booting.
> 
>  - The generic code then restarts all VCPUs, which then return into
> KVM_RUN.
> 
>  - The VCPU(s) that did an exit as a result of MMIO emulation,
> hypercall, or similar (typically the one that triggered the reset but
> possibly others) then get some of their register state "updated" by the
> result of the operation (in the hcall case, it's a field in the mmap'ed
> run structure that clobbers GPR3 among others).
> 
> Now this is generally not a big issue as -usually- machines don't care
> much about the state of registers on reset.
> 
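
The sequence quoted above can be sketched as a toy model. This is only an
illustration of the ordering problem: ToyVcpu and its methods are
hypothetical stand-ins, not KVM or qemu code, and the register values are
made up.

```python
# Toy model only: these names are hypothetical and mimic the ordering
# described above, not the real KVM or qemu code.
from dataclasses import dataclass, field


@dataclass
class ToyVcpu:
    gpr: list = field(default_factory=lambda: [0] * 32)
    pending_result: object = None  # stands in for the mmap'ed run structure

    def exit_for_hcall(self, result):
        """Guest exits to userspace; the result is latched for later."""
        self.pending_result = result

    def set_regs(self, regs):
        """Stand-in for a SET_REGS ioctl: overwrite all GPRs."""
        self.gpr = list(regs)

    def run(self):
        """Stand-in for KVM_RUN: finish the pending operation first."""
        if self.pending_result is not None:
            self.gpr[3] = self.pending_result  # hcall result lands in GPR3
            self.pending_result = None


vcpu = ToyVcpu()
vcpu.exit_for_hcall(result=0xDEAD)   # the hcall that requested the reset
reset_state = [0] * 32
reset_state[3] = 0x1000              # made-up boot value for GPR3
vcpu.set_regs(reset_state)           # reset handler writes fresh state
vcpu.run()                           # re-entry completes the stale hcall

print(hex(vcpu.gpr[3]))              # GPR3 no longer holds the reset value
```

In this model the reset state written by set_regs() survives in every
register except GPR3, which is the clobbering described in the last step.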

This is actually documented in api.txt, though not in relation to reset:

  NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the
  corresponding operations are complete (and guest state is consistent)
  only after userspace has re-entered the kernel with KVM_RUN.  The
  kernel side will first finish incomplete operations and then check
  for pending signals.  Userspace can re-enter the guest with an
  unmasked signal pending to complete pending operations.

For x86 the issue was with live migration - you can't copy guest
register state in the middle of an I/O operation.  Reset is actually
similar, but it involves writing state (which can then be overwritten)
instead of reading it.
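
One reading of the api.txt note, sketched as a toy model: re-enter the
vcpu with a signal pending so the kernel finishes the incomplete
operation without running the guest, and only then write the reset
state. ToyVcpu, reset_vcpu() and the return strings are hypothetical, and
this is not necessarily the fix the thread settled on.

```python
# Toy model only: hypothetical names; illustrates ordering, not real KVM code.
class ToyVcpu:
    def __init__(self):
        self.gpr = [0] * 32
        self.pending_result = None   # result of an unfinished MMIO/hcall exit
        self.guest_ran = False

    def set_regs(self, regs):
        """Stand-in for a SET_REGS ioctl: overwrite all GPRs."""
        self.gpr = list(regs)

    def run(self, signal_pending=False):
        """Stand-in for KVM_RUN."""
        # Per the api.txt note: finish incomplete operations first...
        if self.pending_result is not None:
            self.gpr[3] = self.pending_result
            self.pending_result = None
        # ...then check for pending signals before entering the guest.
        if signal_pending:
            return "EINTR"
        self.guest_ran = True
        return "EXIT"


def reset_vcpu(vcpu, reset_state):
    """Drain any pending operation, then write the reset state."""
    vcpu.run(signal_pending=True)    # completes the operation, guest stays out
    vcpu.set_regs(reset_state)       # state is consistent, safe to overwrite


vcpu = ToyVcpu()
vcpu.pending_result = 0xDEAD         # exit whose result was never consumed
reset_state = [0] * 32
reset_state[3] = 0x1000              # made-up boot value for GPR3
reset_vcpu(vcpu, reset_state)
vcpu.run()                           # normal re-entry after reset
```

Because the stale result is consumed before set_regs() runs, the later
run() has nothing left to write and GPR3 keeps its reset value.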

-- 
error compiling committee.c: too many arguments to function

