public inbox for kvm@vger.kernel.org
From: Roman Kagan <rkagan@virtuozzo.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, <kvm@vger.kernel.org>,
	Denis Lunev <den@virtuozzo.com>
Subject: Re: [PATCH] kvm/x86: skip async_pf when in guest mode
Date: Fri, 25 Nov 2016 11:42:51 +0300
Message-ID: <20161125084251.GC28457@rkaganb.sw.ru>
In-Reply-To: <20161125071521.GB28457@rkaganb.sw.ru>

On Fri, Nov 25, 2016 at 10:15:21AM +0300, Roman Kagan wrote:
> On Thu, Nov 24, 2016 at 09:49:59PM +0100, Radim Krčmář wrote:
> > 2016-11-24 19:30+0300, Roman Kagan:
> > > Async pagefault machinery assumes communication with L1 guests only: all
> > > the state -- MSRs, apf area addresses, etc, -- are for L1.  However, it
> > > currently doesn't check if the vCPU is running L1 or L2, and may inject
> > > a #PF into whatever context is currently executing.
> > > 
> > > To reproduce the problem, use a host with swap enabled, run a VM on it,
> > > run a nested VM on top, and set RSS limit for L1 on the host via
> > > /sys/fs/cgroup/memory/machine.slice/machine-*.scope/memory.limit_in_bytes
> > > to swap it out (you may need to tighten and release it once or twice, or
> > > create some memory load inside L1).  Very quickly L2 guest starts
> > > receiving pagefaults with bogus %cr2 (apf tokens from the host
> > > actually), and L1 guest starts accumulating tasks stuck in D state in
> > > kvm_async_pf_task_wait.
> > > 
> > > To avoid that, only do async_pf stuff when executing L1 guest.
> > > 
> > > Note: this patch only fixes x86; other async_pf-capable arches may also
> > > need something similar.
> > > 
> > > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > > ---
> > 
> > Applied to kvm/queue, thanks.
> > 
> > The VM task in L1 could be scheduled out instead of hogging the VCPU for
> > a long time, so L1 might want to handle async_pf, especially if L1 set
> > KVM_ASYNC_PF_SEND_ALWAYS.  Another case happens if L1 scheduled out a
> > high-priority task on async_pf and executed the low-priority VM task in
> > spare time, expecting another #PF when the page is ready, which might be
> > long before the next nested VM exit.
> > 
> > Have you considered doing a nested VM exit and delivering the async_pf
> > to L1 immediately?
> 
> I haven't, but it seems to make sense indeed for "page ready" async_pfs.  
> 
> I'll have a look into it.

What's the correct way to kick L2 to L1 from the host?  I failed to find
one from a brief skim through the code.  We need a sensible exit reason
delivered to L1 (probably "external interrupt" will do), but I don't see
a way to do that without actually injecting an interrupt into L1, which
is likely to confuse it.  Any suggestions?
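
For reference, the policy the patch introduces can be modeled in a few
lines of standalone C.  This is only a rough sketch, not the kernel code:
struct model_vcpu, enum apf_action and model_apf_decide() are hypothetical
names made up for illustration, and the real check involves more
conditions than the two modeled here.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a vCPU; NOT struct kvm_vcpu. */
struct model_vcpu {
	bool apf_enabled;	/* L1 has enabled async_pf */
	bool in_guest_mode;	/* true while the vCPU is executing L2 */
};

/* What the host does when a guest access hits a swapped-out page. */
enum apf_action {
	APF_SYNC_FAULT,		/* bring the page in synchronously */
	APF_INJECT_TO_L1,	/* send a "page not present" async_pf to L1 */
};

/*
 * All async_pf state (MSRs, apf area addresses) belongs to L1, so the
 * async path is taken only when L1 itself is executing.  While L2 runs,
 * fall back to the synchronous path rather than inject a #PF into L2.
 */
static enum apf_action model_apf_decide(const struct model_vcpu *vcpu)
{
	if (!vcpu->apf_enabled)
		return APF_SYNC_FAULT;
	if (vcpu->in_guest_mode)
		return APF_SYNC_FAULT;
	return APF_INJECT_TO_L1;
}
```

With this model, a vCPU that has async_pf enabled gets APF_INJECT_TO_L1
while running L1 and APF_SYNC_FAULT while running L2.  Delivering "page
ready" to L1 promptly, as suggested above, would need the nested VM exit
I'm asking about and is not modeled here.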

Thanks,
Roman.


Thread overview: 6+ messages
2016-11-24 16:30 [PATCH] kvm/x86: skip async_pf when in guest mode Roman Kagan
2016-11-24 20:49 ` Radim Krčmář
2016-11-25  7:15   ` Roman Kagan
2016-11-25  8:42     ` Roman Kagan [this message]
2016-11-25  8:51       ` Paolo Bonzini
2016-11-25 11:17         ` Roman Kagan
