From: Marcelo Tosatti <mtosatti@redhat.com>
To: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Cc: peterz@infradead.org, mingo@elte.hu, avi@redhat.com,
raghukt@linux.vnet.ibm.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, x86@kernel.org, jeremy@goop.org,
vatsa@linux.vnet.ibm.com, hpa@zytor.com
Subject: Re: [PATCH v2 3/7] KVM: Add paravirt kvm_flush_tlb_others
Date: Thu, 21 Jun 2012 09:26:33 -0300 [thread overview]
Message-ID: <20120621122633.GA3842@amt.cnet> (raw)
In-Reply-To: <87a9zzsudx.fsf@abhimanyu.in.ibm.com>
On Tue, Jun 19, 2012 at 11:41:54AM +0530, Nikunj A Dadhania wrote:
> Hi Marcelo,
>
> Thanks for the review.
>
> On Tue, 12 Jun 2012 20:02:18 -0300, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Mon, Jun 04, 2012 at 10:37:24AM +0530, Nikunj A. Dadhania wrote:
> > > flush_tlb_others_ipi depends on lot of statics in tlb.c. Replicated
> > > the flush_tlb_others_ipi as kvm_flush_tlb_others to further adapt to
> > > paravirtualization.
> > >
> > > Use the vcpu state information inside the kvm_flush_tlb_others to
> > > avoid sending ipi to pre-empted vcpus.
> > >
> > > * Do not send ipi's to offline vcpus and set flush_on_enter flag
> > > * For online vcpus: Wait for them to clear the flag
> > >
> > > The approach was discussed here: https://lkml.org/lkml/2012/2/20/157
> > >
> > > Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
> >
> > Why not reintroduce the hypercall to flush TLBs?
> Sure, I will get a version with this.
>
> > No waiting, no entry/exit trickery.
> >
> Are you also suggesting to get rid of vcpu-running information?
Yes.
> We would atleast need this to raise a flushTLB hypercall on the sleeping
> vcpu.
No, you always raise a flush TLB hypercall on flush_tlb_others, like Xen does.
Instead of an MMIO exit to APIC to IPI, its a hypercall exit.
> > This is half-way-there paravirt with all the downsides.
> Some more details on what are the downsides would help us reach to a
> better solution.
The downside i refer to is overhead to write, test and maintain separate code
for running as a guest. If you are adding awareness about hypervisor
anyway, a hypercall is the simplest way:
- It is simple to request a remote TLB flush via vcpu->requests in the
hypervisor. Information about which vcpus are in/out guest mode
already available.
- Maintenance of in/out guest mode information in guest memory adds more
overhead to entry/exit paths.
- No need to handle the interprocessor synchronization in the pseudo
algo below (already handled by vcpu->requests mechanism).
- The guest TLB IPI flow is (target vcpu):
- exit-due-to-external-interrupt.
- inject-ipi.
- enter guest mode.
- execute IPI handler, flush TLB.
- ACK IPI.
- source vcpu continues.
- The hypervisor TLB flush flow is:
- exit-due-to-external-interrupt.
- execute IPI handler (in host, from make_all_cpus_request)
- ACK IPI.
- source vcpu continues.
Unless there is an advantage in using APIC IPI exit vs hypercall
exit?
> > Even though the guest running information might be useful in other
> > cases.
> >
> Yes, that was one of the things on the mind.
>
> > > Pseudo Algo:
> > >
> > > Write()
> > > ======
> > >
> > > guest_exit()
> > > flush_on_enter[i]=0;
> > > running[i] = 0;
> > >
> > > guest_enter()
> > > running[i] = 1;
> > > smp_mb();
> > > if(flush_on_enter[i]) {
> > > tlb_flush()
> > > flush_on_enter[i]=0;
> > > }
> > >
> > >
> > > Read()
> > > ======
> > >
> > > GUEST KVM-HV
> > >
> > > f->flushcpumask = cpumask - me;
> > >
> > > again:
> > > for_each_cpu(i, f->flushmask) {
> > >
> > > if (!running[i]) {
> > > case 1:
> > >
> > > running[n]=1
> > >
> > > (cpuN does not see
> > > flush_on_enter set,
> > > guest later finds it
> > > running and sends ipi,
> > > we are fine here, need
> > > to clear the flag on
> > > guest_exit)
> > >
> > > flush_on_enter[i] = 1;
> > > case2:
> > >
> > > running[n]=1
> > > (cpuN - will see flush
> > > on enter and an IPI as
> > > well - addressed in patch-4)
> > >
> > > if (!running[i])
> > > cpu_clear(f->flushmask); All is well, vm_enter
> > > will do the fixup
> > > }
> > > case 3:
> > > running[n] = 0;
> > >
> > > (cpuN went to sleep,
> > > we saw it as awake,
> > > ipi sent, but wait
> > > will break without
> > > zero_mask and goto
> > > again will take care)
> > >
> > > }
> > > send_ipi(f->flushmask)
> > >
> > > wait_a_while_for_zero_mask();
> > >
> > > if (!zero_mask)
> > > goto again;
> > > ---
> > > arch/x86/include/asm/kvm_para.h | 3 +-
> > > arch/x86/include/asm/tlbflush.h | 9 ++++++
> > > arch/x86/kernel/kvm.c | 1 +
> > > arch/x86/kvm/x86.c | 14 ++++++++-
> > > arch/x86/mm/tlb.c | 61 +++++++++++++++++++++++++++++++++++++++
> > > 5 files changed, 86 insertions(+), 2 deletions(-)
> > >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2012-06-21 12:26 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-04 5:05 [PATCH v2 0/7] KVM paravirt remote flush tlb Nikunj A. Dadhania
2012-06-04 5:06 ` [PATCH v2 1/7] KVM Guest: Add VCPU running/pre-empted state for guest Nikunj A. Dadhania
2012-06-12 22:43 ` Marcelo Tosatti
2012-06-19 6:03 ` Nikunj A Dadhania
2012-06-04 5:06 ` [PATCH v2 2/7] KVM-HV: " Nikunj A. Dadhania
2012-06-04 5:07 ` [PATCH v2 3/7] KVM: Add paravirt kvm_flush_tlb_others Nikunj A. Dadhania
2012-06-12 23:02 ` Marcelo Tosatti
2012-06-19 6:11 ` Nikunj A Dadhania
2012-06-21 12:26 ` Marcelo Tosatti [this message]
2012-07-03 7:55 ` Marcelo Tosatti
2012-07-03 8:19 ` Nikunj A Dadhania
2012-07-05 2:09 ` Marcelo Tosatti
2012-07-05 5:55 ` Nikunj A Dadhania
2012-07-06 9:47 ` Nikunj A Dadhania
2012-07-03 8:11 ` Marcelo Tosatti
2012-07-03 8:27 ` Nikunj A Dadhania
2012-06-04 5:07 ` [PATCH v2 4/7] KVM: export kvm_kick_vcpu for pv_flush Nikunj A. Dadhania
2012-06-04 5:08 ` [PATCH v2 5/7] KVM: Introduce PV kick in flush tlb Nikunj A. Dadhania
2012-07-03 8:07 ` Marcelo Tosatti
2012-07-03 8:25 ` Nikunj A Dadhania
2012-07-05 2:37 ` Marcelo Tosatti
2012-07-05 5:53 ` Nikunj A Dadhania
2012-06-04 5:08 ` [PATCH v2 6/7] kvm,x86: RCU based table free Nikunj A. Dadhania
2012-06-05 10:48 ` Stefano Stabellini
2012-06-05 11:08 ` Nikunj A Dadhania
2012-06-05 11:58 ` Stefano Stabellini
2012-06-05 13:04 ` Nikunj A Dadhania
2012-06-05 13:08 ` Peter Zijlstra
2012-06-05 13:26 ` Stefano Stabellini
2012-06-05 13:31 ` Peter Zijlstra
2012-06-05 13:41 ` Stefano Stabellini
2012-08-01 11:23 ` Stefano Stabellini
2012-08-01 12:12 ` Nikunj A Dadhania
2012-08-01 12:59 ` Stefano Stabellini
2012-06-05 15:29 ` Nikunj A Dadhania
2012-06-05 13:21 ` Stefano Stabellini
2012-06-04 5:08 ` [PATCH v2 7/7] Flush page-table pages before freeing them Nikunj A. Dadhania
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120621122633.GA3842@amt.cnet \
--to=mtosatti@redhat.com \
--cc=avi@redhat.com \
--cc=hpa@zytor.com \
--cc=jeremy@goop.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=nikunj@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=raghukt@linux.vnet.ibm.com \
--cc=vatsa@linux.vnet.ibm.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.