Re: [PATCH RFC] kvm: add PV MMIO EVENTFD

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	x86@kernel.org, Will Deacon <will.deacon@arm.com>,
	linux-kernel@vger.kernel.org,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@redhat.com>,
	kvm@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Sasha Levin <sasha.levin@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Christoffer Dall <c.dall@virtualopensystems.com>,
	virtualization@lists.linux-foundation.org,
	Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Subject: Re: [PATCH RFC] kvm: add PV MMIO EVENTFD
Date: Thu, 4 Apr 2013 14:21:57 +0300
Message-ID: <20130404112157.GB6467@redhat.com>
In-Reply-To: <C5D3F576-7567-4FCD-8819-9DC235E3513B@suse.de>

On Thu, Apr 04, 2013 at 02:09:53PM +0200, Alexander Graf wrote:
> 
> On 04.04.2013, at 13:04, Michael S. Tsirkin wrote:
> 
> > On Thu, Apr 04, 2013 at 01:57:34PM +0200, Alexander Graf wrote:
> >> 
> >> On 04.04.2013, at 12:50, Michael S. Tsirkin wrote:
> >> 
> >>> With KVM, MMIO is much slower than PIO, due to the need to
> >>> do page walk and emulation. But with EPT, it does not have to be: we
> >>> know the address from the VMCS so if the address is unique, we can look
> >>> up the eventfd directly, bypassing emulation.
> >>> 
> >>> Add an interface for userspace to specify this per-address, we can
> >>> use this e.g. for virtio.
> >>> 
> >>> The implementation adds a separate bus internally. This serves two
> >>> purposes:
> >>> - minimize overhead for old userspace that does not use PV MMIO
> >>> - minimize disruption in other code (since we don't know the length,
> >>> devices on the MMIO bus only get a valid address in write, this
> >>> way we don't need to touch all devices to teach them to handle
> >>> an invalid length)
> >>> 
> >>> At the moment, this optimization is only supported for EPT on x86 and
> >>> silently ignored for NPT and MMU, so everything works correctly but
> >>> slowly.
> >>> 
> >>> TODO: NPT, MMU and non x86 architectures.
> >>> 
> >>> The idea was suggested by Peter Anvin.  Lots of thanks to Gleb for
> >>> pre-review and suggestions.
> >>> 
> >>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> 
> >> This still uses page fault intercepts which are orders of magnitude
> >> slower than hypercalls.
> > 
> > Not really. Here's a test:
> > compare vmcall to portio:
> > 
> > vmcall 1519
> > ...
> > outl_to_kernel 1745
> > 
> > compare portio to mmio:
> > 
> > mmio-wildcard-eventfd:pci-mem 3529
> > mmio-pv-eventfd:pci-mem 1878
> > portio-wildcard-eventfd:pci-io 1846
> > 
> > So not orders of magnitude.
> 
> https://dl.dropbox.com/u/8976842/KVM%20Forum%202012/MMIO%20Tuning.pdf
> 
> Check out page 41. Higher is better (number is number of loop cycles in a second). My test system was an AMD Istanbul based box.

Wow 2x difference.  The difference seems to be much smaller now. Newer
hardware? Newer software?

> > 
> >> Why don't you just create a PV MMIO hypercall
> >> that the guest can use to invoke MMIO accesses towards the host based
> >> on physical addresses with explicit length encodings?
> >> That way you simplify and speed up all code paths, exceeding the speed
> >> of PIO exits even. It should also be quite easily portable, as all
> >> other platforms have hypercalls available as well.
> >> 
> >> 
> >> Alex
> > 
> > I sent such a patch, but maintainers seem reluctant to add hypercalls.
> > Gleb, could you comment please?
> > 
> > A fast way to do MMIO is probably useful in any case ...
> 
> Yes, but at least according to my numbers optimizing anything that is not hcalls is a waste of time.
> 
> 
> Alex

This was the implementation: 'kvm_para: add mmio word store hypercall'.
Again, I don't mind.

-- 
MST


