public inbox for kvm@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: Alexander Graf <agraf@suse.de>
Cc: Sasha Levin <levinsasha928@gmail.com>,
	penberg@kernel.org, kvm@vger.kernel.org, asias.hejun@gmail.com,
	gorcunov@gmail.com, prasadjoshi124@gmail.com
Subject: Re: [PATCH] kvm tools: Add MMIO coalescing support
Date: Sat, 4 Jun 2011 12:47:12 +0200	[thread overview]
Message-ID: <20110604104712.GG16292@elte.hu> (raw)
In-Reply-To: <FCAE4C25-E073-4193-9C52-E202E0DE76CC@suse.de>


* Alexander Graf <agraf@suse.de> wrote:

> 
> On 04.06.2011, at 12:35, Ingo Molnar wrote:
> 
> > 
> > * Sasha Levin <levinsasha928@gmail.com> wrote:
> > 
> >> On Sat, 2011-06-04 at 12:17 +0200, Ingo Molnar wrote:
> >>> * Sasha Levin <levinsasha928@gmail.com> wrote:
> >>> 
> >>>> On Sat, 2011-06-04 at 11:38 +0200, Ingo Molnar wrote:
> >>>>> * Sasha Levin <levinsasha928@gmail.com> wrote:
> >>>>> 
> >>>>>> Coalescing MMIO allows us to avoid an exit every time we have an
> >>>>>> MMIO write; instead, MMIO writes are coalesced into a ring which
> >>>>>> can be flushed once an exit for a different reason is needed.
> >>>>>> An MMIO exit is also triggered once the ring is full.
> >>>>>> 
> >>>>>> Coalesce all MMIO regions registered in the MMIO mapper.
> >>>>>> Add a coalescing handler under kvm_cpu.
> >>>>> 
> >>>>> Does this have any effect on latency? I.e. does the guest side 
> >>>>> guarantee that the pending queue will be flushed after a group of 
> >>>>> updates has been done?
> >>>> 
> >>>> There's nothing that detects groups of MMIO writes, but the ring size is
> >>>> a bit less than PAGE_SIZE (half of it is overhead - rest is data) and
> >>>> we'll exit once the ring is full.
> >>> 
> >>> But if the page is only filled partially and no further mmio is 
> >>> submitted by the guest for an indefinite time (say it runs a lot of 
> >>> user-space code) then the mmio remains pending in the partial-page 
> >>> buffer?
> >> 
> >> We flush the ring on any exit from the guest, not just MMIO exit.
> >> But yes, from what I understand from the code - if the buffer is only
> >> partially full and we don't take an exit, the buffer doesn't get back to
> >> the host.
> >> 
> >> ioeventfds and such are making exits less common, so yes - it's possible
> >> we won't have an exit for a while.
> >> 
> >>> If that's how it works then i *really* don't like this, this looks 
> >>> like a seriously mis-designed batching feature which might have 
> >>> improved a few server benchmarks but which will introduce random, 
> >>> hard to debug delays all over the place!
> > 
> > The proper way to implement batching is not to do it blindly like 
> > here, but to do what we do in the TLB coalescing/gather code in the 
> > kernel:
> > 
> > 	gather();
> > 
> > 	... submit individual TLB flushes ...
> > 
> > 	flush();
> > 
> > That's how it should be done here too: each virtio driver that issues 
> 
> The world doesn't consist of virtio drivers. It also doesn't 
> consist of only OSs and drivers that we control 100%.

So? I only inquired about latencies, asking what the impact on latency 
is. Regardless of the circumstances we do not want to introduce 
unbounded latencies.

If there are no unbounded latencies then i'm happy.

> > a group of MMIOs should first start batching, then issue the 
> > individual MMIOs and then flush them.
> > 
> > That can be simplified to leave out the gather() phase, i.e. just 
> > issue batched MMIOs and flush them before exiting the virtio 
> > (guest side) driver routines.
> 
> This acceleration is done to speed up the host kernel<->userspace 
> side.

Yes.

> [...] It's completely independent from the guest. [...]

Well, since user-space gets the MMIOs only once the guest exits it's 
not independent, is it?

> [...] If you want to have the guest communicate fast, create an 
> asynchronous ring and process that. And that's what virtio already 
> does today.
>
> > KVM_CAP_COALESCED_MMIO is a shortcut hack in its current form and 
> > it looks completely unsafe.
> 
> I haven't tracked the history of it, but I always assumed it was 
> used for repz mov instructions where we already know the size of 
> mmio transactions.

That's why i asked what the effect on latencies is. If there's no 
negative effect then i'm a happy camper.

Thanks,

	Ingo


Thread overview: 15+ messages
2011-06-03 19:51 [PATCH] kvm tools: Add MMIO coalescing support Sasha Levin
2011-06-04  9:38 ` Ingo Molnar
2011-06-04 10:14   ` Sasha Levin
2011-06-04 10:17     ` Ingo Molnar
2011-06-04 10:28       ` Sasha Levin
2011-06-04 10:35         ` Ingo Molnar
2011-06-04 10:39           ` Alexander Graf
2011-06-04 10:47             ` Ingo Molnar [this message]
2011-06-04 10:54               ` Alexander Graf
2011-06-04 11:27                 ` Ingo Molnar
2011-06-04 11:53                   ` Alexander Graf
2011-06-04 14:46                     ` Ingo Molnar
2011-06-04 15:22                       ` Alexander Graf
2011-06-04 16:34                         ` Ingo Molnar
2011-06-04 16:50                           ` Sasha Levin
