public inbox for kvm@vger.kernel.org
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Razya Ladelsky <RAZYA@il.ibm.com>
Cc: abel.gordon@gmail.com, Anthony Liguori <anthony@codemonkey.ws>,
	asias@redhat.com, digitaleric@google.com,
	Eran Raichstein <ERANRA@il.ibm.com>,
	gleb@redhat.com, jasowang@redhat.com,
	Joel Nider <JOELN@il.ibm.com>,
	kvm@vger.kernel.org, kvm-owner@vger.kernel.org,
	pbonzini@redhat.com, Stefan Hajnoczi <stefanha@gmail.com>,
	Yossi Kuperman1 <YOSSIKU@il.ibm.com>,
	Eyal Moscovici <EYALMO@il.ibm.com>,
	bsd@redhat.com
Subject: Re: Updated Elvis Upstreaming Roadmap
Date: Wed, 18 Dec 2013 12:43:50 +0200	[thread overview]
Message-ID: <20131218104350.GB6741@redhat.com> (raw)
In-Reply-To: <OFAAB02888.70858D88-ONC2257C43.00545F93-C2257C44.00376177@il.ibm.com>

On Tue, Dec 17, 2013 at 12:04:42PM +0200, Razya Ladelsky wrote:
> Hi,
> 
> Thank you all for your comments.
> I'm sorry for taking this long to reply, I was away on vacation.
> 
> It was a good, long discussion; many issues were raised, which we'd like 
> to address with the following proposed roadmap for Elvis patches.
> In general, we believe it would be best to start with patches that are 
> as simple as possible, providing the basic Elvis functionality, 
> and attend to the more complicated issues in subsequent patches.
> 
> Here's the road map for Elvis patches: 

Thanks for the follow up. Some suggestions below.
Please note the suggestions below merely represent
thoughts on merging upstream.
If, as a first step, you are content with keeping this
work as out-of-tree patches, in order to have
the freedom to experiment with interfaces and
performance, feel free to ignore them.

> 1. Shared vhost thread for multiple devices.
> 
> The way to go here, we believe, is to start with a patch having a shared 
> vhost thread for multiple devices of the SAME vm.
> The next step/patch may be handling vms belonging to the same cgroup.
> 
> Finally, we need to extend the functionality so that the shared vhost 
> thread serves multiple vms (not necessarily belonging to the same cgroup).
> 
> There was a lot of discussion about the way to address the enforcement 
> of cgroup policies, and we will consider the various solutions with a 
> future patch.

With respect to the upstream kernel,
I'm not sure a bunch of changes just for the sake of guests with
multiple virtual NIC cards makes sense.
And I wonder how this step, in isolation, will affect e.g.
multiqueue workloads.
But I guess if the numbers are convincing, this can be mergeable.

> 
> 2. Creation of vhost threads
> 
> We suggested two ways of controlling the creation and removal of vhost
> threads: 
> - statically determining the maximum number of virtio devices per worker 
>   via a kernel module parameter 
> - dynamically: a sysfs mechanism to add and remove vhost threads 
> 
> It seems that it would be simplest to take the static approach as
> a first stage. At a second stage (next patch), we'll advance to 
> dynamically changing the number of vhost threads, using the static 
> module parameter only as a default value. 

I'm not sure how independent this is from 1.
With respect to the upstream kernel,
introducing interfaces (which we'll have to maintain
forever) just for the sake of guests with
multiple virtual NIC cards does not look like a good tradeoff.

So I'm unlikely to merge this upstream without making it useful cross-VM,
and yes, this means isolation and accounting with cgroups need to
work properly.

> Regarding cmwq, it is an interesting mechanism, which we need to explore 
> further.
> At the moment we prefer not to change the vhost model to use cmwq, as some 
> of the issues that were discussed, such as cgroups, are not supported by 
> cmwq, which would add more complexity.
> However, we'll look further into it, and consider it at a later stage.

Hmm, that's still assuming some smart management tool configuring
this correctly.  Can't this be determined automatically depending
on the workload?
This is what the cmwq suggestion was really about: detect
that we need more threads and spawn them.
It's less about sharing the implementation with workqueues -
that would be very nice, but it is not a must.



> 3. Adding polling mode to vhost 
> 
> It is a good idea to make polling adaptive based on various factors such as 
> the I/O rate, the guest kick overhead (which is the tradeoff of polling), 
> or the amount of wasted cycles (cycles we kept polling but no new work was 
> added).
> However, as a first polling patch, we would prefer a naive 
> polling approach, which could be tuned with later patches.
> 

While any polling approach would still need a lot of testing to prove we
don't, for example, steal CPU from a guest which could be doing other useful
work, given that an exit costs at least 1.5K cycles, in theory it
seems like something that can improve performance.  I'm not sure how
naive we can be without introducing regressions for some workloads.
For example, if we are on the same host CPU, there's no
chance busy waiting will help us make progress.
How about detecting that the VCPU thread that kicked us
is currently running on another CPU, and only polling in
that case?

> 4. vhost statistics 
> 
> The issue that was raised for the vhost statistics was using ftrace 
> instead of the debugfs mechanism.
> However, looking further into the kvm stat mechanism, we learned that 
> ftrace didn't replace the plain debugfs mechanism, but was used in 
> addition to it.
>  
> We propose to continue using debugfs for statistics, in a manner similar 
> to kvm, and at some point in the future ftrace can be added to vhost as 
> well.

IMHO, while kvm stat is a useful script, the best tool
for perf stats is still perf. So I would try to integrate with that.
How it works internally is IMHO less important.

> Does this plan look o.k.?
> If there are no further comments, I'll start preparing the patches 
> according to what we've agreed on thus far.
> Thank you,
> Razya

I think a good place to try to start merging upstream would be 3 and 4.
So if you want to make it easier to merge things upstream, try to keep 3
and 4 independent from 1 and 2.

Thanks again,

-- 
MST

Thread overview: 13+ messages
2013-12-17 10:04 Updated Elvis Upstreaming Roadmap Razya Ladelsky
2013-12-18 10:43 ` Michael S. Tsirkin [this message]
2013-12-19  6:40   ` Abel Gordon
2013-12-19 10:13     ` Michael S. Tsirkin
2013-12-19 10:36       ` Abel Gordon
2013-12-19 11:37         ` Michael S. Tsirkin
2013-12-19 12:56           ` Abel Gordon
2013-12-19 13:48             ` Michael S. Tsirkin
2013-12-19 14:19               ` Abel Gordon
2013-12-19 14:48                 ` Michael S. Tsirkin
2013-12-24 12:50                   ` Razya Ladelsky
2013-12-24 16:21 ` Gleb Natapov
2013-12-25  7:38   ` Razya Ladelsky
