From: Sean Christopherson <seanjc@google.com>
To: Vineeth Remanan Pillai <vineeth@bitbyteword.org>
Cc: Ben Segall <bsegall@google.com>, Borislav Petkov <bp@alien8.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
"H . Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>, Mel Gorman <mgorman@suse.de>,
Paolo Bonzini <pbonzini@redhat.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Valentin Schneider <vschneid@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Suleiman Souhlal <suleiman@google.com>,
Masami Hiramatsu <mhiramat@google.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, Tejun Heo <tj@kernel.org>,
Josh Don <joshdon@google.com>, Barret Rhoden <brho@google.com>,
David Vernet <dvernet@meta.com>,
Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [RFC PATCH 0/8] Dynamic vcpu priority management in kvm
Date: Thu, 14 Dec 2023 12:13:34 -0800
Message-ID: <ZXth7hu7jaHbJZnj@google.com>
In-Reply-To: <CAO7JXPihjjko6qe8tr6e6UE=L7uSR6AACq1Zwg+7n95s5A-yoQ@mail.gmail.com>
On Thu, Dec 14, 2023, Vineeth Remanan Pillai wrote:
> On Thu, Dec 14, 2023 at 11:38 AM Sean Christopherson <seanjc@google.com> wrote:
> Now when I think about it, the implementation seems to
> suggest that we are putting policies in kvm. Ideally, the goal is:
> - guest scheduler communicates the priority requirements of the workload
> - kvm applies the priority to the vcpu task.
Why? Tasks are tasks, why does KVM need to get involved? E.g. if the problem
is that userspace doesn't have the right knobs to adjust the priority of a task
quickly and efficiently, then wouldn't it be better to solve that problem in a
generic way?
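
For illustration only (this sketch is not from the thread): host userspace can
already change the scheduling class of an individual vCPU thread with the
standard sched_setscheduler(2) call, since on Linux passing a thread ID affects
just that thread. The helper names, the choice of SCHED_FIFO, and the priority
value 1 are assumptions made for the example, not something the RFC or this
reply specifies.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical VMM-side policy result; not part of any ABI. */
struct vcpu_sched_choice {
	int policy;	/* SCHED_OTHER or SCHED_FIFO */
	int priority;	/* 0 for SCHED_OTHER, 1..99 for SCHED_FIFO */
};

/* Pure policy decision: map a guest "boost" hint to host settings. */
static struct vcpu_sched_choice choose_vcpu_sched(bool boost_requested)
{
	struct vcpu_sched_choice c;

	if (boost_requested) {
		c.policy = SCHED_FIFO;
		c.priority = 1;		/* assumed value, lowest RT priority */
	} else {
		c.policy = SCHED_OTHER;
		c.priority = 0;
	}
	return c;
}

/*
 * Apply the choice to one vCPU thread, identified by its thread ID.
 * Returns 0 on success, -1 with errno set (e.g. EPERM when the caller
 * lacks the privilege to use a real-time policy).
 */
static int apply_vcpu_sched(pid_t vcpu_tid, bool boost_requested)
{
	struct vcpu_sched_choice c = choose_vcpu_sched(boost_requested);
	struct sched_param param = { .sched_priority = c.priority };

	assert(c.priority >= 0 && c.priority <= 99);
	return sched_setscheduler(vcpu_tid, c.policy, &param);
}
```

A VMM would call apply_vcpu_sched() with the thread ID of the vCPU thread each
time the hint changes. Whether a syscall per change is quick enough is exactly
the "right knobs" question raised above.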
> - Now that vcpu is appropriately prioritized, host scheduler can make
> the right choice of picking the next best task.
>
> We have an exception of proactive boosting for interrupts/nmis. I
> don't expect these proactive boosting cases to grow. And I think this
> should also be controlled by the guest, where the guest can say in which
> scenarios it would like to be proactively boosted.
>
> That would make kvm just a medium to communicate the scheduler
> requirements from guest to host and not house any policies. What do
> you think?
...
> > Pushing the scheduling policies to host userspace would allow for far more control
> > and flexibility. E.g. a heavily paravirtualized environment where host userspace
> > knows *exactly* what workloads are being run could have wildly different policies
> > than an environment where the guest is a fairly vanilla Linux VM that has received
> > a small amount of enlightenment.
> >
> > Lastly, if the concern/argument is that userspace doesn't have the right knobs
> > to (quickly) boost vCPU tasks, then the proposed sched_ext functionality seems
> > tailor made for the problems you are trying to solve.
> >
> > https://lkml.kernel.org/r/20231111024835.2164816-1-tj%40kernel.org
> >
> You are right, sched_ext is a good choice to have policies
> implemented. In our case, we would need a communication mechanism as
> well and hence we thought kvm would work best to be a medium between
> the guest and the host.
Making KVM be the medium may be convenient and the quickest way to get a PoC
out the door, but effectively making KVM a middle-man is going to be a huge net
negative in the long term. Userspace can communicate with the guest just as
easily as KVM, and if you make KVM the middle-man, then you effectively *must*
define a relatively rigid guest/host ABI.
If instead the contract is between host userspace and the guest, the ABI can be
much more fluid, e.g. if you (or any setup) can control at least some amount of
code that runs in the guest, then the contract between the guest and host doesn't
even need to be formally defined, it could simply be a matter of bundling host
and guest code appropriately.
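
As a purely hypothetical sketch of such a bundled contract (none of these names
or fields come from the RFC or from any existing spec), a guest agent and the
VMM could share one small header that defines the hint and its wire encoding.
The transport, e.g. a vsock or virtio-serial channel, is left out and is
likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hint sent by a guest agent to host userspace. */
struct pv_sched_hint {
	uint32_t vcpu_index;	/* which vCPU the hint applies to */
	uint32_t boost;		/* 0 = normal, 1 = latency sensitive */
};

#define PV_SCHED_HINT_WIRE_SIZE 8

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit value stored in little-endian byte order. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Guest side: serialize a hint into a fixed 8-byte message. */
static void pv_sched_hint_encode(const struct pv_sched_hint *h,
				 uint8_t out[PV_SCHED_HINT_WIRE_SIZE])
{
	put_le32(out, h->vcpu_index);
	put_le32(out + 4, h->boost);
}

/*
 * Host side: parse a message, rejecting wrong sizes and unknown boost
 * values. Returns 0 on success, -1 on a malformed message.
 */
static int pv_sched_hint_decode(const uint8_t *buf, size_t len,
				struct pv_sched_hint *h)
{
	if (len != PV_SCHED_HINT_WIRE_SIZE)
		return -1;
	h->vcpu_index = get_le32(buf);
	h->boost = get_le32(buf + 4);
	return h->boost <= 1 ? 0 : -1;
}
```

Because both sides are built from the same header and shipped together, the
layout can change from one release to the next without any kernel involvement.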
If you want to land support for a given contract in upstream repositories, e.g.
to broadly enable paravirt scheduling support across a variety of userspace VMMs
and/or guests, then yeah, you'll need a formal ABI. But that's still not a good
reason to have KVM define the ABI. Doing it in KVM might be a wee bit easier because
it's largely just a matter of writing code, and LKML provides a centralized channel
for getting buy-in from all parties. But defining an ABI that's independent of the
kernel is absolutely doable, e.g. see the many virtio specs.
I'm not saying KVM can't help, e.g. if there is information that is known only
to KVM, but the vast majority of the contract doesn't need to be defined by KVM.