From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>,
Michal Privoznik <mprivozn@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH] util: NUMA aware memory preallocation
Date: Thu, 12 May 2022 09:15:22 +0100
Message-ID: <YnzCGh3psZgK8tUw@redhat.com>
In-Reply-To: <04938ba0-7ff4-df3c-348d-b679eac4fbac@redhat.com>

On Thu, May 12, 2022 at 09:41:29AM +0200, Paolo Bonzini wrote:
> On 5/11/22 18:54, Daniel P. Berrangé wrote:
> > On Wed, May 11, 2022 at 01:07:47PM +0200, Paolo Bonzini wrote:
> > > On 5/11/22 12:10, Daniel P. Berrangé wrote:
> > > > I expect creating/deleting I/O threads is cheap in comparison to
> > > > the work done for preallocation. If libvirt is using -preconfig
> > > > and object-add to create the memory backend, then we could have
> > > > option of creating the I/O threads dynamically in -preconfig mode,
> > > > create the memory backend, and then delete the I/O threads again.
> > >
> > > I think this is very overengineered. Michal's patch is doing the obvious
> > > thing and if it doesn't work that's because Libvirt is trying to micromanage
> > > QEMU.
> >
> > Calling it micromanaging is putting a very negative connotation on
> > this. What we're trying to do is enforce a host resource policy for
> > QEMU, in a way that a compromised QEMU can't escape, which is a
> > valuable protection.
>
> I'm sorry if that was a bit exaggerated, but the negative connotation was
> intentional.
>
> > > As mentioned on IRC, if the reason is to prevent moving around threads in
> > > realtime (SCHED_FIFO, SCHED_RR) classes, that should be fixed at the kernel
> > > level.
> >
> > We use cgroups where it is available to us, but we don't always have
> > the freedom that we'd like.
>
> I understand. I'm thinking of a new flag to sched_setscheduler that fixes
> the CPU affinity and policy of the thread and prevents changing it in case
> QEMU is compromised later. The seccomp/SELinux sandboxes can prevent
> setting the SCHED_FIFO class without this flag.
>
> In addition, my hunch is that this works only because the RT setup of QEMU
> is not safe against priority inversion. IIRC the iothreads are set with a
> non-realtime priority, but actually they should have a _higher_ priority
> than the CPU threads, and the thread pool I/O bound workers should have an
> even higher priority; otherwise you have a priority inversion situation
> where an interrupt is pending that would wake up the CPU, but the iothreads
> cannot process it because they have a lower priority than the CPU.

At least for RHEL deployments of KVM-RT, IIRC the expectation is that
the vCPUs with RT priority never do I/O, and that there is at least 1
additional non-RT vCPU from which the OS performs I/O. IOW, the RT
vCPU works in a completely self-contained manner with no interaction
with any other QEMU threads. If that's not the case, then you would
have to make sure those other threads have priority / scheduler
adjustments to avoid priority inversion.
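
For illustration only, a minimal sketch of what such priority
adjustments could look like, following the ordering suggested in the
quoted text (iothreads above the RT vCPUs, thread pool workers above
the iothreads). The helper names plan_priorities and apply_priority
are hypothetical and are not taken from QEMU or libvirt; the 1..99
range is the SCHED_FIFO priority range on Linux.

```python
# Sketch under stated assumptions: the ordering comes from the discussion
# above; nothing here reflects actual QEMU or libvirt behaviour.
import os

FIFO_MIN, FIFO_MAX = 1, 99  # SCHED_FIFO priority range on Linux


def plan_priorities(vcpu_prio):
    """Pick SCHED_FIFO priorities so I/O is never starved by an RT vCPU."""
    # Two levels of headroom are needed above the vCPU priority.
    if not FIFO_MIN <= vcpu_prio <= FIFO_MAX - 2:
        raise ValueError("no headroom above the vCPU priority")
    return {
        "vcpu": vcpu_prio,
        "iothread": vcpu_prio + 1,  # must be able to preempt the vCPU
        "worker": vcpu_prio + 2,    # thread pool workers rank highest
    }


def apply_priority(tid, prio):
    """Apply SCHED_FIFO to one thread ID (Linux, needs CAP_SYS_NICE)."""
    os.sched_setscheduler(tid, os.SCHED_FIFO, os.sched_param(prio))
```

A management layer would call plan_priorities() once and then
apply_priority() for each thread ID it wants to adjust. None of this
is needed when the RT vCPUs are fully self-contained as described
above.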
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Thread overview: 19+ messages
2022-05-10 6:55 [PATCH] util: NUMA aware memory preallocation Michal Privoznik
2022-05-10 9:12 ` Daniel P. Berrangé
2022-05-10 10:27 ` Dr. David Alan Gilbert
2022-05-11 13:16 ` Michal Prívozník
2022-05-11 14:50 ` David Hildenbrand
2022-05-11 15:08 ` Daniel P. Berrangé
2022-05-11 16:41 ` David Hildenbrand
2022-05-11 8:34 ` Dr. David Alan Gilbert
2022-05-11 9:20 ` Daniel P. Berrangé
2022-05-11 9:19 ` Daniel P. Berrangé
2022-05-11 9:31 ` David Hildenbrand
2022-05-11 9:34 ` Daniel P. Berrangé
2022-05-11 10:03 ` David Hildenbrand
2022-05-11 10:10 ` Daniel P. Berrangé
2022-05-11 11:07 ` Paolo Bonzini
2022-05-11 16:54 ` Daniel P. Berrangé
2022-05-12 7:41 ` Paolo Bonzini
2022-05-12 8:15 ` Daniel P. Berrangé [this message]
2022-06-08 10:34 ` David Hildenbrand