qemu-devel.nongnu.org archive mirror
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>,
	Michal Privoznik <mprivozn@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH] util: NUMA aware memory preallocation
Date: Thu, 12 May 2022 09:15:22 +0100
Message-ID: <YnzCGh3psZgK8tUw@redhat.com>
In-Reply-To: <04938ba0-7ff4-df3c-348d-b679eac4fbac@redhat.com>

On Thu, May 12, 2022 at 09:41:29AM +0200, Paolo Bonzini wrote:
> On 5/11/22 18:54, Daniel P. Berrangé wrote:
> > On Wed, May 11, 2022 at 01:07:47PM +0200, Paolo Bonzini wrote:
> > > On 5/11/22 12:10, Daniel P. Berrangé wrote:
> > > > I expect creating/deleting I/O threads is cheap in comparison to
> > > > the work done for preallocation. If libvirt is using -preconfig
> > > > and object-add to create the memory backend, then we could have
> > > > option of creating the I/O threads dynamically in -preconfig mode,
> > > > create the memory backend, and then delete the I/O threads again.
> > > 
> > > I think this is very overengineered.  Michal's patch is doing the obvious
> > > thing and if it doesn't work that's because Libvirt is trying to micromanage
> > > QEMU.
> > 
> > Calling it micromanaging is putting a very negative connotation on
> > this. What we're trying to do is enforce a host resource policy for
> > QEMU, in a way that a compromised QEMU can't escape, which is a
> > valuable protection.
> 
> I'm sorry if that was a bit exaggerated, but the negative connotation was
> intentional.
> 
> > > As mentioned on IRC, if the reason is to prevent moving around threads in
> > > realtime (SCHED_FIFO, SCHED_RR) classes, that should be fixed at the kernel
> > > level.
> > 
> > We use cgroups where it is available to us, but we don't always have
> > the freedom that we'd like.
> 
> I understand.  I'm thinking of a new flag to sched_setscheduler that fixes
> the CPU affinity and policy of the thread and prevents changing it in case
> QEMU is compromised later.  The seccomp/SELinux sandboxes can prevent
> setting the SCHED_FIFO class without this flag.
> 
> In addition, my hunch is that this works only because the RT setup of QEMU
> is not safe against priority inversion.  IIRC the iothreads are set with a
> non-realtime priority, but actually they should have a _higher_ priority
> than the CPU threads, and the thread pool I/O bound workers should have an
> even higher priority; otherwise you have a priority inversion situation
> where an interrupt is pending that would wake up the CPU, but the iothreads
> cannot process it because they have a lower priority than the CPU.

At least for RHEL deployments of KVM-RT, IIRC the expectation is that
the vCPUs with RT priority never do I/O, and that there is at least 1
additional non-RT vCPU from which the OS performs I/O. IOW, the RT
vCPU works in a completely self-contained manner with no interaction
with any other QEMU threads. If that's not the case, then you would
have to make sure those other threads have priority / scheduler
adjustments to avoid priority inversion.
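
To make the two cases concrete, here is a rough sketch of the check I have
in mind. It is only a toy model, not anything QEMU or libvirt implements:
the thread names, the ThreadPolicy type and the inversion_risks() helper
are all made up for illustration, and it ignores CPU placement entirely
(it assumes a waiting thread can starve any helper that ranks below it).

```python
from dataclasses import dataclass

# Policies treated as realtime in this toy model (names mirror the kernel's).
REALTIME_POLICIES = {"SCHED_FIFO", "SCHED_RR"}


@dataclass(frozen=True)
class ThreadPolicy:
    name: str       # hypothetical thread label, e.g. "vcpu0-rt"
    policy: str     # "SCHED_FIFO", "SCHED_RR" or "SCHED_OTHER"
    priority: int   # static RT priority; 0 for SCHED_OTHER


def rank(thread):
    """Any realtime thread outranks any non-realtime one; among realtime
    threads, the higher static priority wins."""
    if thread.policy in REALTIME_POLICIES:
        return (1, thread.priority)
    return (0, 0)


def inversion_risks(threads, depends_on):
    """Return (waiter, helper) pairs where the waiter outranks a helper
    thread that it depends on to make progress."""
    by_name = {t.name: t for t in threads}
    risks = []
    for waiter, helpers in depends_on.items():
        for helper in helpers:
            if rank(by_name[helper]) < rank(by_name[waiter]):
                risks.append((waiter, helper))
    return risks


threads = [
    ThreadPolicy("vcpu0-rt", "SCHED_FIFO", 1),
    ThreadPolicy("vcpu1", "SCHED_OTHER", 0),
    ThreadPolicy("iothread0", "SCHED_OTHER", 0),
]

# Self-contained RT vCPU: only the non-RT vCPU depends on the iothread.
self_contained = inversion_risks(threads, {"vcpu1": ["iothread0"]})

# RT vCPU doing I/O: it now depends on a lower-ranked iothread.
rt_does_io = inversion_risks(threads, {"vcpu0-rt": ["iothread0"]})
```

In the first configuration nothing is flagged. In the second, the
(vcpu0-rt, iothread0) pair is flagged, which is the situation where the
iothread would need its own priority / scheduler adjustment.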

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




Thread overview: 19+ messages
2022-05-10  6:55 [PATCH] util: NUMA aware memory preallocation Michal Privoznik
2022-05-10  9:12 ` Daniel P. Berrangé
2022-05-10 10:27   ` Dr. David Alan Gilbert
2022-05-11 13:16   ` Michal Prívozník
2022-05-11 14:50     ` David Hildenbrand
2022-05-11 15:08     ` Daniel P. Berrangé
2022-05-11 16:41       ` David Hildenbrand
2022-05-11  8:34 ` Dr. David Alan Gilbert
2022-05-11  9:20   ` Daniel P. Berrangé
2022-05-11  9:19 ` Daniel P. Berrangé
2022-05-11  9:31   ` David Hildenbrand
2022-05-11  9:34     ` Daniel P. Berrangé
2022-05-11 10:03       ` David Hildenbrand
2022-05-11 10:10         ` Daniel P. Berrangé
2022-05-11 11:07           ` Paolo Bonzini
2022-05-11 16:54             ` Daniel P. Berrangé
2022-05-12  7:41               ` Paolo Bonzini
2022-05-12  8:15                 ` Daniel P. Berrangé [this message]
2022-06-08 10:34       ` David Hildenbrand
