* [Qemu-devel] Using Linux's CPUSET for KVM VCPUs
From: Andre Przywara @ 2010-07-22 14:03 UTC (permalink / raw)
To: KVM list, QEMU devel
Hi all,
while working on NUMA host pinning, I experimented with vCPU affinity
within QEMU, but dropped it because it would complicate the code without
giving a better experience than using taskset with the monitor-provided
thread IDs, as is done currently. During that work I looked at Linux'
CPUSET implementation
(/src/linux-2.6/Documentation/cgroups/cpusets.txt).
In brief, this is a pseudo-filesystem-based, truly hierarchical
mechanism for restricting a set of processes (or threads; it uses PIDs)
to a certain subset of the machine.
Sadly we cannot leverage this for true guest NUMA memory assignment, but
it would work nicely for pinning (or not pinning) guest vCPUs. I had the
following structure in mind:
For each guest there is a new CPUSET (mkdir $CPUSET_MNT/`cat
/proc/$$/cpuset`/kvm_$guestname). One could then assign the guest's
global resources to this CPUSET.
For each vCPU there is a separate CPUSET located under this guest-global
one. This would allow easy manipulation of the pinning of vCPUs, even
from the console without any management app (although this could easily
be implemented in libvirt).
/
|
+--/ kvm_guest_01
| |
| +-- VCPU0
| |
| +-- VCPU1
|
+--/ kvm_guest_02
...
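A rough command-line sketch of what I mean (mount point, guest name,
CPU numbers and thread id are just examples; with a cgroup-mounted
cpuset the control files carry the "cpuset." prefix):

  # mount the cpuset controller (if not already mounted)
  mount -t cgroup -o cpuset none /dev/cpuset

  # per-guest cpuset: CPUs and memory node(s) the whole guest may use
  mkdir /dev/cpuset/kvm_guest_01
  echo 0-3 > /dev/cpuset/kvm_guest_01/cpuset.cpus
  echo 0   > /dev/cpuset/kvm_guest_01/cpuset.mems

  # per-vCPU cpuset below it; cpus/mems must be set before adding tasks
  mkdir /dev/cpuset/kvm_guest_01/VCPU0
  echo 2 > /dev/cpuset/kvm_guest_01/VCPU0/cpuset.cpus
  echo 0 > /dev/cpuset/kvm_guest_01/VCPU0/cpuset.mems

  # move the vCPU thread (id taken from the monitor's "info cpus") in
  echo $VCPU0_TID > /dev/cpuset/kvm_guest_01/VCPU0/tasks

Repinning vCPU0 later would then just be another write to its
cpuset.cpus file.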
What do you think about it? Is it worth implementing this?
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12
* Re: [Qemu-devel] Using Linux's CPUSET for KVM VCPUs
From: Daniel P. Berrange @ 2010-07-22 14:43 UTC (permalink / raw)
To: Andre Przywara; +Cc: QEMU devel, KVM list
On Thu, Jul 22, 2010 at 04:03:13PM +0200, Andre Przywara wrote:
> Hi all,
>
> while working on NUMA host pinning, I experimented with vCPU affinity
> within QEMU, but dropped it because it would complicate the code without
> giving a better experience than using taskset with the monitor-provided
> thread IDs, as is done currently. During that work I looked at Linux'
> CPUSET implementation
> (/src/linux-2.6/Documentation/cgroups/cpusets.txt).
> In brief, this is a pseudo-filesystem-based, truly hierarchical
> mechanism for restricting a set of processes (or threads; it uses PIDs)
> to a certain subset of the machine.
> Sadly we cannot leverage this for true guest NUMA memory assignment, but
> it would work nicely for pinning (or not pinning) guest vCPUs.
IIUC the 'cpuset.mems' tunable lets you control which NUMA node memory
allocations will come out of. It isn't as flexible as numactl policies,
since you can't request interleaving, but if you're just looking to
control node locality I think it would do.
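For example (just a sketch; the cpuset mount point and group name are
illustrative), restricting a guest's allocations to node 1 would be
something like:

  # enable page migration first so existing allocations follow the change
  echo 1 > /dev/cpuset/kvm_guest_01/cpuset.memory_migrate
  # then restrict the guest to memory node 1
  echo 1 > /dev/cpuset/kvm_guest_01/cpuset.mems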
> I had the following
> structure in mind:
> For each guest there is a new CPUSET (mkdir $CPUSET_MNT/`cat
> /proc/$$/cpuset`/kvm_$guestname). One could then assign the guest's
> global resources to this CPUSET.
> For each vCPU there is a separate CPUSET located under this guest-global
> one. This would allow easy manipulation of the pinning of vCPUs, even
> from the console without any management app (although this could easily
> be implemented in libvirt).
FYI, if you have any cgroup controllers mounted, libvirt will already
automatically create a dedicated sub-group for every guest you run.
The main reason we use cgroups is that they let us apply controls to a
group of PIDs at once (e.g. cpu.shares for all threads within QEMU,
instead of nice(2) on each individual thread). At the individual vCPU
level there are single PIDs again, so libvirt hasn't needed further
cgroup subdivision and just uses traditional Linux APIs instead.
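As a sketch of that split (the cgroup path is only illustrative of
libvirt's layout, and $VCPU0_TID stands for the thread id reported by
the monitor's "info cpus"):

  # per-guest cgroup created by libvirt: one knob covers all QEMU threads
  echo 2048 > /cgroup/cpu/libvirt/qemu/guest01/cpu.shares

  # individual vCPU pinning via the ordinary affinity API, e.g. taskset
  taskset -pc 2 $VCPU0_TID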
> /
> |
> +--/ kvm_guest_01
> | |
> | +-- VCPU0
> | |
> | +-- VCPU1
> |
> +--/ kvm_guest_02
> ...
>
> What do you think about it? Is it worth implementing this?
Having at least one cgroup per guest has certainly proved valuable for
libvirt's needs. If you're not using a mgmt API, exposing vCPUs (and
other internal QEMU threads) via named sub-cgroups could be quite
convenient.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|