From mboxrd@z Thu Jan  1 00:00:00 1970
From: Thomas Monjalon <thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH v4 00/10] VM Power Management
Date: Tue, 14 Oct 2014 17:03:41 +0200
Message-ID: <3349663.LNtcecTXb3@xps13>
References: <1412003903-9061-1-git-send-email-alan.carew@intel.com>
 <3264386.kAdiTFhMft@xps13>
 <0E29434AEE0C3A4180987AB476A6F6306D28093B@IRSMSX109.ger.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Cc: dev-VfR2kkLFssw@public.gmane.org
To: "Carew, Alan" <alan.carew-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Return-path: <dev-bounces-VfR2kkLFssw@public.gmane.org>
In-Reply-To: <0E29434AEE0C3A4180987AB476A6F6306D28093B-kPTMFJFq+rHjxeytcECX8bfspsVTdybXVpNB7YpNyf8@public.gmane.org>
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request-VfR2kkLFssw@public.gmane.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev-VfR2kkLFssw@public.gmane.org>
List-Help: <mailto:dev-request-VfR2kkLFssw@public.gmane.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request-VfR2kkLFssw@public.gmane.org?subject=subscribe>
Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org
Sender: "dev" <dev-bounces-VfR2kkLFssw@public.gmane.org>

2014-10-14 12:37, Carew, Alan:
> > > The following patches add two DPDK sample applications and an alternate
> > > implementation of librte_power for use in virtualized environments.
> > > The idea is to provide librte_power functionality from within a VM to address
> > > the lack of MSRs to facilitate frequency changes from within a VM.
> > > It is ideally suited for Haswell which provides per core frequency scaling.
> > >
> > > The current librte_power affects frequency changes via the acpi-cpufreq
> > > 'userspace' power governor, accessed via sysfs.
> > 
> > Something kept me from looking deeper into this big codebase,
> > but I couldn't tell what felt wrong.
> > Now I realize: the real problem is that virtualization transparency is
> > broken for power management. So the right thing to do is to fix it in
> > KVM. I think this whole patchset is a huge workaround.
> > 
> > Did you try to fix it with Qemu/KVM?
> 
> When looking at the libvirt API it would seem to be a natural fit to have
> power management sitting there, so in essence I would agree.
> 
> However, with a DPDK solution it would be possible to re-use the message bus
> to pass information like device stats, application state, D-state requests
> etc. to the host and allow a management layer (e.g. OpenStack) to make
> informed decisions.

I think that management information should be transmitted over a management
channel. Such a solution should exist in OpenStack.

> Also, the scope of adding power management to qemu/KVM would be huge.
> The easier path is not always the best, and the problem of power
> management in VMs is both a DPDK problem (given that librte_power only
> worked on the host) and a general virtualization problem that would be
> better solved by those with direct knowledge of Qemu/KVM architecture
> and influence on the direction of the Qemu project.

Being a huge effort is not an argument.
Please check with the Qemu community; they will welcome it.
 
> As it stands, the host backend is simply an example application that can
> be replaced by a VMM or orchestration layer. By using Virtio-Serial it has
> obvious leanings towards Qemu, but even this could easily be swapped out for
> XenBus, IVSHMEM, IP etc.
> 
> If power management is eventually supported by hypervisors directly,
> then we could also enable the option to switch to that environment. Currently
> the librte_power implementations (VM or Host) can be selected dynamically
> (environment auto-detection) or explicitly via rte_power_set_env(), and
> adding an arbitrary number of environments is relatively easy.

Yes, you are adding a new layer to work around what the hypervisor lacks. And
this layer will handle native support once it exists. But if you implement
native support now, we don't need this extra layer.
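
For readers following the thread, the selection layer discussed above
(auto-detection or explicit choice of a power environment) can be pictured
roughly as below. This is only a sketch under assumptions: every pm_sketch_*
and PM_SKETCH_* name is hypothetical, the backends are stubs that return a
tag instead of touching sysfs or a host channel, and the "set only once"
rule is a choice made for the sketch. It is not the librte_power code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical environment IDs for illustration; not the librte_power enum. */
enum pm_env { PM_SKETCH_UNSET = 0, PM_SKETCH_HOST, PM_SKETCH_GUEST, PM_SKETCH_NENV };

/* Operations each environment has to provide. */
struct pm_ops {
    int (*probe)(void);                 /* nonzero if the environment is usable */
    int (*freq_up)(unsigned int lcore); /* ask for a higher frequency on lcore */
};

/* Stub backends. A real host backend would write to cpufreq sysfs and a
 * real guest backend would send a request to the host; here each one just
 * returns a tag (1 = host, 2 = guest) so the caller can see who handled it. */
static int host_usable; /* stays 0: pretend no cpufreq interface is visible */
static int host_probe(void) { return host_usable; }
static int host_freq_up(unsigned int lcore) { (void)lcore; return 1; }
static int guest_probe(void) { return 1; }
static int guest_freq_up(unsigned int lcore) { (void)lcore; return 2; }

static const struct pm_ops backends[PM_SKETCH_NENV] = {
    [PM_SKETCH_HOST]  = { host_probe,  host_freq_up },
    [PM_SKETCH_GUEST] = { guest_probe, guest_freq_up },
};

static enum pm_env current_env = PM_SKETCH_UNSET;

/* Explicit selection. In this sketch it fails if an environment is already set. */
int pm_sketch_set_env(enum pm_env env)
{
    if (env <= PM_SKETCH_UNSET || env >= PM_SKETCH_NENV)
        return -1;
    if (current_env != PM_SKETCH_UNSET)
        return -1;
    current_env = env;
    return 0;
}

/* Forget the current selection so that another one can be made. */
void pm_sketch_unset_env(void)
{
    current_env = PM_SKETCH_UNSET;
}

/* Auto-detection: pick the first backend whose probe succeeds. */
static int pm_sketch_autodetect(void)
{
    for (int e = PM_SKETCH_HOST; e < PM_SKETCH_NENV; e++) {
        if (backends[e].probe()) {
            current_env = (enum pm_env)e;
            return 0;
        }
    }
    return -1;
}

/* Public entry point: dispatch to whichever environment is active,
 * auto-detecting one first if none was selected explicitly. */
int pm_sketch_freq_up(unsigned int lcore)
{
    if (current_env == PM_SKETCH_UNSET && pm_sketch_autodetect() != 0)
        return -1;
    assert(backends[current_env].freq_up != NULL);
    return backends[current_env].freq_up(lcore);
}
```

With such a table, native hypervisor support would be one more entry next to
the host and guest ones, which is the extra layer being debated here.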

> I hope this helps to clarify the approach.

Thanks for your explanation.

-- 
Thomas