From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo de Lara
Subject: [PATCH v2] doc: new sample app UG for VM power management
Date: Fri, 28 Nov 2014 16:46:42 +0000
Message-ID: <1417193202-23972-1-git-send-email-pablo.de.lara.guarch@intel.com>
References: <1417088640-7641-1-git-send-email-pablo.de.lara.guarch@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
To: dev-VfR2kkLFssw@public.gmane.org
In-Reply-To: <1417088640-7641-1-git-send-email-pablo.de.lara.guarch-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org
Sender: "dev"

This patch adds a new sample app UG, containing an explanation of
the two new sample apps added in the VM power management patchset.

Changes in v2:
Corrected svg files

Signed-off-by: Pablo de Lara
---
 .../sample_app_ug/img/vm_power_mgr_highlevel.svg   | 1173 ++++++++++++++++++++
 .../img/vm_power_mgr_vm_request_seq.svg            |  548 +++++++++
 doc/guides/sample_app_ug/index.rst                 |    5 +
 doc/guides/sample_app_ug/vm_power_management.rst   |  274 +++++
 4 files changed, 2000 insertions(+), 0 deletions(-)
 create mode 100644 doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg
 create mode 100644 doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg
 create mode 100644 doc/guides/sample_app_ug/vm_power_management.rst

diff --git a/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg b/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg
new file mode 100644
index 0000000..4b0b3b8
--- /dev/null
+++ b/doc/guides/sample_app_ug/img/vm_power_mgr_highlevel.svg
@@ -0,0 +1,1173 @@
[Figure: vm_power_mgr_highlevel.svg (Figure 22. High level Solution). Box labels: Host; Core 0 to Core 7; VM 0 with Virtual Core 0 to 3, DPDK Application, librte_power(vm), lcore channel 0 to 3; VM 1 with Virtual Core 0 to 1, DPDK Application, librte_power(vm), lcore channel 0 to 1; OS/Hypervisor; QEMU; libvirt; VM Power Monitor Application; Endpoint Monitor(lcore channels); Channel Manager; librte_power(Host); VM Power CLI; connector "Map vCPU to pCPU"; Linux “userspace” power governor /sys/devices/system/cpu/cpuN/cpufreq/.]
[Annotation "DPDK VM Application": Reuse librte_power interface, but provides a new implementation that forwards frequency set requests to host via Virtio-Serial channel. Each lcore has exclusive access to a single channel. Sample application re-uses l3fwd_power. A CLI for changing frequency from within a VM is also included.]
[Annotation "VM Power Monitor": Accepts VM Commands over Virtio Serial endpoints, monitored via epoll. Commands include the virtual core to be modified, using libvirt to get physical core mapping. Uses librte_power to effect frequency changes via Linux userspace power governor (ACPI cpufreq). CLI: For adding VM channels to monitor, inspecting and changing channel state, manually altering CPU frequency. Also allows for the changing of vCPU to pCPU pinning.]
diff --git a/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg b/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg
new file mode 100644
index 0000000..587d35d
--- /dev/null
+++ b/doc/guides/sample_app_ug/img/vm_power_mgr_vm_request_seq.svg
@@ -0,0 +1,548 @@
[Figure: vm_power_mgr_vm_request_seq.svg (Figure 23. VM request to scale frequency). Sequence diagram. Lifelines: librte_power(VM), guest_channel(VM), channel_monitor(Host), channel_manager(Host), power_manager(Host), librte_power(Host). Frame: "Loop: for each epoll event". Messages: rte_power_freq_up(), guest_channel_send_msg(), process_request, get_pcpu_mask(), pcpu_mask, scale_freq_up(pcpu_mask), rte_power_freq_up(), status.]
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index ad2ca28..d2ccbc9 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -100,6 +100,7 @@ Copyright © 2012 - 2014, Intel Corporation. All rights reserved.
     netmap_compatibility
     internet_proto_ip_pipeline
     test_pipeline
+    vm_power_management
 
 **Figures**
 
@@ -147,6 +148,10 @@ Copyright © 2012 - 2014, Intel Corporation. All rights reserved.
 
 :ref:`Figure 21.Test Pipeline Application `
 
+:ref:`Figure 22.High level Solution <figure_22>`
+
+:ref:`Figure 23.VM request to scale frequency <figure_23>`
+
 **Tables**
 
 :ref:`Table 1.Output Traffic Marking `
diff --git a/doc/guides/sample_app_ug/vm_power_management.rst b/doc/guides/sample_app_ug/vm_power_management.rst
new file mode 100644
index 0000000..7eaafd5
--- /dev/null
+++ b/doc/guides/sample_app_ug/vm_power_management.rst
@@ -0,0 +1,274 @@
+..  BSD LICENSE
+    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+VM Power Management Application
+===============================
+
+Introduction
+------------
+
+Applications running in Virtual Environments have an abstract view of the underlying hardware on the Host; in particular, applications cannot see the binding of virtual to physical hardware. When looking at CPU resourcing, the pinning of Virtual CPUs (vCPUs) to Host Physical CPUs (pCPUs) is not apparent to an application, and this pinning may change over time.
+Furthermore, Operating Systems on virtual machines do not have the ability to govern their own power policy; the Model Specific Registers (MSRs) for enabling P-State transitions are not exposed to Operating Systems running on Virtual Machines (VMs).
+
+The Virtual Machine Power Management solution shows an example of how a DPDK application can indicate its processing requirements, using VM-local information only (vCPU/lcore), to a Host-based Monitor. The Monitor is responsible for accepting requests for frequency changes for a vCPU, translating the vCPU to a pCPU via libvirt and effecting the change in frequency.
+
+The solution comprises two high-level components.
+
+1.  Example Host Application
+
+    Using a Command Line Interface (CLI) for VM->Host communication channel management, it allows for adding channels to the Monitor, setting and querying the vCPU to pCPU pinning, and inspecting and manually changing the frequency for each CPU. The CLI runs on a single lcore while the thread responsible for managing VM requests runs on a second lcore.
+    VM requests arriving on a channel for frequency changes are passed to the librte_power ACPI cpufreq sysfs based library. The Host Application relies on both qemu-kvm and libvirt to function.
+
+2.  librte_power for Virtual Machines
+
+    Using an alternate implementation for the librte_power API, requests for frequency changes are forwarded to the host monitor rather than the ACPI cpufreq sysfs interface used on the host.
+    The l3fwd-power application will use this implementation when deployed on a VM (see Chapter 11 "L3 Forwarding with Power Management Application").
+
+.. _figure_22:
+
+**Figure 22. High level Solution**
+
+|vm_power_mgr_highlevel|
+
+Overview
+--------
+
+VM Power Management employs qemu-kvm to provide communications channels between the host and VMs in the form of Virtio-Serial, which appears as a paravirtualized serial device on a VM and can be configured to use various backends on the host. For this example each Virtio-Serial endpoint on the host is configured as an AF_UNIX file socket, supporting poll/select and epoll for event notification. In this example each channel endpoint on the host is monitored via epoll for EPOLLIN events.
+Each channel is specified as qemu-kvm arguments or as libvirt XML for each VM, where each VM can have up to a maximum of 64 channels. In this example each DPDK lcore on a VM has exclusive access to a channel.
+
+To enable frequency changes from within a VM, a request via the librte_power interface is forwarded via Virtio-Serial to the host. Each request contains the vCPU and power command (scale up/down/min/max).
+The API for host and guest librte_power is consistent across environments, with the selection of VM or Host implementation determined automatically at runtime based on the environment.
+
+Upon receiving a request, the host translates the vCPU to a pCPU via the libvirt API before forwarding to the host librte_power.
+
+
+.. _figure_23:
+
+**Figure 23. VM request to scale frequency**
+
+|vm_power_mgr_vm_request_seq|
+
+Performance Considerations
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+While the Haswell microarchitecture allows for independent power control for each core, earlier microarchitectures do not offer such fine-grained control. When deployed on pre-Haswell platforms, greater care must be taken in selecting which cores are assigned to a VM; for instance, a core will not scale down until its sibling is similarly scaled.
+
+
+Configuration
+-------------
+
+BIOS
+~~~~
+
+Enhanced Intel SpeedStep® Technology must be enabled in the platform BIOS if the power management feature of Intel® DPDK is to be used. Otherwise, the sysfs directory /sys/devices/system/cpu/cpu0/cpufreq will not exist, and the CPU frequency-based power management cannot be used. Consult the relevant BIOS documentation to determine how these settings can be accessed.
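The BIOS requirement above can be sanity-checked from a shell before running the sample applications. The following is a minimal sketch, not part of DPDK: the helper name `check_cpufreq` and its optional second argument (an alternative sysfs root, useful for dry runs) are assumptions made for illustration. It relies only on the cpufreq directory named above and on the standard `scaling_driver` attribute inside it.

```shell
# Hypothetical helper (not part of DPDK): report whether the cpufreq
# sysfs directory exists for a given CPU, and which driver backs it.
#   $1 = CPU number
#   $2 = sysfs CPU root (optional, defaults to the real location)
check_cpufreq() {
    cpu="$1"
    root="${2:-/sys/devices/system/cpu}"
    dir="$root/cpu$cpu/cpufreq"

    # Without SpeedStep enabled in the BIOS this directory is absent.
    if [ ! -d "$dir" ]; then
        echo "cpu$cpu: cpufreq missing"
        return 1
    fi

    # scaling_driver names the active cpufreq driver when readable.
    driver="unknown"
    if [ -r "$dir/scaling_driver" ]; then
        driver=$(cat "$dir/scaling_driver")
    fi
    echo "cpu$cpu: cpufreq present (driver: $driver)"
}

# Check core 0 on the running system and print a hint on failure.
check_cpufreq 0 || echo "cpufreq not available: check the BIOS SpeedStep setting"
```

If the directory is present, the reported driver also shows at a glance which cpufreq driver the host is currently using.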
+
+Host Operating System
+~~~~~~~~~~~~~~~~~~~~~
+
+The Host OS must also have the *acpi_cpufreq* module installed. In some cases the *intel_pstate* driver may be the default Power Management environment. To enable *acpi_cpufreq* and disable *intel_pstate*, add the following to the grub linux command line:
+
+::
+
+  intel_pstate=disable
+
+Upon rebooting, load the *acpi_cpufreq* module:
+
+::
+
+  modprobe acpi_cpufreq
+
+
+
+Hypervisor Channel Configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Virtio-Serial channels are configured via libvirt XML:
+
+
+::
+
+  {vm_name}
+
+
+
+
+
+
+
+
+
+
+Here a single controller of type *virtio-serial* is created. Up to 32 channels can be associated with a single controller, and multiple controllers can be specified.
+The convention is to use the name of the VM in the host path *{vm_name}* and to increment *{channel_num}* for each channel; likewise, the port value *{N}* must be incremented for each channel.
+
+Each channel on the host will appear in *path*. The directory */tmp/powermonitor/* must first be created and given qemu permissions:
+
+::
+
+  mkdir /tmp/powermonitor/
+  chown qemu:qemu /tmp/powermonitor
+
+Note that files and directories within /tmp are generally removed upon
+rebooting the host, so the above steps may need to be carried out after each reboot.
+
+The serial device as it appears on a VM is configured with the *target* element attribute *name* and must be in the form of *virtio.serial.port.poweragent.{vm_channel_num}*, where *vm_channel_num* is typically the lcore channel to be used in DPDK VM applications.
+
+Each channel on a VM will be present at */dev/virtio-ports/virtio.serial.port.poweragent.{vm_channel_num}*
+
+Compiling and Running the Host Application
+------------------------------------------
+
+Compiling
+~~~~~~~~~
+
+1. export RTE_SDK=/path/to/rte_sdk
+2. cd ${RTE_SDK}/examples/vm_power_manager
+3. make
+
+Running
+~~~~~~~
+
+The application does not have any specific command line options other than *EAL*:
+
+::
+
+  ./build/vm_power_mgr [EAL options]
+
+The application requires exactly two cores to run: one core is dedicated to the CLI, while the other is dedicated to the channel endpoint monitor. For example, to run on cores 0 & 1 on a system with 4 memory channels:
+
+::
+
+  ./build/vm_power_mgr -c 0x3 -n 4
+
+After successful initialisation the user is presented with the VM Power Manager CLI:
+
+::
+
+  vm_power>
+
+Virtual Machines can now be added to the VM Power Manager:
+
+::
+
+  vm_power> add_vm {vm_name}
+
+When a {vm_name} is specified with the *add_vm* command, a lookup is performed with libvirt to ensure that the VM exists. {vm_name} is used as a unique identifier to associate channels with a particular VM and for executing operations on a VM within the CLI. VMs do not have to be running in order to add them.
+
+A number of commands can be issued via the CLI in relation to VMs:
+
+  Remove a Virtual Machine identified by {vm_name} from the VM Power Manager::
+
+    rm_vm {vm_name}
+
+  Add communication channels for the specified VM. The virtio channels must be enabled in the VM configuration (qemu/libvirt) and the associated VM must be active. {list} is a comma-separated list of channel numbers to add; using the keyword 'all' will attempt to add all channels for the VM::
+
+    add_channels {vm_name} {list}|all
+
+  Enable or disable the communication channels in {list} (comma-separated) for the specified VM; alternatively, {list} can be replaced with the keyword 'all'. Disabled channels will still receive packets on the host, however the commands they specify will be ignored. Set status to 'enabled' to begin processing requests again::
+
+    set_channel_status {vm_name} {list}|all enabled|disabled
+
+  Print to the CLI the information on the specified VM. The information lists the number of vCPUs and the pinning to pCPU(s) as a bit mask, together with any communication channels associated with the VM and the status of each channel::
+
+    show_vm {vm_name}
+
+  Set the binding of a Virtual CPU on the VM with name {vm_name} to the Physical CPU mask::
+
+    set_pcpu_mask {vm_name} {vcpu} {pcpu}
+
+  Set the binding of a Virtual CPU on the VM to the Physical CPU::
+
+    set_pcpu {vm_name} {vcpu} {pcpu}
+
+Manual control and inspection of CPU frequency scaling can also be carried out:
+
+  Get the current frequency for each core specified in the mask::
+
+    show_cpu_freq_mask {mask}
+
+  Set the current frequency for the cores specified in {core_mask} by scaling each up/down/min/max::
+
+    set_cpu_freq {core_mask} up|down|min|max
+
+  Get the current frequency for the specified core::
+
+    show_cpu_freq {core_num}
+
+  Set the current frequency for the specified core by scaling up/down/min/max::
+
+    set_cpu_freq {core_num} up|down|min|max
+
+Compiling and Running the Guest Applications
+--------------------------------------------
+
+For compiling and running l3fwd-power, see Chapter 11 "L3 Forwarding with Power Management Application".
+
+A guest CLI is also provided for validating the setup.
+
+For both l3fwd-power and the guest CLI, the channels for the VM must be monitored by the host application using the *add_channels* command on the host.
+
+
+Compiling
+~~~~~~~~~
+
+1. export RTE_SDK=/path/to/rte_sdk
+2. cd ${RTE_SDK}/examples/vm_power_manager/guest_cli
+3. make
+
+Running
+~~~~~~~
+
+The application does not have any specific command line options other than *EAL*:
+
+::
+
+  ./build/guest_vm_power_mgr [EAL options]
+
+For example purposes, the application uses a channel for each enabled lcore. For example, to run on cores 0,1,2,3 on a system with 4 memory channels:
+
+::
+
+  ./build/guest_vm_power_mgr -c 0xf -n 4
+
+
+After successful initialisation the user is presented with the VM Power Manager Guest CLI:
+
+::
+
+  vm_power(guest)>
+
+To change the frequency of an lcore, use the set_cpu_freq command, where {core_num} is the lcore and channel whose frequency is changed by scaling up/down/min/max.
+
+::
+
+  set_cpu_freq {core_num} up|down|min|max
+
+.. |vm_power_mgr_highlevel| image:: img/vm_power_mgr_highlevel.svg
+.. |vm_power_mgr_vm_request_seq| image:: img/vm_power_mgr_vm_request_seq.svg
-- 
1.7.4.1