Subject: Re: [PATCH 0/3] cpu: idle state framework for offline CPUs.
From: Peter Zijlstra
To: balbir@linux.vnet.ibm.com
Cc: "Brown, Len", "Darrick J. Wong", Gautham R Shenoy,
 "linux-kernel@vger.kernel.org", "Rafael J. Wysocki", Pavel Machek,
 "Pallipadi, Venkatesh", "Li, Shaohua", Ingo Molnar,
 "linuxppc-dev@lists.ozlabs.org", Len Brown
Date: Sun, 16 Aug 2009 23:53:22 +0200
Message-Id: <1250459602.8648.35.camel@laptop>
In-Reply-To: <20090816194441.GA22626@balbir.in.ibm.com>
References: <20090809120818.GA1338@ucw.cz> <200908091522.02898.rjw@sisk.pl>
 <20090810081941.GA18649@elf.ucw.cz>
 <1249950137.11545.38184.camel@localhost.localdomain>
 <20090812115806.GK24339@elf.ucw.cz> <20090812195753.GA14649@in.ibm.com>
 <20090813045931.GB14649@in.ibm.com> <20090814113021.GL32418@elf.ucw.cz>
 <20090816182629.GA31027@in.ibm.com>
 <20090816194441.GA22626@balbir.in.ibm.com>
List-Id: Linux on PowerPC Developers Mail List

On Mon, 2009-08-17 at 01:14 +0530, Balbir Singh wrote:
> * Dipankar Sarma [2009-08-16 23:56:29]:
>
> > On Fri, Aug 14, 2009 at 01:30:21PM +0200, Pavel Machek wrote:
> > > >
> > > > It depends on the hypervisor implementation. On pseries (powerpc)
> > > > hypervisor, for example, they are different. By offlining a vcpu
> > > > (and in turn shutting a cpu), you will actually create a configuration
> > > > change in the VM that is visible to other systems management tools
> > > > which may not be what the system administrator wanted. Ideally,
> > > > we would like to distinguish between these two states.
> > > >
> > > > Hope that suffices as an example.
> > >
> > > So... you have something like "physically pulling out hotplug cpu" on
> > > powerpc.
> >
> > If any system can do physical unplug, then it should do "offline"
> > with configuration changes reflected in the hypervisor and
> > other system configuration software.
> >
> > > But maybe it is useful to take already offline cpus (from linux side),
> > > and make that visible to hypervisor, too.
> > >
> > > So maybe something like "echo 1 > /sys/devices/system/cpu/cpu1/unplug"
> > > would be more useful for hypervisor case?
> >
> > On pseries, we do an RTAS call ("stop-cpu") which effectively permanently
> > de-allocates it from the VM and hands over control to the hypervisor. The
> > hypervisor may do whatever it wants, including allocating it to
> > another VM. Once gone, the original VM may not get it back, depending
> > on the situation.
> >
> > The point I am making is that we may not always want to *release*
> > the CPU to hypervisor and induce a configuration change. That needs
> > to be reflected by extending the existing user interface - hence
> > the proposal for - /sys/devices/system/cpu/cpu<#>/state and
> > /sys/devices/system/cpu/cpu<#>/available_states. It allows
> > ceding to hypervisor without de-allocating. It is a minor
> > extension of the existing interface keeping backwards compatibility
> > and platforms can allow what makes sense.
> >
>
>
> Agreed, I've tried to come up with a little ASCII art to depict your
> scenarios graphically
>
>
>         +--------+ don't need (offline)
>         |  OS    +----------->+------------+
>         +--+-----+           | hypervisor +-----> Reuse CPU
>            |                 |            |       for something
>            |                 |            |       else
>            |                 |            |   (visible to users)
>            |                 |            |    as resource changed
>            |                 +------------+
>            V (needed, but can cede)
>        +------------+
>        | hypervisor | Don't reuse CPU
>        |            |  (CPU ceded)
>        |            | give back to OS
>        +------------+ when needed.
>                         (Not visible to
>                         users as no resource
>                         binding changed)

I still don't get it...
_why_ should this be exposed in the guest kernel? Why not let the hypervisor manage a guest's offline cpus in a way it sees fit?
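
For readers following the thread, the interface being questioned is the pair of per-CPU sysfs files proposed in the quoted text, `state` and `available_states`. The sketch below shows how userspace might drive such files. It is only an illustration under stated assumptions, not code from the patch set: the state names (`online`, `offline`, `ceded`), the helper function names, and the whitespace-separated format of `available_states` are all hypothetical, and the demo runs against a throwaway directory instead of the real /sys tree.

```python
import os
import tempfile


def make_fake_cpu(root, cpu, states, current):
    """Create a stand-in for <root>/cpu<N>/ holding the two proposed files.

    Hypothetical helper: it only exists so the sketch runs without a
    kernel that carries the proposed patches.
    """
    cpu_dir = os.path.join(root, f"cpu{cpu}")
    os.makedirs(cpu_dir)
    with open(os.path.join(cpu_dir, "available_states"), "w") as f:
        f.write(" ".join(states) + "\n")
    with open(os.path.join(cpu_dir, "state"), "w") as f:
        f.write(current + "\n")
    return cpu_dir


def available_states(cpu_dir):
    """Return the states the platform advertises for this CPU.

    Assumes a whitespace-separated list; the thread does not fix a format.
    """
    with open(os.path.join(cpu_dir, "available_states")) as f:
        return f.read().split()


def current_state(cpu_dir):
    """Return the contents of the proposed `state` file, stripped."""
    with open(os.path.join(cpu_dir, "state")) as f:
        return f.read().strip()


def set_state(cpu_dir, state):
    """Request `state`, refusing anything the platform does not advertise."""
    allowed = available_states(cpu_dir)
    if state not in allowed:
        raise ValueError(f"{state!r} is not one of {allowed}")
    with open(os.path.join(cpu_dir, "state"), "w") as f:
        f.write(state + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # "ceded" stands for "needed, but can cede" in the diagram;
        # the name is an assumption made for this sketch.
        cpu1 = make_fake_cpu(root, 1, ["online", "offline", "ceded"], "online")
        set_state(cpu1, "ceded")
        print(current_state(cpu1))
```

Pointing `cpu_dir` at a real /sys/devices/system/cpu/cpuN directory would only work on a kernel that carries the proposed patches. The check against `available_states` mirrors the "platforms can allow what makes sense" part of the proposal: a platform that cannot cede a CPU would simply not advertise that state.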