From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: Hypervisor crash(!) on xl cpupool-numa-split
Date: Thu, 03 Feb 2011 06:57:11 +0100
Message-ID: <4D4A43B7.5040707@ts.fujitsu.com>
References: <4D41FD3A.5090506@amd.com> <201102021539.06664.stephan.diestelhorst@amd.com> <4D4974D1.1080503@ts.fujitsu.com> <201102021701.05665.stephan.diestelhorst@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <201102021701.05665.stephan.diestelhorst@amd.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Stephan Diestelhorst
Cc: George Dunlap, "Przywara, Andre", "xen-devel@lists.xensource.com", Keir Fraser, Ian Jackson
List-Id: xen-devel@lists.xenproject.org

On 02/02/11 17:01, Stephan Diestelhorst wrote:
> On Wednesday 02 February 2011 16:14:25 Juergen Gross wrote:
>> On 02/02/11 15:39, Stephan Diestelhorst wrote:
>>> We have the following theory of what happens:
>>> * some vcpus of a particular domain are currently in the process of
>>>   being moved to the new pool
>>
>> The only _vcpus_ to be moved between pools are the idle vcpus. And those
>> never contribute to accounting in credit scheduler.
>>
>> We are moving _pcpus_ only (well, moving a domain between pools actually
>> moves vcpus as well, but then the domain is paused).
>
> How do you ensure that the domain is paused and stays that way? Pausing
> the domain was what I had in mind, too...

Look at sched_move_domain() in schedule.c: I'm calling domain_pause() before
moving the vcpus and domain_unpause() after that.

>
>>> Despite the rant, it is amazing to see the ability to move running
>>> things around through this remote continuation trick! In my (ancient)
>>> balancer experiments I added hypervisor-threads just for
>>> side-stepping this issue..
>>
>> I think the easiest way to solve the problem would be to move the cpu to the
>> new pool in a tasklet. This is possible now, because tasklets are always
>> executed in the idle vcpus.
>
> Yep. That was exactly what I built. At the time stuff like that did
> not exist (2005).
>
>> OTOH I'd like to understand what is wrong with my current approach...
>
> Nothing, in fact I like it. In my rant I complained about the fact
> that splitting the critical section across this continuation looks
> scary, basically causing some generic red lights to turn on :-) And
> making reasoning about the correctness a little complicated, but that
> may well be a local issue ;-)

Perhaps you can help solve the mystery: Could you replace the BUG_ON in
sched_credit.c:389 with something like this:

    if (!is_idle_vcpu(per_cpu(schedule_data, cpu).curr)) {
        extern void dump_runq(unsigned char key);
        struct vcpu *vc = per_cpu(schedule_data, cpu).curr;

        printk("+++ (%d.%d) instead idle vcpu on cpu %d\n",
               vc->domain->domain_id, vc->vcpu_id, cpu);
        dump_runq('q');
        BUG();
    }


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
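
Illustration of the pause / move / unpause ordering mentioned above for
sched_move_domain(). This is only a toy model in plain C, not the Xen code:
every toy_* name, the struct layouts and the pause counter are assumptions
made up for the sketch. It shows nothing more than the idea that the vcpus
change pools only while the domain cannot run.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical stand-ins; these are NOT the Xen structures. */
struct toy_vcpu {
    int id;
    int pool;                 /* pool this vcpu currently belongs to */
};

struct toy_domain {
    int id;
    int pause_count;          /* > 0: no vcpu of this domain may run */
    int nr_vcpus;
    struct toy_vcpu vcpu[TOY_MAX_VCPUS];
};

static void toy_domain_pause(struct toy_domain *d)
{
    d->pause_count++;
}

static void toy_domain_unpause(struct toy_domain *d)
{
    assert(d->pause_count > 0);
    d->pause_count--;
}

static int toy_domain_runnable(const struct toy_domain *d)
{
    return d->pause_count == 0;
}

/* Move all vcpus of a domain to new_pool while the domain is paused. */
static void toy_move_domain(struct toy_domain *d, int new_pool)
{
    int i;

    toy_domain_pause(d);
    for (i = 0; i < d->nr_vcpus; i++) {
        /* The invariant the thread is about: nothing runs during the move. */
        assert(!toy_domain_runnable(d));
        d->vcpu[i].pool = new_pool;
    }
    toy_domain_unpause(d);
}

/* Small self-check; returns 0 if the move behaved as expected. */
static int toy_demo(void)
{
    struct toy_domain d = {
        .id = 1, .pause_count = 0, .nr_vcpus = 2,
        .vcpu = { { 0, 0 }, { 1, 0 } },
    };

    toy_move_domain(&d, 1);
    printf("dom %d: vcpu0 pool %d, vcpu1 pool %d, runnable %d\n",
           d.id, d.vcpu[0].pool, d.vcpu[1].pool, toy_domain_runnable(&d));
    return (d.vcpu[0].pool == 1 && d.vcpu[1].pool == 1 &&
            toy_domain_runnable(&d)) ? 0 : 1;
}
```

A counter rather than a flag is used here so that nested pauses would keep
the domain stopped until the last unpause; whether that matches the real
implementation is an assumption of this sketch.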