From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mukesh Rathor <mukesh.rathor@oracle.com>
Subject: Re: dom0 hang
Date: Wed, 01 Jul 2009 20:19:31 -0700
Message-ID: <4A4C2743.5030703@oracle.com>
References: <4A426D50.80401@oracle.com>
Reply-To: mukesh.rathor@oracle.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <4A426D50.80401@oracle.com>
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: mukesh.rathor@oracle.com
Cc: ackaouy@gmail.com, Dan Magenheimer <dan.magenheimer@oracle.com>, xen-devel <xen-devel@lists.xensource.com>, andrew thomas <andrew.thomas@oracle.com>, "Kurt C. Hackel" <kurt.hackel@oracle.com>
List-Id: xen-devel@lists.xenproject.org


Mukesh Rathor wrote:
> Here are the details on the dom0 hang:
> 
> xen: 3.4.0
> dom0: 2.6.18-128
> 
> dom0.vcpu0: spinning in schedule() on spinlock: spin_lock_irq(&rq->lock);
> dom0.vcpu1: eip == ret after __HYPERVISOR_event_channel_op hypercall
> 
> Just out of curiosity, I set a breakpoint at the above ret in kdb, and it 
> never got hit. So I wondered why vcpu1 is not getting scheduled, and 
> noticed that xen.schedule always schedules vcpu0. Two cpus on the box, 
> other one is mostly in idle.
> 
> anyways, I've turned lock debugging on in dom0 and reproducing it right 
> now.
> 
> thanks,
> Mukesh
> 


Ok, here's what I have found on this:

dom0 hang:
     vcpu0 is trying to wake up a task and, in try_to_wake_up(), calls
     task_rq_lock(). Since the task has its cpu set to 1, it takes the runq
     lock for vcpu1. Next it calls resched_task(), which results in sending
     an IPI to vcpu1. For that, vcpu0 gets into the
     HYPERVISOR_event_channel_op HCALL and is waiting to return. Meanwhile,
     vcpu1 got running and is spinning on its runq lock in
     "schedule():spin_lock_irq(&rq->lock);", the lock that vcpu0 is holding
     (while it waits to return from the HCALL).

     As I had noticed before, vcpu0 never gets scheduled in xen. So
     looking further into xen:

xen:
     Both vcpus are on the same runq, in this case cpu1's. But the
     priority of vcpu1 has been set to CSCHED_PRI_TS_BOOST. As a result,
     the scheduler always picks vcpu1, and vcpu0 is starved. Also, I see in
     kdb that the scheduler timer is not set on cpu0. That timer would've
     allowed csched_load_balance() to kick in on cpu0. [Also, on
     cpu1, the accounting timer, csched_tick, is not set. Although
     csched_tick() is running on cpu0, it only checks the runq for cpu0.]
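
     To make the starvation described above concrete, here is a minimal toy
     model of it. It is not Xen code: the struct, the toy_* names, and the
     priority values are all made up for illustration. It only shows that a
     strict highest-priority picker, with a spinner stuck at a boosted
     priority, never runs the lock holder sharing its runq.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy priorities (illustrative values only); a higher value wins. */
enum { PRI_BOOST = 0, PRI_UNDER = -1 };

struct toy_vcpu {
    int  pri;       /* scheduling priority */
    bool spinning;  /* busy-waiting on the guest runq lock */
    int  runs;      /* how many times the picker chose this vcpu */
};

/* Pick the vcpu with the highest priority; ties go to the lower index. */
static int toy_pick(const struct toy_vcpu *v, int n)
{
    assert(n > 0);
    int best = 0;
    for (int i = 1; i < n; i++)
        if (v[i].pri > v[best].pri)
            best = i;
    return best;
}

/*
 * Make `slices` scheduling decisions on a single runq holding two vcpus.
 * v[0] holds the guest lock and needs one slice to return from its
 * hypercall and release it. v[1] spins on that lock and never blocks,
 * so it stays runnable at whatever priority it has.
 * Returns true if the lock was ever released.
 */
static bool toy_run(struct toy_vcpu *v, int slices)
{
    bool lock_held = true;
    for (int s = 0; s < slices; s++) {
        int cur = toy_pick(v, 2);
        v[cur].runs++;
        if (cur == 0) {
            lock_held = false;      /* holder finally ran */
            v[1].spinning = false;  /* spinner can now take the lock */
        }
    }
    return !lock_held;
}
```

     With v[0] at PRI_UNDER and v[1] at PRI_BOOST, toy_run() reports that
     the lock is never released, however many slices are simulated. The
     model ignores anything that would migrate v[0] to another pcpu.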

     Looks like c/s 19500 changed csched_schedule():

-    ret.time = MILLISECS(CSCHED_MSECS_PER_TSLICE);
+    ret.time = (is_idle_vcpu(snext->vcpu) ?
+                -1 : MILLISECS(CSCHED_MSECS_PER_TSLICE));

   The quickest fix for us would be to just back that out.
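
   As a rough illustration of why that change matters here, the sketch
   below counts how often an always-idle pcpu would re-enter its scheduler
   by timer alone, before and after the change. It is a toy, not Xen code:
   the toy_* names are hypothetical, the 30 ms slice length is an assumed
   value, and other ways of waking an idle pcpu are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_TSLICE_MS 30  /* assumed slice length, for illustration only */

/* Next scheduler timeout in ms; -1 means no timer is armed. */
static int toy_next_timeout(bool next_is_idle, bool after_change)
{
    if (after_change && next_is_idle)
        return -1;            /* idle vcpu picked: no timeout requested */
    return TOY_TSLICE_MS;     /* otherwise a full slice */
}

/*
 * Count timer-driven scheduler entries on a pcpu that only ever picks
 * its idle vcpu, over a window of `window_ms` milliseconds.
 */
static int toy_idle_sched_entries(int window_ms, bool after_change)
{
    assert(window_ms >= 0);
    int entries = 0;
    int now = 0;
    for (;;) {
        int t = toy_next_timeout(true, after_change);
        if (t < 0)
            break;            /* no timer: the scheduler is not re-entered */
        now += t;
        if (now > window_ms)
            break;
        entries++;
    }
    return entries;
}
```

   In this model, a 300 ms window gives 10 timer-driven entries with the
   old expression and none with the new one, so an idle pcpu gets no
   periodic chance to look at another pcpu's runq.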


   BTW, just a comment on the following (all in sched_credit.c):

       if ( svc->pri == CSCHED_PRI_TS_UNDER &&
            !(svc->flags & CSCHED_FLAG_VCPU_PARKED) )
       {
           svc->pri = CSCHED_PRI_TS_BOOST;
       }
   combined with
       if ( snext->pri > CSCHED_PRI_TS_OVER )
           __runq_remove(snext);

       Setting CSCHED_PRI_TS_BOOST as the pri of a vcpu seems dangerous to
       me. Since csched_schedule() never checks the time accumulated by a
       vcpu at pri CSCHED_PRI_TS_BOOST, that is the same as pinning a vcpu
       to a pcpu. If that vcpu never makes progress, the system has
       essentially lost a physical cpu. Ideally, csched_schedule() should
       always check the cpu time accumulated and reduce the priority over
       time. I can't tell right off if it already does that, or something
       like that :)... my 2 cents.
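
       One way to picture the suggestion is the toy picker below, which
       demotes a vcpu once it has used up a budget of slices at the boosted
       priority. This is only a sketch under stated assumptions, not a
       patch: the toy_* names, BOOST_LIMIT, and the priority values are
       hypothetical, and whether the real scheduler already does something
       equivalent elsewhere is not established here.

```c
#include <assert.h>

/* Toy priorities (illustrative values only); a higher value wins. */
enum { PRI_BOOST = 0, PRI_UNDER = -1 };

#define BOOST_LIMIT 1  /* hypothetical: slices a vcpu may spend at BOOST */

struct toy_vcpu {
    int pri;           /* scheduling priority */
    int boost_slices;  /* slices consumed while at PRI_BOOST */
    int last_run;      /* slice index of the last run, -1 if never run */
    int runs;          /* how many times the picker chose this vcpu */
};

/* Highest priority wins; among equals, the least recently run vcpu wins. */
static int toy_pick(const struct toy_vcpu *v, int n)
{
    assert(n > 0);
    int best = 0;
    for (int i = 1; i < n; i++) {
        if (v[i].pri > v[best].pri ||
            (v[i].pri == v[best].pri && v[i].last_run < v[best].last_run))
            best = i;
    }
    return best;
}

/*
 * Simulate up to `slices` decisions on one runq. After each slice, a
 * vcpu that has spent BOOST_LIMIT slices at PRI_BOOST drops to PRI_UNDER.
 * Returns the slice index at which v[0] first runs, or -1 if it never does.
 */
static int toy_first_run_of_vcpu0(struct toy_vcpu *v, int n, int slices)
{
    for (int s = 0; s < slices; s++) {
        int cur = toy_pick(v, n);
        v[cur].runs++;
        v[cur].last_run = s;
        if (cur == 0)
            return s;
        if (v[cur].pri == PRI_BOOST && ++v[cur].boost_slices >= BOOST_LIMIT)
            v[cur].pri = PRI_UNDER;  /* boost budget spent: demote */
    }
    return -1;
}
```

       With v[0] at PRI_UNDER and v[1] at PRI_BOOST, v[1] runs once, is
       demoted, and v[0] gets the next slice, so a boosted spinner can
       delay the lock holder but cannot starve it indefinitely.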

thanks,
Mukesh
  *** : starting 3 star campaign against overuse of macros!