From: Konrad Rzeszutek Wilk
Subject: Re: xen-4.1: PV domain hanging at startup, jiffies stopped
Date: Mon, 29 Aug 2011 16:07:49 -0400
Message-ID: <20110829200749.GA17265@dumpdata.com>
References: <4E5A3F0A.8060700@mimuw.edu.pl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4E5A3F0A.8060700@mimuw.edu.pl>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Marek Marczykowski
Cc: "xen-devel@lists.xensource.com", Joanna Rutkowska
List-Id: xen-devel@lists.xenproject.org

On Sun, Aug 28, 2011 at 03:13:46PM +0200, Marek Marczykowski wrote:
> Hey,
>
> I'm experiencing a strange problem: a non-deterministic PV domain hang, only
> on some machines (with a fast SSD drive). I've tried xen-4.1.0 and
> xen-4.1.1 with many different kernels:
> VM:
> - 2.6.38.3 xenlinux based on SUSE package
> - vanilla 3.0.3
> - vanilla 3.1 rc2
> dom0:
> - 2.6.38.3 xenlinux based on SUSE package
> - vanilla 3.1 rc2
>
> The result is always the same: sometimes the VM hangs at startup, SysRq-T
> shows modprobe waiting in "wait_for_devices" (concretely schedule_timeout)
> and the jiffies counter not increasing between task-state dumps.
>
> The only thing I found that is (probably) connected with this problem is
> these domU kernel messages:
> CE: xen increased min_delta_ns to 150000 nsec
> (...)
> CE: xen increased min_delta_ns to 4000000 nsec
> CE: Reprogramming failure. Giving up
>
> These messages don't appear in a successful boot.
>
> I've also tried some options to xen and domU kernel, but without success
> (all combinations):

BTW, your 'xencons=..' and 'swiotlb=force' are obsolete. Use 'console=hvc0'
and 'iommu=soft' instead. 'swiotlb=force' kills performance.
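
As a rough sketch of that substitution: assuming the options are passed to the
domU kernel through the guest config file's 'extra' line (an assumption about
the setup, the report does not show the config), the change would look
something like this:

```
# Hypothetical domU guest config fragment; adapt to the actual config file.
# Obsolete form (the xencons value is elided in the report and left as-is):
#   extra = "xencons=.. swiotlb=force"
# Replacement options named above:
extra = "console=hvc0 iommu=soft"
```

Any other options already on that line (e.g. nohz=off) would stay as they are.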

> xen: tsc=unstable, cpufreq=none
> domU: nohz=off, clocksource=tsc
>
> Some combinations of the above options lowered the frequency of the problem
> (e.g. tsc=unstable + nohz=off), but it still happens quite often - about 1 in
> 15 boots fails.
>
> Do you have any idea what the cause is and what can help?

The problem looks to be xenwatch being stuck. So the problem is in Dom0, right?