From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dulloor Subject: Re: Remus : VM on backup not in pause state Date: Thu, 22 Jul 2010 16:40:58 -0700 Message-ID: References: <20100722214913.GE3994@kremvax.cs.ubc.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <20100722214913.GE3994@kremvax.cs.ubc.ca> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Dulloor , xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org On Thu, Jul 22, 2010 at 2:49 PM, Brendan Cully wrote: > On Thursday, 22 July 2010 at 13:45, Dulloor wrote: >> My setup is as follows : >> - xen : unstable (rev:21743) >> - Dom0 : pvops (branch : stable-2.6.32.x, >> rev:01d9fbca207ec232c758d991d66466fc6e38349e) >> - Guest Configuration : >> ------------------------------------------------------------------------= ------------------ >> kernel =3D "/usr/lib/xen/boot/hvmloader" >> builder=3D'hvm' >> name =3D "linux-hvm" >> vcpus =3D 4 >> memory =3D 2048 >> vif =3D [ 'type=3Dioemu, bridge=3Deth0, mac=3D00:1c:3e:17:22:13' ] >> disk =3D [ 'phy:/dev/XenVolG/hvm-linux-snap-1.img,hda,w' ] >> device_model =3D '/usr/lib/xen/bin/qemu-dm' >> boot=3D"cd" >> sdl=3D0 >> vnc=3D1 >> vnclisten=3D"0.0.0.0" >> vncconsole=3D0 >> vncpasswd=3D'' >> stdvga=3D0 >> superpages=3D1 >> serial=3D'pty' >> ------------------------------------------------------------------------= ------------------ >> >> - Remus command : >> # remus --no-net linux-hvm >> >> - On primary : >> # xm list >> Name =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0ID =A0 Mem VCPUs =A0 =A0 =A0State =A0 Time(s) >> linux-hvm =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A09 =A02048 =A0 =A0 4 =A0 =A0 -b-s-- =A0 =A0 10.8 >> >> - On secondary : >> # xm list >> Name =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0ID =A0 Mem VCPUs =A0 =A0 =A0State =A0 Time(s) >> linux-hvm =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 11 =A02048 =A0 =A0 4 =A0 =A0 -b---- =A0 =A0 =A01.9 >> >> >> I have to issue "xm pause/unpause" explicitly for the backup VM. >> Any recent changes ? > > This probably means there was a timeout on the replication channel, > interpreted by the backup as a failure of the primary, which caused it > to activate itself. You should see evidence of that in the remus > console logs and xend.log and daemon.log (for the disk side). > > Once you've figured out where the timeout happened it'll be easier to > figure out why. > Please find the logs attached. I didn't find anything interesting in daemon.log. What does remus log there ? I am not using disk replication, since I have issues with that .. but that's for another email :) The only visible error is in xend-secondary.log around xc_restore : [2010-07-22 16:15:37 2056] DEBUG (balloon:207) Balloon: setting dom0 target= to 5 765 MiB. [2010-07-22 16:15:37 2056] DEBUG (XendDomainInfo:1467) Setting memory targe= t of domain Domain-0 (0) to 5765 MiB. [2010-07-22 16:15:37 2056] DEBUG (XendCheckpoint:290) [xc_restore]: /usr/li= b/xen /bin/xc_restore 5 1 5 6 1 1 1 0 [2010-07-22 16:18:42 2056] INFO (XendCheckpoint:408) xc: error: Error when reading pages (11 =3D Resource temporarily unavailabl): Internal error [2010-07-22 16:18:42 2056] INFO (XendCheckpoint:408) xc: error: error when buffering batch, finishing (11 =3D Resource temporarily unavailabl): Internal error If you haven't seen this before, please let me know and I will try debugging more. -dulloor