From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Re: Bug#637234: linux-image-3.0.0-1-686-pae: I/O errors using ext4 under xen
Date: Mon, 29 Aug 2011 10:08:50 -0400
Message-ID: <20110829140849.GA3897@dumpdata.com>
References: <20110809180728.2279.11548.reportbug@mail1.gedalya.net> <1314254828.17978.720.camel@dagon.hellion.org.uk> <20110826175317.GA5043@dumpdata.com> <4E58251A.8090108@gedalya.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <4E58251A.8090108@gedalya.net>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Gedalya
Cc: Ian Campbell , xen-devel , 637234@bugs.debian.org
List-Id: xen-devel@lists.xenproject.org

On Fri, Aug 26, 2011 at 06:58:34PM -0400, Gedalya wrote:
>
> >One way to make sure that is not the case is to disable barriers in the
> >guest. Meaning in /etc/fstab have something like this:
> >
> >/dev/xvdc /blah ext4 errors=remount-ro,barrier=0 0 1
>
> That seems to fix it. It was remounting as read only either during
> the boot process or immediately after, and now it boots up and seems
> to stay up. I'll test laster with a DomU that actually has things
> running.

Yeeey!

>
> This also fixes the reboot problem I noted earlier, init 6 now
> reboots the DomU rather than destory it.
>
> >
> >The other question is what version of Dom0 are you running? Is it 2.6.32?
> >2.6.39?
> squeeze, running linux-image-2.6.32-5-xen-amd64 2.6.32-35

Oh, I think I know _exactly_ what bug that is:

This git commit: 280802657fb95c52bb5a35d43fea60351883b2af
"xen/blkback: When writting barriers set the sector number to zero"
has to be reverted.
Specifically:

commit 3f963cae3ef35d26fdd899c08797a598c5ca3e9b
Author: Jeremy Fitzhardinge
Date:   Tue Jul 19 16:44:42 2011 -0700

    Revert "xen/blkback: When writting barriers set the sector number to zero..."

    This reverts commit 280802657fb95c52bb5a35d43fea60351883b2af. This patch
    is reported to cause disk corruption:

    From: "Huang2, Wei"

    We recently found a disk corruption issue with SLES11 SP1 guest. Basically
    the guest disk becomes non-bootable after guest shutdown. This is a SLES
    specific issue as we didn’t see on other Linux and Windows VMs. Here is
    the configuration:

    ============

    1. Xen: xen-4.1-testing, changeset 23096

    2. Dom0: Jeremy’s latest pvops 6d94b75 (June 1)

    3. VM: SLES 11 SP1, installed as physical machine with raw disk format

    ============

    Regarding the disk before corruption, “file sles11sp1.img” command read:
    “/root/guests/sles11-sp1/sles11sp1.img: x86 boot sector; partition 1:
    ID=0x82, starthead 1, startsector 63, 4208967 sectors; partition 2:
    ID=0x83, active, starthead 0, startsector 4209030, 16755795
    sectors”. After corruption, it became a data file:
    ““/root/guests/sles11-sp1/sles11sp1.img: data”.

and this one added:

25266338a41470a21e9b3974445be09e0640dda7
xen/blkback: don't fail empty barrier requests

    The sector number on empty barrier requests may (will?) be -1, which,
    given that it's being treated as unsigned 64-bit quantity, will almost
    always exceed the actual (virtual) disk's size.

    Inspired by Konrad's "When writting barriers set the sector number to
    zero...".
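
To illustrate the point made in the "don't fail empty barrier requests" description, here is a rough sketch of why a sector number of -1 fails a bounds check once it is treated as an unsigned 64-bit value. This is not the blkback code: the function names (request_in_range, barrier_acceptable, barrier_examples) and the disk size used are made up for illustration, and the "skip the check for empty requests" policy is only my reading of the patch title.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from blkback): does a request of nr_sectors
 * starting at 'sector' fit on a disk of 'disk_sectors' sectors?
 * Written so that sector + nr_sectors can never overflow.
 */
static bool request_in_range(uint64_t sector, uint64_t nr_sectors,
                             uint64_t disk_sectors)
{
    return nr_sectors <= disk_sectors &&
           sector <= disk_sectors - nr_sectors;
}

/*
 * Hypothetical policy sketch: an empty barrier carries no data, so its
 * sector number is ignored instead of being range-checked.
 */
static bool barrier_acceptable(uint64_t sector, uint64_t nr_sectors,
                               uint64_t disk_sectors)
{
    if (nr_sectors == 0)
        return true;
    return request_in_range(sector, nr_sectors, disk_sectors);
}

/* Small self-check using an arbitrary example disk size. */
static void barrier_examples(void)
{
    const uint64_t disk_sectors = 16755795;          /* arbitrary example */
    const uint64_t empty_barrier_sector = (uint64_t)-1;

    /* -1 converted to unsigned 64-bit is the largest possible value. */
    assert(empty_barrier_sector == UINT64_MAX);

    /* A plain bounds check rejects the empty barrier... */
    assert(!request_in_range(empty_barrier_sector, 0, disk_sectors));

    /* ...while a check that special-cases empty requests accepts it. */
    assert(barrier_acceptable(empty_barrier_sector, 0, disk_sectors));
}
```

Calling barrier_examples() from a test harness should pass all three assertions. Non-empty requests still go through the ordinary range check in this sketch.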