From: Konrad Rzeszutek Wilk
Subject: Re: BUG: ext3 corruption in domU
Date: Fri, 24 May 2013 10:20:44 -0400
Message-ID: <20130524142044.GF3900@phenom.dumpdata.com>
References: <1366203601.25579.24.camel@zakaz.uk.xensource.com> <1366633594.22143.60.camel@zakaz.uk.xensource.com> <20130522201044.GA12372@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Anthony Sheetz
Cc: "xen-devel@lists.xen.org", Ian Campbell, Roger Pau Monne
List-Id: xen-devel@lists.xenproject.org

On Thu, May 23, 2013 at 02:19:50PM -0400, Anthony Sheetz wrote:
> On Wed, May 22, 2013 at 4:10 PM, Konrad Rzeszutek Wilk wrote:
> > On Mon, Apr 22, 2013 at 01:26:34PM +0100, Ian Campbell wrote:
> >> Konrad is on vacation this week, so it'll probably be next week before
> >> this gets looked at by him.
> >
> > And I finally got to this email in my 'vacation-mbox'
> >>
> >> Ian.
> >>
> >> On Mon, 2013-04-22 at 13:22 +0100, Anthony Sheetz wrote:
> >> > I realize folks are pretty busy, but we're still interested in getting
> >> > this problem solved, and I want to be sure it's not lost in the
> >> > shuffle.
> >> > Any chance of getting some attention for it?
> >> >
> >> > On Wed, Apr 17, 2013 at 9:00 AM, Ian Campbell wrote:
> >> > > On Tue, 2013-04-16 at 18:39 +0100, Anthony Sheetz wrote:
> >> > >> (re-sending, first message seems to have gotten lost)
> >> > >>
> >> > >> I was referred here by Ian Campbell ijc@hellion.org.uk from bugs.debian.org.
> >> > >
> >> > > I'm here too (different hat ;-)), thanks for posting it here. I've added
> >> > > some people who know about the block stuff to the CC.
> >> > >
> >> > > Guys, my suspicion is that the issue is that barriers issued by ext3
> >> > > inside the guest aren't making it all the way down the
> >> > > ext3->blkfront->blkback->lvm->dm-crypt->disk chain, leading the
> >> > > filesystem to eventually corrupt itself.
> >> > >
> >> > > The issue seems to relate to the use of dm-crypt since
> >> > > ext3->blkfront->blkback->lvm->disk is reported to work fine.
> >> > >
> >> > > However there is no problem with the local dom0 ext3 root filesystem
> >> > > which is also in the same lvm VG on the crypt device (i.e.
> >> > > ext3->lvm->dm-crypt->disk), so it's not purely a dm-crypt issue. I figure
> >> > > something is up at the blkfront->blkback link, such that the barriers
> >> > > which blkback is injecting into the block subsystem either don't make it
> >> > > to the dm-crypt layer or do not DTRT once they arrive.
> >> > >
> >> > > I'm not really sure how to proceed (or how to ask Anthony to
> >> > > proceed) with verifying any part of that hypothesis though.
> >> > >
> >> > > ISTR issues with old vs new style barriers or barriers with no data in
> >> > > them or something, could this be related to that? (or am I thinking of
> >> > > DISCARD?)
> >
> > You are using two different kernel versions. The 2.6.32 domU is only using
> > WRITE_BARRIERs, while in the 3.2 kernels those have been completely eliminated.
> > The mechanism they use is called 'WRITE_FLUSH'. The 3.2 kernel has a patch:
> > commit 29bde093787f3bdf7b9b4270ada6be7c8076e36b
> > Author: Konrad Rzeszutek Wilk
> > Date: Mon Oct 10 00:42:22 2011 -0400
> >
> > xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
> >
> >
> > which emulates the barrier request by draining all of the outstanding I/Os and then
> > sending the WRITE_FLUSH.
> >
> > But it looks like you are hitting an issue here. Just to make sure
> > that is the case, what happens if you use the _same_ kernel in both dom0 and
> > domU? Does it work then?
> >
>
> First, thank you so much for getting back to me, it's really appreciated.
> At this point I've forgotten if I did this with Wheezy on Wheezy, and
> what the result was.
> I'll have to test using the 3.2 kernel on the domU Debian Squeeze and
> get back to you. I should be able to do that early next week.

Thank you. When you do this test, could you also provide the 'xenstore-ls'
output from dom0? And the 'dmesg' output from the guest (or at least
the output of 'xl console | tee /tmp/log')? That would give me an idea if the
frontend/backend have the right negotiation parameters.

Have a good weekend!
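
P.S. For anyone wanting to check the negotiation parameters themselves, here
is a rough sketch (not part of any Xen tool) that picks the relevant keys out
of saved 'xenstore-ls' output. It assumes the usual tree layout with one
leading space per nesting level and lines of the form name = "value".
'feature-barrier' is the key named in the commit above; 'feature-flush-cache'
is assumed here to be the key advertising the newer flush mechanism. The
function name find_features is made up for this example.

```python
import re

# 'feature-barrier' comes from the commit subject quoted in this thread;
# 'feature-flush-cache' is an assumption about the matching flush key.
FEATURE_KEYS = ("feature-barrier", "feature-flush-cache")

# One xenstore-ls line: optional indentation, a name, then = "value".
LINE_RE = re.compile(r'^(\s*)(\S+) = "(.*)"')


def find_features(xenstore_ls_text, keys=FEATURE_KEYS):
    """Return {full_path: value} for every wanted key in xenstore-ls output."""
    stack = []  # path components of the branch currently being walked
    found = {}
    for line in xenstore_ls_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a name = "value" line
        depth = len(m.group(1))  # assumed: one space of indent per level
        name, value = m.group(2), m.group(3)
        del stack[depth:]        # climb back up to this line's parent
        stack.append(name)
        if name in keys:
            found["/" + "/".join(stack)] = value
    return found
```

Save the dom0 output with 'xenstore-ls > /tmp/xs.txt', read the file into a
string, and pass it to find_features(). If the result shows no barrier or
flush key under the guest's vbd backend node, that would suggest the
negotiation is where things go wrong.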