public inbox for linux-kernel@vger.kernel.org
* ext3 filesystem corruption in user mode linux
@ 2010-09-24  4:14 Chris Frey
  2010-09-24  6:05 ` Chris Frey
  2010-09-27 20:54 ` Chris Frey
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Frey @ 2010-09-24  4:14 UTC (permalink / raw)
  To: linux-kernel

Hi,

I'm running a stock 2.6.35.4 kernel, on both the host and the guest.

For one test, I've loaded a stage3 gentoo system including portage,
created completely on the host, using dd, a 10gig sparse virtual disk,
mkfs.ext3, and the gentoo tarballs.  (I get fs corruption with Ubuntu
guests as well).
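
For anyone trying to reproduce the setup described above, here is a minimal sketch of how such a sparse image could be created on the host.  The helper names, the image file name and the mount point are assumptions for illustration; the exact commands used for the original image are not given in this message.

```shell
#!/bin/sh
# Sketch only: helper names and paths are hypothetical, not from the original post.

# make_sparse_image IMG SIZE_MB
# Create a sparse file: count=0 copies no data and seek sets the file length,
# so the image has a large apparent size but occupies almost no disk blocks.
make_sparse_image() {
    dd if=/dev/zero of="$1" bs=1M count=0 seek="$2" 2>/dev/null
}

# apparent_bytes IMG
# Print the file length in bytes (the apparent size, not the allocated size).
apparent_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Example (commented out; formatting and mounting need e2fsprogs and root):
#   make_sparse_image root.img 10240     # 10 GiB apparent size
#   mkfs.ext3 -F -q root.img             # -F: accept a regular file as the target
#   mount -o loop root.img /mnt/guest    # then unpack the gentoo tarballs into it
```

Comparing `du -k root.img` with `ls -l root.img` shows how much of the image is actually allocated on the host.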

If I do some heavy CPU and disk work in the guest, something like this:

	# (cd /usr && tar cjf - portage) | tar xjf -
	# rm -rf portage

Then the filesystem corrupts itself, giving errors like the following
in dmesg, during the rm:

    EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 51715

I've posted my kernel configs online here:

	Host: http://foursquare.net/kernel/host-config.txt
	UML:  http://foursquare.net/kernel/uml-config.txt

I'm not sure what the next debugging step should be.  I can test different
versions of UML kernels, or test patches, if more testing is needed.

Thanks,
- Chris



* Re: ext3 filesystem corruption in user mode linux
  2010-09-24  4:14 ext3 filesystem corruption in user mode linux Chris Frey
@ 2010-09-24  6:05 ` Chris Frey
  2010-09-24  6:44   ` Chris Frey
  2010-09-27 20:54 ` Chris Frey
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Frey @ 2010-09-24  6:05 UTC (permalink / raw)
  To: linux-kernel

On Fri, Sep 24, 2010 at 12:14:10AM -0400, Chris Frey wrote:
> If I do some heavy CPU and disk work in the guest, something like this:
> 
> 	# (cd /usr && tar cjf - portage) | tar xjf -
> 	# rm -rf portage
> 
> Then the filesystem corrupts itself, giving errors like the following
> in dmesg, during the rm:
> 
>     EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 51715


Update:

So far I have been unable to make 2.6.32.21 corrupt its filesystem
when running as the UML kernel (Host 2.6.35.4, UML 2.6.32.21)

When running UML 2.6.33.7, I get similar errors:

EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 21397
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 13257
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 13257
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 21403
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 21403
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 13262
EXT3-fs error (device ubda): ext3_lookup: deleted inode referenced: 13262
EXT3-fs error (device ubda): htree_dirblock_to_tree: bad entry in directory #13269: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
EXT3-fs (ubda): warning: empty_dir: bad directory (dir #13269) - no `.' or `..'
EXT3-fs (ubda): warning: ext3_rmdir: empty directory has nlink!=2 (3)


If anyone wants me to test a specific UML kernel, please let me know.

- Chris



* Re: ext3 filesystem corruption in user mode linux
  2010-09-24  6:05 ` Chris Frey
@ 2010-09-24  6:44   ` Chris Frey
  2010-09-27 21:04     ` richard -rw- weinberger
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Frey @ 2010-09-24  6:44 UTC (permalink / raw)
  To: linux-kernel

On Fri, Sep 24, 2010 at 02:05:57AM -0400, Chris Frey wrote:
> So far I have been unable to make 2.6.32.21 corrupt its filesystem
> when running as the UML kernel (Host 2.6.35.4, UML 2.6.32.21)

Spoke too soon... 2.6.32.21 has errors too, if pushed hard enough.

- Chris



* Re: ext3 filesystem corruption in user mode linux
  2010-09-24  4:14 ext3 filesystem corruption in user mode linux Chris Frey
  2010-09-24  6:05 ` Chris Frey
@ 2010-09-27 20:54 ` Chris Frey
  2010-09-27 21:00   ` Randy Dunlap
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Frey @ 2010-09-27 20:54 UTC (permalink / raw)
  To: linux-kernel

On Fri, Sep 24, 2010 at 12:14:10AM -0400, Chris Frey wrote:
> I'm not sure what the next debugging step should be.  I can test different
> versions of UML kernels, or test patches, if more testing is needed.

Is this the wrong place to post about user mode linux?  I searched
but I don't see a mailing list specific to ARCH=um.

Thanks,
- Chris



* Re: ext3 filesystem corruption in user mode linux
  2010-09-27 20:54 ` Chris Frey
@ 2010-09-27 21:00   ` Randy Dunlap
  0 siblings, 0 replies; 11+ messages in thread
From: Randy Dunlap @ 2010-09-27 21:00 UTC (permalink / raw)
  To: Chris Frey; +Cc: linux-kernel


On Mon, September 27, 2010 1:54 pm, Chris Frey wrote:
> On Fri, Sep 24, 2010 at 12:14:10AM -0400, Chris Frey wrote:
>
>> I'm not sure what the next debugging step should be.  I can test
>> different versions of UML kernels, or test patches, if more testing is
>> needed.
>
> Is this the wrong place to post about user mode linux?  I searched
> but I don't see a mailing list specific to ARCH=um.

I would expect here to be OK, but the MAINTAINERS file also lists
2 mailing lists and one web site:

L:	user-mode-linux-devel@lists.sourceforge.net
L:	user-mode-linux-user@lists.sourceforge.net
W:	http://user-mode-linux.sourceforge.net

-- 
~Randy



* Re: ext3 filesystem corruption in user mode linux
  2010-09-24  6:44   ` Chris Frey
@ 2010-09-27 21:04     ` richard -rw- weinberger
  2010-09-27 21:18       ` Chris Frey
  0 siblings, 1 reply; 11+ messages in thread
From: richard -rw- weinberger @ 2010-09-27 21:04 UTC (permalink / raw)
  To: Chris Frey; +Cc: linux-kernel

On Fri, Sep 24, 2010 at 8:44 AM, Chris Frey <cdfrey@foursquare.net> wrote:
> On Fri, Sep 24, 2010 at 02:05:57AM -0400, Chris Frey wrote:
>> So far I have been unable to make 2.6.32.21 corrupt its filesystem
>> when running as the UML kernel (Host 2.6.35.4, UML 2.6.32.21)
>
> Spoke too soon... 2.6.32.21 has errors too, if pushed hard enough.
>

can you do a git bisect?

-- 
Cheers,
//richard


* Re: ext3 filesystem corruption in user mode linux
  2010-09-27 21:04     ` richard -rw- weinberger
@ 2010-09-27 21:18       ` Chris Frey
  2010-09-27 21:25         ` richard -rw- weinberger
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Frey @ 2010-09-27 21:18 UTC (permalink / raw)
  To: richard -rw- weinberger; +Cc: linux-kernel

On Mon, Sep 27, 2010 at 11:04:07PM +0200, richard -rw- weinberger wrote:
> On Fri, Sep 24, 2010 at 8:44 AM, Chris Frey <cdfrey@foursquare.net> wrote:
> > On Fri, Sep 24, 2010 at 02:05:57AM -0400, Chris Frey wrote:
> >> So far I have been unable to make 2.6.32.21 corrupt its filesystem
> >> when running as the UML kernel (Host 2.6.35.4, UML 2.6.32.21)
> >
> > Spoke too soon... 2.6.32.21 has errors too, if pushed hard enough.
> >
> 
> can you do a git bisect?

I've been trying, but so far the range of buggy versions stretches back
to 2.6.32.x at least.  I'm trying to find a version that doesn't have
this issue.
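
A rough sketch of how the bisect could be automated once a good version turns up.  It relies on the `git bisect run` exit-code convention (0 means good, 125 means skip, other codes from 1 to 127 mean bad).  The helper names and `run_guest_workload` are hypothetical placeholders, and no known-good tag has been identified in this thread, so the starting points are assumptions.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; run_guest_workload is a
# placeholder for booting the UML guest and running the tar/rm test.

# classify_log LOG
# Return 0 (good) if the guest log has no ext3 error lines, 1 (bad) otherwise.
classify_log() {
    if grep -q 'EXT3-fs error' "$1"; then
        return 1
    fi
    return 0
}

# bisect_step
# One bisect iteration: build the UML kernel, run the workload, judge the log.
bisect_step() {
    # A revision that does not build cannot be judged: 125 tells git to skip it.
    make ARCH=um >/dev/null 2>&1 || return 125
    run_guest_workload > guest.log 2>&1   # hypothetical, must be supplied
    classify_log guest.log
}

# Example (commented out; needs a kernel tree, a .config and a known-good tag):
#   git bisect start
#   git bisect bad v2.6.35
#   git bisect good <known-good-tag>
#   git bisect run sh -c '. ./bisect_helpers.sh; bisect_step'
```

Since the corruption only shows up when the guest is pushed hard enough, a single clean run probably should not be trusted as "good"; repeating the workload several times per revision would be safer.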

I'm kinda surprised that nobody else has run into this.  Or maybe user
mode linux isn't used as much as I thought?   Or I'm just doing something
completely bone headed. :-)

- Chris



* Re: ext3 filesystem corruption in user mode linux
  2010-09-27 21:18       ` Chris Frey
@ 2010-09-27 21:25         ` richard -rw- weinberger
  2010-09-27 21:26           ` Chris Frey
  0 siblings, 1 reply; 11+ messages in thread
From: richard -rw- weinberger @ 2010-09-27 21:25 UTC (permalink / raw)
  To: Chris Frey; +Cc: linux-kernel

On Mon, Sep 27, 2010 at 11:18 PM, Chris Frey <cdfrey@foursquare.net> wrote:
> On Mon, Sep 27, 2010 at 11:04:07PM +0200, richard -rw- weinberger wrote:
>> On Fri, Sep 24, 2010 at 8:44 AM, Chris Frey <cdfrey@foursquare.net> wrote:
>> > On Fri, Sep 24, 2010 at 02:05:57AM -0400, Chris Frey wrote:
>> >> So far I have been unable to make 2.6.32.21 corrupt its filesystem
>> >> when running as the UML kernel (Host 2.6.35.4, UML 2.6.32.21)
>> >
>> > Spoke too soon... 2.6.32.21 has errors too, if pushed hard enough.
>> >
>>
>> can you do a git bisect?
>
> I've been trying, but so far the range of buggy versions stretches back
> to 2.6.32.x at least.  I'm trying to find a version that doesn't have
> this issue.
>
> I'm kinda surprised that nobody else has run into this.  Or maybe user
> mode linux isn't used as much as I thought?   Or I'm just doing something
> completely bone headed. :-)

Hmm, maybe this issue affects only your configuration.
When I have some spare time I'll try to trigger this on my setup.

-- 
Cheers,
//richard


* Re: ext3 filesystem corruption in user mode linux
  2010-09-27 21:25         ` richard -rw- weinberger
@ 2010-09-27 21:26           ` Chris Frey
  2010-09-27 21:59             ` Chris Frey
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Frey @ 2010-09-27 21:26 UTC (permalink / raw)
  To: richard -rw- weinberger; +Cc: linux-kernel

On Mon, Sep 27, 2010 at 11:25:11PM +0200, richard -rw- weinberger wrote:
> On Mon, Sep 27, 2010 at 11:18 PM, Chris Frey <cdfrey@foursquare.net> wrote:
> > On Mon, Sep 27, 2010 at 11:04:07PM +0200, richard -rw- weinberger wrote:
> >> On Fri, Sep 24, 2010 at 8:44 AM, Chris Frey <cdfrey@foursquare.net> wrote:
> >> > On Fri, Sep 24, 2010 at 02:05:57AM -0400, Chris Frey wrote:
> >> >> So far I have been unable to make 2.6.32.21 corrupt its filesystem
> >> >> when running as the UML kernel (Host 2.6.35.4, UML 2.6.32.21)
> >> >
> >> > Spoke too soon... 2.6.32.21 has errors too, if pushed hard enough.
> >> >
> >>
> >> can you do a git bisect?
> >
> > I've been trying, but so far the range of buggy versions stretches back
> > to 2.6.32.x at least.  I'm trying to find a version that doesn't have
> > this issue.
> >
> > I'm kinda surprised that nobody else has run into this.  Or maybe user
> > mode linux isn't used as much as I thought?   Or I'm just doing something
> > completely bone headed. :-)
> 
> Hmm, maybe this issue affects only your configuration.
> When I have some spare time I'll try to trigger this on my setup.

That would be great.  Let me know if you need any specific data from my end.

- Chris



* Re: ext3 filesystem corruption in user mode linux
  2010-09-27 21:26           ` Chris Frey
@ 2010-09-27 21:59             ` Chris Frey
       [not found]               ` <AANLkTinmCxYrjdYoDUE6+zg1i1QSJxt5ngN84FrjWJaj@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Frey @ 2010-09-27 21:59 UTC (permalink / raw)
  To: richard -rw- weinberger; +Cc: linux-kernel

On Mon, Sep 27, 2010 at 05:26:50PM -0400, Chris Frey wrote:
> > Hmm, maybe this issue affects only your configuration.
> > When I have some spare time I'll try to trigger this on my setup.
> 
> That would be great.  Let me know if you need any specific data from my end.

One data point I've found so far is that it is easier to trigger
in 2.6.35.x than 2.6.32.x.  It also doesn't seem to be dependent on the
version of the host kernel.  I can reproduce it running 2.6.35.4 or
2.6.32.22.

If you want a small gentoo starting point, I have a 56M compressed
ext3 gentoo guest filesystem I can share with those who wish to test.
I don't want to make it available to everyone, due to size.

- Chris



* Re: ext3 filesystem corruption in user mode linux
       [not found]                   ` <AANLkTi=iSSffp-48g0XAOYBSnodAsEh20j2EtHW=4Egt@mail.gmail.com>
@ 2010-09-28  0:48                     ` Chris Frey
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Frey @ 2010-09-28  0:48 UTC (permalink / raw)
  To: richard -rw- weinberger; +Cc: linux-kernel

On Tue, Sep 28, 2010 at 01:35:09AM +0200, richard -rw- weinberger wrote:
> On Tue, Sep 28, 2010 at 12:12 AM, Chris Frey <cdfrey@foursquare.net> wrote:
> > I'm using a mixture of the following to test.  The errors happen
> > during the 'rm'.
> >
> >        Direct copy:
> >                (cd dir && tar cjf - portage) | tar xjf - ; rm -rf portage
> >
> >        Hostfs copy:
> >                tar xjf /mnt/hostfs/portage-latest.tar.bz2 ; rm -rf portage
> >
> >        Network copy:
> >                ssh remote "cat portage-latest.tar.bz2" | tar xjf - ; rm -rf portage
> >
> > With these tests, I'm almost guessing that it might be some missed IRQs
> > or something in the guest, since files that are corrupt often contain
> > all zeros, which would match the sparse filesystem images I'm using.
> 
> Hmm, something really nasty is going on here.
> I can reproduce this issue using ext2, ext3 and reiserfs as UML root filesystem.
> It seems to be a block layer issue.
> Tomorrow I'll have a close look at the issue using my openSUSE setup.
> So far I've used your Gentoo image.

I've also seen the issue with an Ubuntu guest.  So far, I don't think
it matters what OS is in the guest.

Thanks for reproducing the error!
- Chris
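
The observation quoted above, that corrupt files often contain all zeros, could be checked with something like the following sketch.  The function names are made up for illustration, and scanning a large tree this way is slow since every file is read in full.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not from the thread.

# is_all_zero FILE
# Succeed if FILE is non-empty and every byte in it is NUL.
is_all_zero() {
    [ -s "$1" ] || return 1
    # tr deletes NUL bytes; if nothing remains, the file was all zeros.
    [ "$(tr -d '\000' < "$1" | wc -c)" -eq 0 ]
}

# list_zero_files DIR
# Print every non-empty regular file under DIR that contains only NUL bytes.
list_zero_files() {
    find "$1" -type f -size +0 | while IFS= read -r f; do
        if is_all_zero "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example (commented out): run inside the guest after unpacking, before the rm.
#   list_zero_files portage
```

Some files may legitimately be zero-filled, so any hits would still need to be compared against the original tarball.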


