Re: xfs|loop|raid: attempt to access beyond end of device

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: "Janos Haar" <djani22@netcenter.hu>
To: "David Chinner" <dgc@sgi.com>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: xfs|loop|raid: attempt to access beyond end of device
Date: Wed, 26 Dec 2007 03:19:19 +0100	[thread overview]
Message-ID: <076901c84765$bbc3cee0$9900a8c0@dcccs> (raw)
In-Reply-To: 20071225091347.GA155407@sgi.com

Hello list,


----- Original Message ----- 
From: "David Chinner" <dgc@sgi.com>
To: "Janos Haar" <djani22@netcenter.hu>
Cc: <linux-kernel@vger.kernel.org>
Sent: Tuesday, December 25, 2007 10:13 AM
Subject: Re: xfs|loop|raid: attempt to access beyond end of device


> On Sun, Dec 23, 2007 at 08:21:08PM +0100, Janos Haar wrote:
> > Hello, list,
> >
> > I have a little problem on one of my productive system.
> >
> > The system sometimes crashed, like this:
> >
> > Dec 23 08:53:05 Albohacen-global kernel: attempt to access beyond end of
> > device
> > Dec 23 08:53:05 Albohacen-global kernel: loop0: rw=1,
want=50552830649176,
> > limit=3085523200
> > Dec 23 08:53:05 Albohacen-global kernel: Buffer I/O error on device
loop0,
> > logical block 6319103831146
> > Dec 23 08:53:05 Albohacen-global kernel: lost page write due to I/O
error on
> > loop0
>
> So a long way beyond the end of the device.
>
> [snip soft lockup warnings]
>
> > Dec 23 09:08:19 Albohacen-global kernel: Filesystem "loop0": Access to
block
> > zero in inode 397821447 start_block: 0 start_off: 0 blkcnt: 0
extent-state:
> > 0 lastx: e4
>
> And that's to block zero of the filesystem. Sure signs of a corupted inode
> extent btree. We've seen a few of these corruptions on loopback device
> reported recently.
>
> You'll need to unmount and repair the filesystem to make this go away,
> but it's hard to know what is causing the btree corruption.

Yes, if i do that, the fs clean again, and works well until we moves big
amount of data on the storage.
But the problem comes out again, and again.

If i can, i will try the latest stable kernel, with some loop fixes, but it
is not too easy.
This is a productive system.


>
> > Dec 23 09:08:22 Albohacen-global last message repeated 19 times
> >
> > some more info:
> >
> > [root@Albohacen-global ~]# uname -a
> > Linux Albohacen-global 2.6.21.1 #3 SMP Thu May 3 04:33:36 CEST 2007
x86_64
> > x86_64 x86_64 GNU/Linux
> > [root@Albohacen-global ~]# cat /proc/mdstat
> > Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
[raid4]
> > [multipath] [faulty]
> > md1 : active raid4 sdf2[1] sde2[0] sdd2[5] sdc2[4] sdb2[3] sda2[2]
> >       19558720 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
> >       bitmap: 8/239 pages [32KB], 8KB chunk
> >
> > md2 : active raid4 sdf3[1] sde3[0] sdd3[5] sdc3[4] sdb3[3] sda3[2]
> >       1542761600 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
> >       bitmap: 0/148 pages [0KB], 1024KB chunk
> >
> > md0 : active raid1 sdb1[1] sda1[0]
> >       104320 blocks [2/2] [UU]
> >
> > unused devices: <none>
> > [root@Albohacen-global ~]# losetup /dev/loop0
> > /dev/loop0: [0010]:6598 (/dev/md2), encryption blowfish (type 18)
>
> You're using an encrypted block device?

Yes.

> What mechanism are you using for
> encryption (doesn't appear to be dmcrypt)?

No, this is the "old way", not the dmcrypt.
This is cryptoloop.

> Does it handle readahead bio
> cancellation correctly?

I dont know exactly, but in my log it is always a write failures! ( i think
from rw=1 and lost page write messages)
It is the readahead important in this case?

> We had similar XFS corruption problems on dmcrypt
> between 2.6.14 and ~2.6.20 due to a bug in dmcrypt's failure to handle
> aborted readahead I/O correctly....

Ok, what is the next step? :-)
(i will try the latest kernel, if i can....)
Can i switch to dmcrypt from cryptoloop without rebuild the fs?

Thanks a lot,
Janos Haar

>
> Cheers,
>
> Dave.
> -- 
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group

     prev parent reply	other threads:[~2007-12-26  2:21 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-23 19:21 xfs|loop|raid: attempt to access beyond end of device Janos Haar
2007-12-25  9:13 ` David Chinner
2007-12-26  2:19   ` Janos Haar [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='076901c84765$bbc3cee0$9900a8c0@dcccs' \
    --to=djani22@netcenter.hu \
    --cc=dgc@sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox