All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Janos Haar" <djani22@netcenter.hu>
To: "David Chinner" <dgc@sgi.com>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: xfs|loop|raid: attempt to access beyond end of device
Date: Wed, 26 Dec 2007 03:19:19 +0100	[thread overview]
Message-ID: <076901c84765$bbc3cee0$9900a8c0@dcccs> (raw)
In-Reply-To: 20071225091347.GA155407@sgi.com

Hello list,


----- Original Message ----- 
From: "David Chinner" <dgc@sgi.com>
To: "Janos Haar" <djani22@netcenter.hu>
Cc: <linux-kernel@vger.kernel.org>
Sent: Tuesday, December 25, 2007 10:13 AM
Subject: Re: xfs|loop|raid: attempt to access beyond end of device


> On Sun, Dec 23, 2007 at 08:21:08PM +0100, Janos Haar wrote:
> > Hello, list,
> >
> > I have a little problem on one of my productive system.
> >
> > The system sometimes crashed, like this:
> >
> > Dec 23 08:53:05 Albohacen-global kernel: attempt to access beyond end of
> > device
> > Dec 23 08:53:05 Albohacen-global kernel: loop0: rw=1,
want=50552830649176,
> > limit=3085523200
> > Dec 23 08:53:05 Albohacen-global kernel: Buffer I/O error on device
loop0,
> > logical block 6319103831146
> > Dec 23 08:53:05 Albohacen-global kernel: lost page write due to I/O
error on
> > loop0
>
> So a long way beyond the end of the device.
>
> [snip soft lockup warnings]
>
> > Dec 23 09:08:19 Albohacen-global kernel: Filesystem "loop0": Access to
block
> > zero in inode 397821447 start_block: 0 start_off: 0 blkcnt: 0
extent-state:
> > 0 lastx: e4
>
> And that's to block zero of the filesystem. Sure signs of a corupted inode
> extent btree. We've seen a few of these corruptions on loopback device
> reported recently.
>
> You'll need to unmount and repair the filesystem to make this go away,
> but it's hard to know what is causing the btree corruption.

Yes, if i do that, the fs clean again, and works well until we moves big
amount of data on the storage.
But the problem comes out again, and again.

If i can, i will try the latest stable kernel, with some loop fixes, but it
is not too easy.
This is a productive system.


>
> > Dec 23 09:08:22 Albohacen-global last message repeated 19 times
> >
> > some more info:
> >
> > [root@Albohacen-global ~]# uname -a
> > Linux Albohacen-global 2.6.21.1 #3 SMP Thu May 3 04:33:36 CEST 2007
x86_64
> > x86_64 x86_64 GNU/Linux
> > [root@Albohacen-global ~]# cat /proc/mdstat
> > Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
[raid4]
> > [multipath] [faulty]
> > md1 : active raid4 sdf2[1] sde2[0] sdd2[5] sdc2[4] sdb2[3] sda2[2]
> >       19558720 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
> >       bitmap: 8/239 pages [32KB], 8KB chunk
> >
> > md2 : active raid4 sdf3[1] sde3[0] sdd3[5] sdc3[4] sdb3[3] sda3[2]
> >       1542761600 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
> >       bitmap: 0/148 pages [0KB], 1024KB chunk
> >
> > md0 : active raid1 sdb1[1] sda1[0]
> >       104320 blocks [2/2] [UU]
> >
> > unused devices: <none>
> > [root@Albohacen-global ~]# losetup /dev/loop0
> > /dev/loop0: [0010]:6598 (/dev/md2), encryption blowfish (type 18)
>
> You're using an encrypted block device?

Yes.

> What mechanism are you using for
> encryption (doesn't appear to be dmcrypt)?

No, this is the "old way", not the dmcrypt.
This is cryptoloop.

> Does it handle readahead bio
> cancellation correctly?

I dont know exactly, but in my log it is always a write failures! ( i think
from rw=1 and lost page write messages)
It is the readahead important in this case?

> We had similar XFS corruption problems on dmcrypt
> between 2.6.14 and ~2.6.20 due to a bug in dmcrypt's failure to handle
> aborted readahead I/O correctly....

Ok, what is the next step? :-)
(i will try the latest kernel, if i can....)
Can i switch to dmcrypt from cryptoloop without rebuild the fs?

Thanks a lot,
Janos Haar

>
> Cheers,
>
> Dave.
> -- 
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group


      reply	other threads:[~2007-12-26  2:21 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-23 19:21 xfs|loop|raid: attempt to access beyond end of device Janos Haar
2007-12-25  9:13 ` David Chinner
2007-12-26  2:19   ` Janos Haar [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='076901c84765$bbc3cee0$9900a8c0@dcccs' \
    --to=djani22@netcenter.hu \
    --cc=dgc@sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.