linux-ext4.vger.kernel.org archive mirror
From: "Lukáš Czerner" <lczerner@redhat.com>
To: Terry <td3201@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>, linux-ext4@vger.kernel.org
Subject: Re: ext4 won't mount - fsck required - 2nd fsck in less than a week
Date: Tue, 11 Sep 2012 13:59:47 -0400 (EDT)
Message-ID: <alpine.LFD.2.00.1209111357050.2222@new-host-2>
In-Reply-To: <CAHSRzpC8OaMqhYtVuQdet8niRfnvCVwMHi9a6u8fmHdBsy1y2w@mail.gmail.com>

On Tue, 11 Sep 2012, Terry wrote:

> Date: Tue, 11 Sep 2012 11:22:27 -0500
> From: Terry <td3201@gmail.com>
> To: Theodore Ts'o <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org
> Subject: Re: ext4 won't mount - fsck required - 2nd fsck in less than a week
> 
> On Mon, Sep 10, 2012 at 8:56 AM, Terry <td3201@gmail.com> wrote:
> > On Mon, Sep 10, 2012 at 8:48 AM, Terry <td3201@gmail.com> wrote:
> >> On Sun, Sep 9, 2012 at 10:18 PM, Terry <td3201@gmail.com> wrote:
> >>> On Sun, Sep 9, 2012 at 9:53 PM, Terry <td3201@gmail.com> wrote:
> >>>> On Sun, Sep 9, 2012 at 9:47 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> >>>>> On Sun, Sep 09, 2012 at 09:34:10PM -0500, Terry wrote:
> >>>>>>
> >>>>>> As the subject says, we have a 15 TB ext4 drive that won't mount with
> >>>>>> these errors:
> >>>>>>
> >>>>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): ext4_check_descriptors:
> >>>>>> Inode bitmap for group 3200 not in group (block 4161027887)!
> >>>>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): group descriptors corrupted!
> >>>>>
> >>>>> These indicate a very basic file system corruption where the block
> >>>>> group descriptors are corrupted.  E2fsck will complain immediately
> >>>>> upon seeing this sort of fs inconsistency, and the first thing it will
> >>>>> try to do is fix it.
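
For readers following along, here is a minimal sketch of the kind of range
check that produces the "not in group" message. It is modeled loosely on my
reading of the kernel's ext4_check_descriptors and is not a copy of it. The
function name, the 32768 blocks-per-group figure (typical for 4 KiB blocks),
and the total block count (15 TiB divided by 4 KiB) are assumptions; the
thread does not state the real geometry.

```python
def bitmap_block_in_range(group, block, blocks_per_group, total_blocks,
                          first_data_block=0, flex_bg=False):
    """Return True if a bitmap block lies where the mount-time check allows.

    Without flex_bg the block must fall inside its own group; with flex_bg
    (or for the last group) the upper bound is the end of the file system.
    """
    # Ceiling division: number of block groups in the file system.
    group_count = -(-(total_blocks - first_data_block) // blocks_per_group)
    if flex_bg:
        first = first_data_block
    else:
        first = first_data_block + group * blocks_per_group
    if flex_bg or group == group_count - 1:
        last = total_blocks - 1
    else:
        last = first + blocks_per_group - 1
    return first <= block <= last


# Hypothetical geometry, NOT taken from the thread.
ASSUMED_BLOCKS_PER_GROUP = 32768
ASSUMED_TOTAL_BLOCKS = 15 * 2**40 // 4096

for flex in (False, True):
    ok = bitmap_block_in_range(3200, 4161027887, ASSUMED_BLOCKS_PER_GROUP,
                               ASSUMED_TOTAL_BLOCKS, flex_bg=flex)
    print(f"flex_bg={flex}: inode bitmap block in range? {ok}")
```

Under these assumed numbers the reported block fails the check with or
without flex_bg, which fits a descriptor holding garbage rather than a
slightly misplaced bitmap.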
> >>>>>
> >>>>>> We did a proactive fsck on Tuesday of last week because it was
> >>>>>> starting to give filesystem errors. It ran through and mounted fine.
> >>>>>>
> >>>>>> The filesystem lives on an equallogic SAN spread across 36 drives.
> >>>>>> Could this be something with the physical layer or is it not abnormal
> >>>>>> to have to run multiple rounds of fsck to fully fix an issue?
> >>>>>
> >>>>> This is most probably a hardware problem; normally e2fsck will fix
> >>>>> file system corruptions (and certainly problems such as corrupt block
> >>>>> group descriptors) in a single pass.  If e2fsck finished and the file
> >>>>> system mounted fine last week, and now you're getting this kind of
> >>>>> error, it basically screams some kind of physical layer problem, or
> >>>>> perhaps a bad hard drive, or perhaps the SAN disk is getting
> >>>>> incorrectly written to by some other system, etc.
> >>>>>
> >>>>>                                      - Ted
> >>>>
> >>>> Thanks for the reply.  It is part of a RHEL cluster but we did not
> >>>> have any situations where multiple systems mounted the filesystem.  It
> >>>> is an old SAN so perhaps we have a physical issue. We'll see what
> >>>> happens with this pass.
> >>>
> >>> While I am waiting for fsck to finish, another thought. This
> >>> filesystem contains a lot of small files. 35,867,642 files to be
> >>> exact.  Anything else I should check or know to ensure a smooth
> >>> operation for these types of filesystems?  I formatted them with
> >>> standard RHEL 6 options.
> >>
> >> FSCK completed fixing a lot of things.  The file system then mounted
> >> without any errors.  We are still getting these types of errors in
> >> /var/log/messages:
> >>
> >> Sep 10 08:40:49 narf kernel: EXT4-fs error (device dm-6):
> >> ext4_dx_find_entry: bad entry in directory #743966900: directory entry
> >> across blocks - block=2975876794offset=0(946176), inode=1414751737,
> >> rec_len=45724, name_len=206
> >>
> >> Thoughts?
> >
> > Hold that thought.  This is another filesystem.  Let me fix that one
> > then come back to this problem if it still exists.
> 
> Ok, fixed the other filesystem (dm-6) yesterday.  Today, getting these
> errors still on it:
> Sep 11 11:17:47 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 90851: 0 blocks in bitmap, 5048
> in gd
> Sep 11 11:18:17 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 90670: 0 blocks in bitmap, 6665
> in gd
> Sep 11 11:19:31 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 37589: 420 blocks in bitmap,
> 8302 in gd
> Sep 11 11:19:31 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 71777: 7071 blocks in bitmap,
> 23711 in gd
> Sep 11 11:19:31 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 71778: 10664 blocks in bitmap,
> 26624 in gd
> Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 13499: 9884 blocks in bitmap,
> 1256 in gd
> Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 13498: 383 blocks in bitmap,
> 384 in gd
> Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 13496: 2356 blocks in bitmap,
> 10453 in gd
> Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 13497: 3593 blocks in bitmap,
> 5641 in gd
> Sep 11 11:19:50 omadvnfs01a kernel: EXT4-fs error (device dm-6):
> ext4_mb_generate_buddy: EXT4-fs: group 49528: 25850 blocks in bitmap,
> 29946 in gd
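
A small sketch for summarizing messages like the ones quoted above. As I
understand it, each ext4_mb_generate_buddy line compares the free-block
count derived from the on-disk bitmap with the count stored in the group
descriptor ("gd"). The helper name and the sample text are illustrative;
the regular expression assumes the message format shown in this thread.

```python
import re

# Matches the allocator message once wrapped lines and "> " quote markers
# have been collapsed into single spaces.
PATTERN = re.compile(r"group (\d+): (\d+) blocks in bitmap,\s*(\d+) in gd")


def summarize(log_text):
    """Return (group, bitmap_count, gd_count, gd_count - bitmap_count) tuples."""
    # Strip mail quoting and rejoin lines that the mail client wrapped.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    rows = []
    for match in PATTERN.finditer(flat):
        group, bitmap, gd = (int(x) for x in match.groups())
        rows.append((group, bitmap, gd, gd - bitmap))
    return rows


SAMPLE = """\
> ext4_mb_generate_buddy: EXT4-fs: group 90851: 0 blocks in bitmap, 5048
> in gd
> ext4_mb_generate_buddy: EXT4-fs: group 13499: 9884 blocks in bitmap,
> 1256 in gd
"""

for group, bitmap, gd, delta in summarize(SAMPLE):
    print(f"group {group}: bitmap={bitmap} gd={gd} delta={delta:+d}")
```

Sorting the rows by group number makes clusters such as 13496 to 13499 or
71777 and 71778 easier to spot, which may help when comparing against the
storage layout.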

Hi, what RHEL version are you using, or even better, what kernel
version are you using? If you have a RHEL subscription, you should
definitely contact Red Hat about the issue.

Thanks!
-Lukas
