fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"

linux-ext4.vger.kernel.org archive mirror

* fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"
From: Johannes Segitz @ 2011-10-19 16:02 UTC
  To: linux-ext4

Hello,

Yesterday I was forced to start a fsck of an ext4 filesystem (4 TB on
an encrypted raid5 array). After a while I got a lot
of these messages:
Inode 23565579 should not have EOFBLOCKS_FL set (size 0, lblk -1)

After some googling I found this thread:
http://kerneltrap.org/mailarchive/linux-ext4/2010/8/19/6885408/thread#mid-6885408

Since it's something that can be taken care of by using "-p", I started
it yesterday and was kind of surprised
to discover it still running today with no sign of stopping. I piped
the output to /dev/null since printing
the messages alone caused quite a bit of load, so I don't know which
inode fsck is currently at.

Is there a way to speed things up? If I understand the thread
correctly, those errors should correct themselves over time,
and I don't want to wait any longer. Can I do any harm by killing fsck
and starting it again without the pipe to see
which inode it is currently at?

Bye,
Johannes
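
One way to keep an eye on progress without paying for every message is to thin
the output rather than discard it. The following is a minimal Python sketch, not
part of e2fsprogs: `thin_output` and its `every` parameter are hypothetical
names, and the pattern assumes the message wording quoted above.

```python
import re

# Wording taken from the message quoted in this thread. search() is used
# instead of match() in case each line carries a prefix such as a device name.
EOFBLOCKS_RE = re.compile(r"Inode (\d+) should not have EOFBLOCKS_FL set")


def thin_output(lines, every=10000):
    """Pass other lines through; keep only every `every`-th EOFBLOCKS_FL line."""
    seen = 0
    for line in lines:
        if EOFBLOCKS_RE.search(line):
            seen += 1
            # Emit one sample per `every` matches so progress stays visible.
            if seen % every == 0:
                yield line
        else:
            yield line
```

A small wrapper could feed `sys.stdin` to `thin_output` at the end of a pipe from
fsck. The default of 10000 is an arbitrary choice, not a recommendation.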


* Re: fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"
From: Andreas Dilger @ 2011-10-19 16:22 UTC
  To: Johannes Segitz; +Cc: linux-ext4@vger.kernel.org

On 2011-10-19, at 10:02 AM, Johannes Segitz <johannes.segitz@gmail.com> wrote:

> Yesterday I was forced to start a fsck of an ext4 filesystem (4 TB on
> an encrypted raid5 array). After a while I got a lot
> of these messages:
> Inode 23565579 should not have EOFBLOCKS_FL set (size 0, lblk -1)
>
> After some googling I found this thread:
> http://kerneltrap.org/mailarchive/linux-ext4/2010/8/19/6885408/thread#mid-6885408
>
> Since it's something that can be taken care of by using "-p", I started
> it yesterday and was kind of surprised
> to discover it still running today with no sign of stopping. I piped
> the output to /dev/null since printing
> the messages alone caused quite a bit of load, so I don't know which
> inode fsck is currently at.
>
> Is there a way to speed things up? If I understand the thread
> correctly, those errors should correct themselves over time,
> and I don't want to wait any longer. Can I do any harm by killing fsck
> and starting it again without the pipe to see
> which inode it is currently at?

You could always strace e2fsck to see what it is printing. 

Cheers, Andreas


* Re: fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"
From: Ted Ts'o @ 2011-10-19 18:53 UTC
  To: Johannes Segitz; +Cc: linux-ext4

On Wed, Oct 19, 2011 at 06:02:12PM +0200, Johannes Segitz wrote:
> Hello,
> 
> Yesterday I was forced to start a fsck of an ext4 filesystem (4 TB on
> an encrypted raid5 array). After a while I got a lot
> of these messages:
> Inode 23565579 should not have EOFBLOCKS_FL set (size 0, lblk -1)
>
> After some googling I found this thread:
> http://kerneltrap.org/mailarchive/linux-ext4/2010/8/19/6885408/thread#mid-6885408

What kernel version are you using, and can you upgrade to one that has
this bug fixed?  This is a problem which was fixed over a year ago...

> Since it's something that can be taken care of by using "-p", I started
> it yesterday and was kind of surprised
> to discover it still running today with no sign of stopping. I piped
> the output to /dev/null since printing
> the messages alone caused quite a bit of load, so I don't know which
> inode fsck is currently at.
>
> Is there a way to speed things up? If I understand the thread
> correctly, those errors should correct themselves over time,
> and I don't want to wait any longer. Can I do any harm by killing fsck
> and starting it again without the pipe to see
> which inode it is currently at?

What version of e2fsprogs are you using?  Given that you're using an
old version of the kernel, there's a good chance you're also using an old
version of e2fsprogs.  Are you willing to upgrade to a newer kernel
and e2fsprogs?  If so, the procedure documented in the
following commit, which is included in e2fsprogs 1.41.13 or newer,
should help you out (see below).

						- Ted

commit 75990388365c5688dbade9c33a3394e40f757526
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Dec 6 10:10:33 2010 -0500

    e2fsck: Add the ability to force a problem to not be fixed
    
    The boolean option "force_no" in the problems stanza of e2fsck.conf
    allows a particular problem code to be treated as if the user will answer
    "no" to the question of whether a particular problem should be fixed,
    even if e2fsck is run with the -y option.
    
    As an example use case, suppose a distribution had widely deployed a
    version of the kernel where under some circumstances, the EOFBLOCKS_FL
    flag would be left set even though it should not be, and a
    customer had a workload which exercised the fencepost error all the
    time, resulting in a large number of inodes that had EOFBLOCKS_FL
    set erroneously.  Enough, in fact, that the e2fsck runs were taking too
    long.  (There was such a bug in the kernel, which was fixed by commit
    58590b06d in 2.6.36).
    
    Leaving EOFBLOCKS_FL set when it should not be isn't a huge deal, and
    is certainly better than having high availability timeout alerts going off
    left and right.  So in this case, the best fix might be to put the
    following in /etc/e2fsck.conf:
    
    [problems]
    0x010060 = {                        # PR_1_EOFBLOCKS_FL_SET
         force_no = true
         no_ok = true
         no_nomsg = true
    }
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
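
The stanza in the commit message can also be generated rather than typed by
hand. A minimal Python sketch, assuming nothing beyond what the commit message
shows: `problems_stanza` and `write_conf` are hypothetical helpers, and
`write_conf` refuses to touch an existing file instead of trying to merge
into it.

```python
def problems_stanza(code="0x010060", name="PR_1_EOFBLOCKS_FL_SET"):
    """Return the [problems] stanza from the commit message as text."""
    # The three relations are the ones listed in the commit message.
    settings = ("force_no", "no_ok", "no_nomsg")
    body = "".join(f"     {key} = true\n" for key in settings)
    return f"[problems]\n{code} = {{    # {name}\n{body}}}\n"


def write_conf(path):
    """Write the stanza to `path`; fail if the file already exists."""
    # Mode "x" raises FileExistsError instead of overwriting an existing config.
    with open(path, "x") as conf:
        conf.write(problems_stanza())
```

If /etc/e2fsck.conf already exists, the stanza would have to be merged by hand;
this sketch deliberately does not attempt that.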



* Re: fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"
From: Johannes Segitz @ 2011-10-20  7:48 UTC
  To: linux-ext4@vger.kernel.org

On Wed, Oct 19, 2011 at 18:22, Andreas Dilger <aedilger@gmail.com> wrote:
> You could always strace e2fsck to see what it is printing.

I tried that, but I can't see which inode is currently being processed:

<snip fcntl lines>
fcntl(5, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=556, len=1}) = 0
fcntl(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=284, len=1}) = 0
fcntl(5, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=164, len=1}) = 0
fstat(5, {st_mode=S_IFREG|0600, st_size=376758272, ...}) = 0
munmap(0x7fe2aee4e000, 376758272)       = 0
ftruncate(5, 376762368)                 = 0
pwrite(5, "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"..., 1024, 376758272) = 1024
pwrite(5, "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"..., 1024, 376759296) = 1024
pwrite(5, "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"..., 1024, 376760320) = 1024
pwrite(5, "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"..., 1024, 376761344) = 1024
mmap(NULL, 376762368, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0) = 0x7fe2aee4d000

rinse repeat
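
The trace above shows only fcntl/pwrite/mmap traffic on fd 5, so the messages
themselves would have to be picked out of the write() calls. A rough Python
sketch of such a filter follows. It assumes e2fsck's messages reach fd 1 or 2
through write() and that strace prints calls in its usual
`write(fd, "text"..., n) = n` form; `last_inode_from_strace` is a hypothetical
helper name.

```python
import re

# \b keeps "pwrite(" from matching; only fd 1 (stdout) and 2 (stderr) count.
WRITE_RE = re.compile(r'\bwrite\((?P<fd>[12]), "(?P<text>(?:[^"\\]|\\.)*)"')
INODE_RE = re.compile(r"Inode (\d+)")


def last_inode_from_strace(lines):
    """Return the last inode number seen in write() calls, or None."""
    last = None
    for line in lines:
        call = WRITE_RE.search(line)
        if call is None:
            continue  # fcntl, pwrite, mmap and friends are ignored
        found = INODE_RE.search(call.group("text"))
        if found:
            last = int(found.group(1))
    return last
```

strace truncates strings to 32 characters by default, which is still enough to
show the inode number at the start of this message; `-s` raises that limit, and
`-e trace=write` restricts the trace to write() calls.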

Bye,
Johannes


* Re: fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"
From: Johannes Segitz @ 2011-10-20 10:58 UTC
  To: linux-ext4

On Wed, Oct 19, 2011 at 20:53, Ted Ts'o <tytso@mit.edu> wrote:
> On Wed, Oct 19, 2011 at 06:02:12PM +0200, Johannes Segitz wrote:
> What kernel version are you using, and can you upgrade to one that has
> this bug fixed?  This is a problem which was fixed over a year ago...

2.6.38-11-generic #50-Ubuntu SMP

I was running 3.0.4 until a few days ago.

I haven't fsck'ed the filesystem for quite a while, and the files on this
volume don't get rewritten, so the flag doesn't fix itself. I think it's
just something that was caused some time ago and still persists.

> What version of e2fsprogs are you using?

1.41.14-1ubuntu3 which seems to be the newest version
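
The two thresholds mentioned in this thread (the kernel fix in 2.6.36 and
force_no support in e2fsprogs 1.41.13) can be checked against the version
strings reported here. A rough Python sketch: `version_tuple` and `at_least`
are hypothetical helpers, and comparing only the upstream number ignores any
distribution backports.

```python
import re


def version_tuple(text):
    """Return the leading dotted numeric part of a version string as ints."""
    found = re.match(r"\d+(?:\.\d+)*", text)
    if found is None:
        raise ValueError(f"no leading version number in {text!r}")
    return tuple(int(part) for part in found.group(0).split("."))


def at_least(text, minimum):
    """True if the upstream version in `text` is >= the `minimum` tuple."""
    return version_tuple(text) >= minimum


# Version strings as reported in this thread.
kernel_has_fix = at_least("2.6.38-11-generic", (2, 6, 36))
e2fsprogs_has_force_no = at_least("1.41.14-1ubuntu3", (1, 41, 13))
```

Both checks pass for the versions above, which fits the explanation that the
flags were set by an older kernel and simply never cleared.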

>    As an example use case, suppose a distribution had widely deployed a
>    version of the kernel where under some circumstances, the EOFBLOCKS_FL
>    flag would be left set even though it should not be left set, and a
>    customer had a workload which exercised the fencepost error all the
>    time, resulting in a large number of inodes that had EOFBLOCKS_FL
>    set erroneously.

yeah "suppose" ;)

>    Leaving EOFBLOCKS_FL set when it should not be isn't a huge deal, and
>    is certainly better than having high availability timeout alerts going off
>    left and right.  So in this case, the best fix might be to put the
>    following in /etc/e2fsck.conf:
>
>    [problems]
>    0x010060 = {                        # PR_1_EOFBLOCKS_FL_SET
>         force_no = true
>         no_ok = true
>         no_nomsg = true
>    }

That was pretty much what I was looking for, thank you. I'll kill fsck
tonight if it's still
running and run it again with those settings.

Thank you for your help
Johannes


* Re: fsck.ext4 taking a very long time because of "should not have EOFBLOCKS_FL set"
From: Andreas Dilger @ 2011-10-20 18:59 UTC
  To: Johannes Segitz; +Cc: linux-ext4

On 2011-10-20, at 4:58 AM, Johannes Segitz wrote:
> On Wed, Oct 19, 2011 at 20:53, Ted Ts'o <tytso@mit.edu> wrote:
>>    Leaving EOFBLOCKS_FL set when it should not be isn't a huge deal, and
>>    is certainly better than having high availability timeout alerts going off
>>    left and right.  So in this case, the best fix might be to put the
>>    following in /etc/e2fsck.conf:
>> 
>>    [problems]
>>    0x010060 = {                        # PR_1_EOFBLOCKS_FL_SET
>>         force_no = true
>>         no_ok = true
>>         no_nomsg = true
>>    }
> 
> That was pretty much what I was looking for, thank you. I'll kill fsck
> tonight if it's still running and run it again with those settings.

At least it should have fixed the inodes that it has already scanned
on disk, so the next time you get a chance to run it without the above
[problems] option, it should be able to continue from where it left off.

Cheers, Andreas





