From: Zlatko Calusic <zcalusic@bitsync.net>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: e2fsck not fixing deleted inode referenced errors?
Date: Tue, 30 Sep 2014 20:43:04 +0200
Message-ID: <542AF9B8.2090800@bitsync.net>
In-Reply-To: <20140930183012.GA9942@birch.djwong.org>
On 30.09.2014 20:30, Darrick J. Wong wrote:
> On Tue, Sep 30, 2014 at 07:56:36PM +0200, Zlatko Calusic wrote:
>> Hope this is the right list to ask this question.
>>
>> I have an ext4 filesystem that has a few errors like this:
>>
>> Sep 30 19:14:09 atlas kernel: EXT4-fs error (device md2):
>> ext4_lookup:1448: inode #7913865: comm find: deleted inode
>> referenced: 7912058
>> Sep 30 19:14:09 atlas kernel: EXT4-fs error (device md2):
>> ext4_lookup:1448: inode #7913865: comm find: deleted inode
>> referenced: 7912055
>>
>> Yet, when I run e2fsck -fy on it, I have a clean run, no errors are
>> found and/or fixed. Is this the expected behaviour? What am I
>> supposed to do to get rid of errors like the above?
>
> [I should hope not.]
>
>> The filesystem is on a md mirror device, the kernel is 3.17.0-rc7,
>> e2progs 1.42.12-1 (Debian sid). Could md device somehow interfere? I
>> ran md check yesterday, but there were no errors.
>>
>> BTW, this all started when I got ata2.00: failed command: FLUSH
>> CACHE EXT error yesterday morning. I did several runs of e2fsck
>> before the filesystem came up clean, yet errors like the above are
>> popping constantly.
>
> Normally that kernel message only happens if a dir refers to an inode with
> link_count and mode set to 0.
>
> Is the disk attached to ata2.00 one of the RAID1 mirrors? What was the full
> error message, and does smartctl -a report anything?
Yes, it is part of the mirror:
ata2.00: ATA-8: WDC WD1002FBYS-02A6B0, 03.00C06, max UDMA/133
ata2.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
ata2.00: configured for UDMA/133
md2 : active raid1 sdb2[0] sda2[1]
      976229760 blocks [2/2] [UU]
      bitmap: 0/8 pages [0KB], 65536KB chunk
Full error message from the kernel log, together with the data check I
ran in the evening:
Sep 29 05:07:51 atlas kernel: ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Sep 29 05:07:51 atlas kernel: ata2.00: irq_stat 0x00400040, connection status changed
Sep 29 05:07:51 atlas kernel: ata2: SError: { PHYRdyChg DevExch }
Sep 29 05:07:51 atlas kernel: ata2.00: failed command: FLUSH CACHE EXT
Sep 29 05:07:51 atlas kernel: ata2.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0\x0a res 40/00:f4:e2:7f:14/00:00:3a:00:00/40 Emask 0x10 (ATA bus error)
Sep 29 05:07:51 atlas kernel: ata2.00: status: { DRDY }
Sep 29 05:07:51 atlas kernel: ata2: hard resetting link
Sep 29 05:07:57 atlas kernel: ata2: link is slow to respond, please be patient (ready=0)
Sep 29 05:08:00 atlas kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 29 05:08:00 atlas kernel: ata2.00: configured for UDMA/133
Sep 29 05:08:00 atlas kernel: ata2.00: retrying FLUSH 0xea Emask 0x10
Sep 29 05:08:00 atlas kernel: ata2: EH complete
Sep 29 05:37:36 atlas kernel: EXT4-fs error (device md2): ext4_mb_generate_buddy:757: group 1783, block bitmap and bg descriptor inconsistent: 8218 vs 9292 free clusters
Sep 29 05:37:36 atlas kernel: JBD2: Spotted dirty metadata buffer (dev = md2, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Sep 29 16:03:43 atlas kernel: EXT4-fs error (device md2): ext4_mb_generate_buddy:757: group 995, block bitmap and bg descriptor inconsistent: 15932 vs 15939 free clusters
Sep 29 16:03:43 atlas kernel: EXT4-fs error (device md2): ext4_mb_generate_buddy:757: group 1732, block bitmap and bg descriptor inconsistent: 5055 vs 5705 free clusters
Sep 29 19:24:01 atlas kernel: md: data-check of RAID array md2
Sep 29 19:24:01 atlas kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Sep 29 19:24:01 atlas kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Sep 29 19:24:01 atlas kernel: md: using 128k window, over a total of 976229760k.
Sep 29 22:37:53 atlas kernel: md: md2: data-check done.
Later on I did several (at least 3) e2fsck runs until the filesystem
was finally clean of errors, only to stumble upon new errors today that
e2fsck can no longer fix. :(
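
To see which inodes are affected, the "deleted inode referenced" messages
quoted at the top of this mail can be reduced to a list of inode numbers
that can then be passed to debugfs. This is only a minimal sketch: the
function name deleted_inode_refs is hypothetical, and it assumes the
messages look exactly like the two samples above, including the line
wrapping added by the mail client.

```python
import re

# Captures the directory inode and the referenced (deleted) inode.
# \s+ and DOTALL let the pattern span the line breaks added by mail wrapping.
PATTERN = re.compile(
    r"ext4_lookup:\d+:\s+inode\s+#(\d+):.*?deleted\s+inode\s+referenced:\s+(\d+)",
    re.DOTALL,
)


def deleted_inode_refs(log_text):
    """Return sorted, unique (directory_inode, referenced_inode) pairs."""
    return sorted({(int(d), int(i)) for d, i in PATTERN.findall(log_text)})


# The two messages from this thread, wrapped as they appear in the mail.
sample = """\
Sep 30 19:14:09 atlas kernel: EXT4-fs error (device md2):
ext4_lookup:1448: inode #7913865: comm find: deleted inode
referenced: 7912058
Sep 30 19:14:09 atlas kernel: EXT4-fs error (device md2):
ext4_lookup:1448: inode #7913865: comm find: deleted inode
referenced: 7912055
"""

for directory, inode in deleted_inode_refs(sample):
    print(f"dir inode {directory} -> deleted inode {inode}")
```

Both sample messages point at the same directory inode, #7913865.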
>
> It would be interesting to see what "debugfs -R 'stat <7912058>' /dev/md2"
> returns.
Inode: 7912058 Type: regular Mode: 0644 Flags: 0x80000
Generation: 252726504 Version: 0x00000000:00000001
User: 0 Group: 0 Size: 0
File ACL: 0 Directory ACL: 0
Links: 0 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5428ccf9:667449f0 -- Mon Sep 29 05:07:37 2014
atime: 0x5428ccf9:65fa3740 -- Mon Sep 29 05:07:37 2014
mtime: 0x5428ccf9:667449f0 -- Mon Sep 29 05:07:37 2014
crtime: 0x53451666:d35246b0 -- Wed Apr 9 11:44:06 2014
dtime: 0x5428ccf9 -- Mon Sep 29 05:07:37 2014
Size of extra inode fields: 28
EXTENTS:
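
The stat output above shows Links: 0 and a dtime of 05:07:37 on Sep 29,
shortly before the ATA error was logged at 05:07:51. Checking the same
fields for the other affected inodes by hand gets tedious, so here is a
rough sketch that pulls them out of the debugfs text. The helper names
parse_debugfs_stat and looks_deleted are hypothetical, the field layout is
assumed to match the output shown here, and "zero links plus non-zero
dtime" is just a heuristic, not the kernel's actual check.

```python
import re

# Fields of interest in `debugfs -R 'stat <N>'` output (layout assumed
# to match the output pasted in this mail).
FIELD_PATTERNS = {
    "inode": r"Inode:\s+(\d+)",
    "mode": r"Mode:\s+([0-7]+)",
    "links": r"Links:\s+(\d+)",
    "dtime": r"dtime:\s+0x([0-9a-fA-F]+)",
}


def parse_debugfs_stat(text):
    """Return the matched fields as strings; missing fields are omitted."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1)
    return fields


def looks_deleted(fields):
    """Heuristic only: zero link count and a non-zero deletion time."""
    links = int(fields.get("links", "1"))
    dtime = int(fields.get("dtime", "0"), 16)
    return links == 0 and dtime != 0


# Excerpt of the output for inode 7912058 from this thread.
sample = """\
Inode: 7912058 Type: regular Mode: 0644 Flags: 0x80000
Links: 0 Blockcount: 0
dtime: 0x5428ccf9 -- Mon Sep 29 05:07:37 2014
"""

fields = parse_debugfs_stat(sample)
print(fields["inode"], "looks deleted:", looks_deleted(fields))
```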
At the moment there seem to be 7 such files. Here's what it looks like:
{atlas} [/ext/backup/atlas/usr/lib/x86_64-linux-gnu/imlib2/filters]# ls -la
ls: cannot access colormod.so: Input/output error
ls: cannot access bumpmap.so: Input/output error
ls: cannot access bumpmap.la: Input/output error
ls: cannot access testfilter.la: Input/output error
ls: cannot access testfilter.so: Input/output error
ls: cannot access colormod.la: Input/output error
total 8
drwxr-xr-x 2 root root 4096 Sep 28 11:10 .
drwxr-xr-x 4 root root 4096 Sep 14 2013 ..
-????????? ? ? ? ? ? bumpmap.la
-????????? ? ? ? ? ? bumpmap.so
-????????? ? ? ? ? ? colormod.la
-????????? ? ? ? ? ? colormod.so
-????????? ? ? ? ? ? testfilter.la
-????????? ? ? ? ? ? testfilter.so
{atlas} [/ext/backup/atlas/usr/lib/x86_64-linux-gnu/imlib2/filters]# cd
{atlas} [~]# umount /ext
{atlas} [~]# time e2fsck -fy /dev/md2
e2fsck 1.42.12 (29-Aug-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md2: 3863428/61022208 files (0.7% non-contiguous), 231256220/244057440 blocks
e2fsck -fy /dev/md2 9.57s user 2.05s system 5% cpu 3:14.40 total
I tried to delete that directory, but it's impossible: I/O errors. I'll
reboot now to see if anything changes...
Thanks for your help.
--
Zlatko