* [4.7-rc6 ext3 corruption] ext4_mb_generate_buddy:758: group 27, block bitmap and bg descriptor inconsistent:
From: Dave Chinner @ 2016-07-12 5:41 UTC
To: linux-ext4
Hi Folks,
Just rebooted a 4.7-rc6 test VM, and the root filesystem had the
journal abort a couple of seconds after mount while the system was
still booting:
[ 3.043543] EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
[ 3.045027] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
[ 3.046008] EXT4-fs (sda1): write access will be enabled during recovery
[ 3.120052] EXT4-fs (sda1): recovery complete
[ 3.121746] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[ 3.122778] VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
.....
[ 5.263329] EXT4-fs error (device sda1): ext4_mb_generate_buddy:758: group 27, block bitmap and bg descriptor inconsistent: 4197 vs 4196 free clusters
[ 5.266343] Aborting journal on device sda1-8.
[ 5.267939] EXT4-fs (sda1): Remounting filesystem read-only
[ 5.269129] EXT4-fs error (device sda1) in ext4_free_blocks:4904: Journal has aborted
[ 5.271431] EXT4-fs error (device sda1) in ext4_do_update_inode:4891: Journal has aborted
[ 5.273720] EXT4-fs error (device sda1) in ext4_truncate:4150: IO failure
[ 5.275917] EXT4-fs error (device sda1) in ext4_orphan_del:2923: Journal has aborted
[ 5.278325] EXT4-fs error (device sda1) in ext4_do_update_inode:4891: Journal has aborted
The root filesystem checked clean three reboots before this
occurred. e2fsck output on ro mounted fs:
# e2fsck /dev/sda1
e2fsck 1.43-WIP (18-May-2015)
/dev/sda1: recovering journal
Superblock last mount time is in the future.
(by less than a day, probably due to the hardware clock being incorrectly set)
/dev/sda1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (542319, counted=546517).
Fix<y>? yes
Inode bitmap differences: -219131
Fix<y>? yes
Free inodes count wrong for group #27 (6720, counted=6721).
Fix<y>? yes
Directories count wrong for group #27 (9, counted=8).
Fix<y>? yes
Free inodes count wrong (341015, counted=341018).
Fix<y>? yes
/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: ***** REBOOT LINUX *****
/dev/sda1: 283606/624624 files (3.1% non-contiguous), 1949574/2496091 blocks
#
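As a cross-check of the numbers above, the "counted" totals agree with the final summary line, reading it as used/total. This is a small arithmetic sketch, not part of e2fsck; the variable names are made up for illustration.

```python
# Values copied from the e2fsck output above.
# Final summary line: 283606/624624 files, 1949574/2496091 blocks (used/total).
total_blocks, used_blocks = 2496091, 1949574
total_inodes, used_inodes = 624624, 283606

# Superblock values before the fix, as reported in Pass 5.
sb_free_blocks = 542319
sb_free_inodes = 341015

free_blocks = total_blocks - used_blocks   # matches counted=546517
free_inodes = total_inodes - used_inodes   # matches counted=341018

# How far the stale superblock counters were from the counted values.
blocks_delta = free_blocks - sb_free_blocks
inodes_delta = free_inodes - sb_free_inodes

print(f"free blocks: {free_blocks} (superblock was off by {blocks_delta})")
print(f"free inodes: {free_inodes} (superblock was off by {inodes_delta})")
```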
--
Dave Chinner
david@fromorbit.com
* Re: [4.7-rc6 ext3 corruption] ext4_mb_generate_buddy:758: group 27, block bitmap and bg descriptor inconsistent:
From: Jan Kara @ 2016-07-27 15:48 UTC
To: Dave Chinner; +Cc: linux-ext4
Hi!
On Tue 12-07-16 15:41:37, Dave Chinner wrote:
> Just rebooted a 4.7-rc6 test VM, and the root filesystem had the
> journal abort a couple of seconds after mount while the system was
> still booting:
>
> [ 3.043543] EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
> [ 3.045027] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
> [ 3.046008] EXT4-fs (sda1): write access will be enabled during recovery
> [ 3.120052] EXT4-fs (sda1): recovery complete
> [ 3.121746] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
> [ 3.122778] VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
> .....
> [ 5.263329] EXT4-fs error (device sda1): ext4_mb_generate_buddy:758: group 27, block bitmap and bg descriptor inconsistent: 4197 vs 4196 free clusters
> [ 5.266343] Aborting journal on device sda1-8.
> [ 5.267939] EXT4-fs (sda1): Remounting filesystem read-only
> [ 5.269129] EXT4-fs error (device sda1) in ext4_free_blocks:4904: Journal has aborted
> [ 5.271431] EXT4-fs error (device sda1) in ext4_do_update_inode:4891: Journal has aborted
> [ 5.273720] EXT4-fs error (device sda1) in ext4_truncate:4150: IO failure
> [ 5.275917] EXT4-fs error (device sda1) in ext4_orphan_del:2923: Journal has aborted
> [ 5.278325] EXT4-fs error (device sda1) in ext4_do_update_inode:4891: Journal has aborted
>
> The root filesystem checked clean three reboots before this
> occurred. e2fsck output on ro mounted fs:
>
> # e2fsck /dev/sda1
> e2fsck 1.43-WIP (18-May-2015)
> /dev/sda1: recovering journal
> Superblock last mount time is in the future.
> (by less than a day, probably due to the hardware clock being incorrectly set)
> /dev/sda1 contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Free blocks count wrong (542319, counted=546517).
> Fix<y>? yes
> Inode bitmap differences: -219131
> Fix<y>? yes
> Free inodes count wrong for group #27 (6720, counted=6721).
> Fix<y>? yes
> Directories count wrong for group #27 (9, counted=8).
> Fix<y>? yes
> Free inodes count wrong (341015, counted=341018).
> Fix<y>? yes
>
> /dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
> /dev/sda1: ***** REBOOT LINUX *****
> /dev/sda1: 283606/624624 files (3.1% non-contiguous), 1949574/2496091 blocks
> #
Hum, interesting. So 'Free blocks count wrong' and 'Free inodes count
wrong' messages are harmless - those entries are updated only
opportunistically and on mount and generally do not have to match on a live
filesystem. The other three errors regarding inode and directory count are
fallout from the aborted inode deletion. Most importantly there is *no
problem* whatsoever with block bitmaps. So it was either some memory glitch
(bitflip in the counter or the bitmap) or there is some race where bb_free
can get out of sync with the bitmap, and I don't see how that could happen,
especially so early after mount... Strange.
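For illustration only, the kind of check that fired here can be sketched as counting the free (zero) bits in a group's block bitmap and comparing that with the cached free count. This is a simplified sketch and not the kernel code: `count_free_clusters` and `check_group` are hypothetical names, the 16-cluster bitmap is a toy, and printing the bitmap count first and the cached count second is an assumption about the message layout.

```python
from typing import Optional


def count_free_clusters(bitmap: bytes, clusters_per_group: int) -> int:
    """Count zero bits (free clusters) among the first clusters_per_group bits.

    Bit i lives in byte i // 8 at bit position i % 8 (least significant first).
    """
    free = 0
    for i in range(clusters_per_group):
        if not (bitmap[i // 8] >> (i % 8)) & 1:
            free += 1
    return free


def check_group(group: int, bitmap: bytes, clusters_per_group: int,
                cached_free: int) -> Optional[str]:
    """Return an error string if the bitmap and cached free count disagree.

    Assumption: the count derived from the bitmap is reported first and the
    cached (descriptor-derived) count second.
    """
    counted = count_free_clusters(bitmap, clusters_per_group)
    if counted != cached_free:
        return (f"group {group}, block bitmap and bg descriptor "
                f"inconsistent: {counted} vs {cached_free} free clusters")
    return None


# Toy group with 16 clusters: 8 bits set (in use), 8 bits clear (free).
toy_bitmap = bytes([0b00001111, 0b11110000])

print(check_group(27, toy_bitmap, 16, cached_free=8))  # counts agree
print(check_group(27, toy_bitmap, 16, cached_free=7))  # off by one
```

An off-by-one like the second call is all it takes to trip the check, which is why a single flipped bit in either the counter or the bitmap would be enough.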
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [4.7-rc6 ext3 corruption] ext4_mb_generate_buddy:758: group 27, block bitmap and bg descriptor inconsistent:
From: Dave Chinner @ 2016-07-27 23:57 UTC
To: Jan Kara; +Cc: linux-ext4
On Wed, Jul 27, 2016 at 05:48:43PM +0200, Jan Kara wrote:
> Hi!
>
> Hum, interesting. So 'Free blocks count wrong' and 'Free inodes count
> wrong' messages are harmless - those entries are updated only
> opportunistically and on mount and generally do not have to match on a live
> filesystem. The other three errors regarding inode and directory count are
> fallout from the aborted inode deletion. Most importantly there is *no
> problem* whatsoever with block bitmaps. So it was either some memory glitch
> (bitflip in the counter or the bitmap) or there is some race where bb_free
> can get out of sync with the bitmap, and I don't see how that could happen,
> especially so early after mount... Strange.
Don't think bitflips from memory glitches are likely - the VM is
running on a machine with ECC RAM. Some other kernel memory
corruption that affects the page cache also seems unlikely, because
it only happened after hanging the kernel hard due to XFS failures
on other filesystems and storage devices and having to effectively
"cold reboot" the VM from the qemu console (oops in a kworker thread
is now a Real Bad Thing to do to the system, it appears).
It is strange that these showed up around 4.7-rc4, and the last
wacky hang/cold reboot/ext3-in-bad-state issue I encountered was in
4.7-rc6. I haven't seen problems since, but that's not to say
they've gone away...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com