linux-ext4.vger.kernel.org archive mirror
From: "Theodore Ts'o" <tytso@mit.edu>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Arthur Marsh <arthur.marsh@internode.on.net>,
	Richard Weinberger <richard.weinberger@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: ext3/ext4 filesystem corruption under post 5.1.0 kernels
Date: Mon, 1 Jul 2019 09:56:07 -0400
Message-ID: <20190701135607.GB6549@mit.edu>
In-Reply-To: <CAMuHMdU-vfWjomDpttYTqgp4YzBu7z__p48r7rq6TSUwx7uFqQ@mail.gmail.com>

On Mon, Jul 01, 2019 at 02:43:14PM +0200, Geert Uytterhoeven wrote:
> Hi Ted,
> 
> Despite this fix having been applied upstream, the kernel prints from
> time to time:
> 
>     EXT4-fs (sda1): error count since last fsck: 5
>     EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
>     EXT4-fs (sda1): last error at time 1558114349: ext4_get_branch:171: inode 1980: block 27550
> 
> This happens even after a manual run of "e2fsck -f" (while it's mounted
> RO), which reports a clean file system.

Here is what's happening.  When a newer kernel detects corruption, it
records the error in these superblock fields:

	__le32	s_error_count;		/* number of fs errors */
	__le32	s_first_error_time;	/* first time an error happened */
	__le32	s_first_error_ino;	/* inode involved in first error */
	__le64	s_first_error_block;	/* block involved of first error */
	__u8	s_first_error_func[32] __nonstring;	/* function where the error happened */
	__le32	s_first_error_line;	/* line number where error happened */
	__le32	s_last_error_time;	/* most recent time of an error */
	__le32	s_last_error_ino;	/* inode involved in last error */
	__le32	s_last_error_line;	/* line number where error happened */
	__le64	s_last_error_block;	/* block involved of last error */
	__u8	s_last_error_func[32] __nonstring;	/* function where the error happened */

When newer versions of e2fsck *fix* the corruption, they clear these
fields.  Until then, the kernel keeps reporting them.  It's basically a
safety check, because *way* too many ext4 users run with
errors=continue (aka "don't worry, be happy" mode), so this is a poke
in the system logs that the file system is corrupted, and that they
really, *REALLY* should fix it before they lose (more) data.
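
To make the fields listed above a bit more concrete, here is a minimal
C sketch that decodes the "first error" record from a 1024-byte
superblock buffer.  The byte offsets (0x194 and up) are an assumption
based on the ext4 on-disk layout documentation; they are not stated in
this message, so check them against your kernel's ext4.h.  The names
first_error_info, get_le32, get_le64 and decode_first_error are
hypothetical helpers, not kernel or e2fsprogs APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed byte offsets inside the 1024-byte superblock; not given in
 * this thread, so verify them against your kernel's ext4.h. */
#define SB_OFF_ERROR_COUNT       0x194
#define SB_OFF_FIRST_ERROR_TIME  0x198
#define SB_OFF_FIRST_ERROR_INO   0x19C
#define SB_OFF_FIRST_ERROR_BLOCK 0x1A0
#define SB_OFF_FIRST_ERROR_FUNC  0x1A8
#define SB_OFF_FIRST_ERROR_LINE  0x1C8

/* Hypothetical host-side copy of the "first error" record. */
struct first_error_info {
    uint32_t count;     /* s_error_count */
    uint32_t time;      /* s_first_error_time, seconds since the epoch */
    uint32_t ino;       /* s_first_error_ino */
    uint64_t block;     /* s_first_error_block */
    char     func[33];  /* s_first_error_func plus a terminating NUL */
    uint32_t line;      /* s_first_error_line */
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a little-endian 64-bit value regardless of host byte order. */
static uint64_t get_le64(const uint8_t *p)
{
    return (uint64_t)get_le32(p) | ((uint64_t)get_le32(p + 4) << 32);
}

/* Decode the error count and the first-error fields from sb, which
 * must point to the 1024 bytes of the superblock. */
static void decode_first_error(const uint8_t *sb, struct first_error_info *out)
{
    assert(sb != NULL && out != NULL);

    out->count = get_le32(sb + SB_OFF_ERROR_COUNT);
    out->time  = get_le32(sb + SB_OFF_FIRST_ERROR_TIME);
    out->ino   = get_le32(sb + SB_OFF_FIRST_ERROR_INO);
    out->block = get_le64(sb + SB_OFF_FIRST_ERROR_BLOCK);
    out->line  = get_le32(sb + SB_OFF_FIRST_ERROR_LINE);

    /* The on-disk function name is __nonstring: it may fill all 32
     * bytes without a NUL, so terminate the copy explicitly. */
    memcpy(out->func, sb + SB_OFF_FIRST_ERROR_FUNC, 32);
    out->func[32] = '\0';
}
```

A caller would read 1024 bytes starting at byte offset 1024 of the
device or image into a buffer and pass it to decode_first_error().  A
nonzero count means the error record has not been cleared.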

> The inode and block numbers match the numbers printed due to the
> previous bug.

You can also see when the last file system error was detected via:

% date -d @1558114349
Fri 17 May 2019 01:32:29 PM EDT
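
The same conversion can be sketched in C for anyone decoding the
32-bit error timestamps programmatically.  This is only an
illustration: format_error_time is a hypothetical helper name, and it
prints UTC rather than local time so the result does not depend on the
machine's time zone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Format an ext4 error timestamp (seconds since the epoch) as UTC.
 * Returns the number of characters written, or 0 on failure. */
static size_t format_error_time(uint32_t secs, char *buf, size_t len)
{
    time_t t = (time_t)secs;
    struct tm *tm_utc;

    assert(buf != NULL);

    /* gmtime() is standard C but not thread-safe; that is fine for a
     * single-threaded sketch like this one. */
    tm_utc = gmtime(&t);
    if (tm_utc == NULL)
        return 0;

    /* In the default "C" locale, %a and %b give English abbreviations. */
    return strftime(buf, len, "%a %d %b %Y %H:%M:%S UTC", tm_utc);
}
```

For 1558114349 this gives 17:32:29 UTC on 17 May 2019, which is the
same instant as the 01:32:29 PM EDT shown above (EDT is UTC-4).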

> Do you have an idea what's wrong?
> Note that I run a very old version of e2fsck (from a decade ago).

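... and that's the problem.  An e2fsck that old most likely predates
these superblock fields, so it never clears them, even when it finds
nothing left to fix.  If you're going to be using newer versions of the
kernel, you really should be using newer versions of e2fsprogs.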

There have been a lot of bug fixes in the last 10 years, and some of
them are for bugs that can corrupt data....

					- Ted
