EXT3 way too happy with write errors

linux-ext4.vger.kernel.org archive mirror

From: Simon Kirby <sim@netnation.com>
To: linux-ext4@vger.kernel.org
Subject: EXT3 way too happy with write errors
Date: Tue, 14 Oct 2008 17:22:56 -0700
Message-ID: <20081015002256.GD25662@hostway.ca>

Hello!

While attempting to track down a failed write error at the device layer,
I noticed that EXT3 seems to behave strangely after a single block I/O
failure.

I would expect that upon the first failed request, it would abort the
journal and remount-ro (if errors=remount-ro is specified).  Instead, it
seems to happily plonk along until I inject a few more failures (testing
with the fault injection framework) and it eventually fails enough to
abort the journal.  However, by then, "fsck" will show corruption --
sometimes severe.  If I force only one or two write failures and
then unmount, I can reproduce consistency corruption that shows up
with "fsck -f" even though the file system is not marked "errors"!

Why is this?

Example:

Oct  9 19:57:31 nas02 kernel: kjournald starting.  Commit interval 5 seconds
Oct  9 19:57:31 nas02 kernel: EXT3 FS on etherd/e3.0p1, internal journal
Oct  9 19:57:31 nas02 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct  9 20:00:18 nas02 kernel: FAULT_INJECTION: forcing a failure
Oct  9 20:00:18 nas02 kernel: Buffer I/O error on device etherd/e3.0p1, logical block 5186046
Oct  9 20:00:18 nas02 kernel: lost page write due to I/O error on etherd/e3.0p1
Oct  9 20:00:37 nas02 kernel: FAULT_INJECTION: forcing a failure
Oct  9 20:00:37 nas02 kernel: Buffer I/O error on device etherd/e3.0p1, logical block 410322
Oct  9 20:00:37 nas02 kernel: lost page write due to I/O error on etherd/e3.0p1
Oct  9 20:00:40 nas02 kernel: FAULT_INJECTION: forcing a failure
Oct  9 20:00:40 nas02 kernel: EXT3-fs error (device etherd/e3.0p1): read_block_bitmap: Cannot read block bitmap - block_group = 18, block_bitmap = 589824
Oct  9 20:00:40 nas02 kernel: Aborting journal on device etherd/e3.0p1.
Oct  9 20:00:40 nas02 kernel: FAULT_INJECTION: forcing a failure
Oct  9 20:00:40 nas02 kernel: Buffer I/O error on device etherd/e3.0p1, logical block 1545
Oct  9 20:00:40 nas02 kernel: lost page write due to I/O error on etherd/e3.0p1
Oct  9 20:00:40 nas02 kernel: Remounting filesystem read-only
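
To check other logs for the same pattern, the following is a minimal
sketch (not part of the original report) that counts how many injected
faults and buffer write errors appear before the first journal abort.
The function name and the matched substrings are assumptions taken from
the log lines above; other kernels may word these messages differently.

```python
def count_errors_before_abort(log_lines):
    """Count injected faults and buffer I/O errors seen before the
    first "Aborting journal" line.

    Returns (write_errors, injected, aborted).
    """
    write_errors = 0
    injected = 0
    for line in log_lines:
        if "Aborting journal" in line:
            return write_errors, injected, True
        if "FAULT_INJECTION: forcing a failure" in line:
            injected += 1
        elif "Buffer I/O error" in line:
            write_errors += 1
    return write_errors, injected, False


# Message text copied from the log above (timestamps and host trimmed).
SAMPLE = [
    "FAULT_INJECTION: forcing a failure",
    "Buffer I/O error on device etherd/e3.0p1, logical block 5186046",
    "lost page write due to I/O error on etherd/e3.0p1",
    "FAULT_INJECTION: forcing a failure",
    "Buffer I/O error on device etherd/e3.0p1, logical block 410322",
    "lost page write due to I/O error on etherd/e3.0p1",
    "FAULT_INJECTION: forcing a failure",
    "EXT3-fs error (device etherd/e3.0p1): read_block_bitmap: "
    "Cannot read block bitmap - block_group = 18, block_bitmap = 589824",
    "Aborting journal on device etherd/e3.0p1.",
]

if __name__ == "__main__":
    print(count_errors_before_abort(SAMPLE))
```

For the log above, two lost page writes go by without any reaction; the
journal is only aborted on the third injected fault, which hit a block
bitmap read.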

[sroot@nas02:/]# fsck -C /mnt/web00
fsck 1.40-WIP (14-Nov-2006)
e2fsck 1.40-WIP (14-Nov-2006)
/dev/etherd/e3.0p1: recovering journal
/dev/etherd/e3.0p1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 49153, i_blocks is 2942528, should be 2942520.  Fix<y>?
Pass 2: Checking directory structure                                           
Pass 3: Checking directory connectivity                                        
Pass 4: Checking reference counts                                              
Pass 5: Checking group summary information                                     

/dev/etherd/e3.0p1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/etherd/e3.0p1: 126254/24690688 files (0.1% non-contiguous), 1778971/49359704 blocks
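
Whether a file system is marked "errors" comes down to a flag in the
superblock.  As a rough sketch (not part of the original report), the
state and the configured error behaviour can be read from the first
2 KiB of the device or of an image.  The field offsets and flag values
follow the ext2/ext3 on-disk superblock layout; the function and
constant names here are made up for illustration.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1 KiB into the device
EXT2_MAGIC = 0xEF53        # s_magic
EXT2_VALID_FS = 0x0001     # s_state bit: cleanly unmounted
EXT2_ERROR_FS = 0x0002     # s_state bit: errors detected
ERRORS_POLICY = {1: "continue", 2: "remount-ro", 3: "panic"}  # s_errors


def read_fs_state(image):
    """Decode s_state and s_errors from the first bytes of an ext2/3 image."""
    # s_magic, s_state and s_errors are consecutive little-endian u16
    # fields starting at byte 56 of the superblock.
    magic, state, errors = struct.unpack_from(
        "<HHH", image, SUPERBLOCK_OFFSET + 56
    )
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    return {
        "clean": bool(state & EXT2_VALID_FS),
        "marked_errors": bool(state & EXT2_ERROR_FS),
        "on_error": ERRORS_POLICY.get(errors, "unknown"),
    }


def make_test_image(state, errors):
    """Build a synthetic 2 KiB image holding only the three fields above."""
    image = bytearray(2048)
    struct.pack_into("<HHH", image, SUPERBLOCK_OFFSET + 56,
                     EXT2_MAGIC, state, errors)
    return bytes(image)


if __name__ == "__main__":
    print(read_fs_state(make_test_image(EXT2_VALID_FS, 2)))
```

On a real system, "dumpe2fs -h" reports the same information as
"Filesystem state" and "Errors behavior".  The case described earlier is
one where the error flag is not set and yet "fsck -f" still finds
inconsistencies.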

Shouldn't the first failed request trigger the remount-ro?  Assuming
the fault merely denied a single read or write request, it should then
be possible to reboot or remount,rw after the fault is fixed and have a
consistent file system after just a journal replay...

Cheers,

Simon-
