linux-ext4.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH v1 0/5] ext4: Shut down block groups when damage is detected
Date: Mon, 29 Jul 2013 21:57:47 -0400
Message-ID: <20130730015747.GA20629@thunk.org>
In-Reply-To: <x49txjdbd21.fsf@segfault.boston.devel.redhat.com>

On Mon, Jul 29, 2013 at 11:28:38AM -0400, Jeff Moyer wrote:
> I think it's important to call out the exact failure scenario you're
> trying to address.  For hard disks, if you get a read error, it can
> typically be recovered by re-writing the block.  I imagine this is what
> fsck would be doing for metadata repair.  So, I'm not at all sure why
> you'd want to track bad blocks in the file system itself.  Could you
> elaborate, please?

The reason we had a similar patch at Google is that when we discover a
potential problem in an allocation bitmap (i.e., either a read error,
or finding that we have freed a block/inode which was already marked
as free), instead of panic'ing the entire server, or remounting the
file system read-only (or otherwise taking it off-line), we can just
avoid allocating any blocks/inodes in that block group (since we can't
trust the allocation bitmap), and keep using the file system in a
somewhat degraded mode.
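
As a rough illustration of that degraded mode, here is a toy userspace
sketch. It is not the ext4 patch itself: the names toy_fs, toy_group,
toy_alloc_block and toy_free_block, and the tiny one-byte bitmap, are
all hypothetical and exist only to show the idea. A double free marks
the group corrupt, and the allocator then skips that group instead of
failing the whole file system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NGROUPS 4            /* hypothetical: number of block groups */
#define BLOCKS_PER_GROUP 8   /* hypothetical: blocks tracked per group */

struct toy_group {
	uint8_t bitmap;      /* one bit per block; 1 = in use */
	bool corrupt;        /* set once the bitmap can no longer be trusted */
};

struct toy_fs {
	struct toy_group groups[NGROUPS];
};

/*
 * Allocate the first free block from any trusted group.
 * Returns the global block number, or -1 if nothing is available.
 */
long toy_alloc_block(struct toy_fs *fs)
{
	for (int g = 0; g < NGROUPS; g++) {
		struct toy_group *grp = &fs->groups[g];

		if (grp->corrupt)
			continue;    /* never allocate from a damaged group */

		for (int b = 0; b < BLOCKS_PER_GROUP; b++) {
			if (!(grp->bitmap & (1u << b))) {
				grp->bitmap |= (uint8_t)(1u << b);
				return (long)g * BLOCKS_PER_GROUP + b;
			}
		}
	}
	return -1;
}

/*
 * Free a block.  Freeing a block that is already free means the bitmap
 * is suspect, so mark the group corrupt rather than aborting.
 * Returns 0 on success, -1 if damage was detected.
 */
int toy_free_block(struct toy_fs *fs, long blk)
{
	assert(blk >= 0 && blk < (long)NGROUPS * BLOCKS_PER_GROUP);

	struct toy_group *grp = &fs->groups[blk / BLOCKS_PER_GROUP];
	uint8_t mask = (uint8_t)(1u << (blk % BLOCKS_PER_GROUP));

	if (!(grp->bitmap & mask)) {
		grp->corrupt = true;
		return -1;
	}
	grp->bitmap &= (uint8_t)~mask;
	return 0;
}
```

With a zero-initialized toy_fs, allocating a block, freeing it twice,
and allocating again hands out a block from the next group, since the
first group is now flagged and left alone until it can be repaired.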

Of course, eventually you'd want to take the machine off-line and run
fsck on the whole thing, and then rewrite the broken allocation
bitmap.  But in the meantime, there might be circumstances where it
would be inconvenient (or violate some pesky SLA :-) to take down the
server or even the individual storage device to run fsck on it.

Regards,

						- Ted
