Re: [PATCH v1 0/5] ext4: Shut down block groups when damage is detected

From: Zheng Liu <gnehzuil.liu@gmail.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	Theodore Ts'o <tytso@mit.edu>,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH v1 0/5] ext4: Shut down block groups when damage is detected
Date: Tue, 30 Jul 2013 08:31:09 +0800
Message-ID: <20130730003109.GE3648@gmail.com>
In-Reply-To: <x49txjdbd21.fsf@segfault.boston.devel.redhat.com>

Hi Jeff,

On Mon, Jul 29, 2013 at 11:28:38AM -0400, Jeff Moyer wrote:
> Zheng Liu <gnehzuil.liu@gmail.com> writes:
> 
> > My idea is to let the file system ignore the corrupted block.  Namely,
> > when we meet a corrupted block, we will track it as a bad block in the bad
> > block inode and find another block to save the data.  This corrupted block
> > will never be used again.  The first step in my mind is to detect a
> > corrupted block and mark it as a bad block.  After reading the thread and
> > Darrick's original patch, I think Darrick's patch is a good start.
> 
> I think it's important to call out the exact failure scenario you're
> trying to address.  For hard disks, if you get a read error, it can
> typically be recovered by re-writing the block.  I imagine this is what
> fsck would be doing for metadata repair.  So, I'm not at all sure why
> you'd want to track bad blocks in the file system itself.  Could you
> elaborate, please?

In our production systems at Taobao, we have a large CDN spread around the
country.  These servers cache most of our web pages, images, etc.
Each server has several disks, and a disk will inevitably break down at
some point.  Currently we have to umount the failed disk, and the whole
disk is just left in the server until the server itself is retired.  But as
you have pointed out, when we meet a disk failure, most of the disk might
still work.  So we hope that the file system could track the bad blocks,
avoid allocating them, and keep using the rest of the space.  This would
help us reduce cost.

As you said above, some failure scenarios are hard to address, e.g., when
we cannot read any data from the disk at all.  But in most cases the disk
just has some bad sectors, so it would be great if the disk could still be
used.  In addition, we don't care whether fsck can fix these bad blocks
because we don't want to reboot the server.  As I described before, these
servers act as a cache for the web site.  If they are rebooted, they need
some time to preload the content from other servers and cannot provide
service in the meantime.  This is no better than what we do now (umount
the disk).

Certainly, this might make no sense for SSD/flash devices, because when we
get an error from these devices, it is possible that they can no longer be
used at all.

Regards,
                                                - Zheng
