Re: raid1 repair: sync_request() aborts if one of the drives has bad block recorded

From: NeilBrown <neilb@suse.de>
To: Alexander Lyakas <alex.bolshoy@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: raid1 repair: sync_request() aborts if one of the drives has bad block recorded
Date: Tue, 31 Jul 2012 12:11:41 +1000
Message-ID: <20120731121141.109cbfd6@notabene.brown>
In-Reply-To: <CAGRgLy5FXG9gFgq+QR_OknhNVtEMuSj6d-Hso5PYcjaD2rzXcQ@mail.gmail.com>

On Tue, 24 Jul 2012 22:30:33 +0300 Alexander Lyakas <alex.bolshoy@gmail.com>
wrote:

> Hi Neil,
> apparently you decided not to apply that patch?

No, worse than that.  I marked your email as 'needs attention'.  That appears
to be an almost-certain guarantee that I'll never look at it again - must be
a bug in my brain.  Apologies.

> On Tue, Jul 17, 2012 at 4:17 PM, Alexander Lyakas
> <alex.bolshoy@gmail.com> wrote:
> > Thanks for your comments, I got confused with the REQUESTED bit.
> > I prepared the patch, with couple of notes:
> >
> > 1/ I decided to be more careful and schedule a write only in case of
> > resync or repair. I was not sure whether we should try to correct bad
> > blocks on device X, when device Y is recovering. Pls change it if you
> > feel otherwise.

That looks sensible.  I've left it as it is.

> >
> > 2/ I tested and committed the patch on top of ubuntu-precise 3.2.0-25.
> > I looked at your "for-next" branch, and saw that there is some new
> > code, which handles hot-replace, which I am not familiar with at this
> > point.

It shouldn't make any important change to this patch.
For RAID1, hot-replace just means there can be twice as many devices as you
would expect.

> >
> > Final note: I noticed that badblocks_show() fails if there are too
> > many bad blocks. It returns value larger than PAGE_SIZE, and then the
> > following linux code complains:
> > fs/sysfs/file.c:fill_read_buffer()
> >         /*
> >          * The code works fine with PAGE_SIZE return but it's likely to
> >          * indicate truncated result or overflow in normal use cases.
> >          */
> >         if (count >= (ssize_t)PAGE_SIZE) {
> >                 print_symbol("fill_read_buffer: %s returned bad count\n",
> >                         (unsigned long)ops->show);
> >                 /* Try to struggle along */
> >                 count = PAGE_SIZE - 1;
> >         }
> >
> > So I am not sure how to solve it, but it would be good for
> > user/application to receive the full list of bad blocks. Perhaps
> > application can pass fd via some ioctl (I feel you don't like ioctls),
> > and then kernel can use vfs_write() to print all the bad blocks to the
> > fd. Or simply return the bad blocks list through the ioctl output to
> > mdadm, and mdadm would print them. Perhaps some other way.

It isn't possible to get a full list of bad blocks from sysfs, much as it is
not possible to read the write-intent-bitmap or other metadata.

The main purpose for the two bad-blocks files in sysfs is to allow a
user-space metadata manager (mdmon) to find out when the kernel discovers a
bad block, to record in the metadata, and then to acknowledge it.
It is always possible to read the first entry from
the unacknowledged_bad_blocks file, then acknowledge it and so remove it from
the list, and in that way you can get all unacknowledged bad blocks.
Acknowledged bad blocks will be listed in the metadata already.
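
A minimal user-space sketch of that read-then-acknowledge step follows.
It assumes each entry is printed as "<start sector> <length>" on its own
line, and that writing the same pair to the device's bad_blocks file is what
acknowledges it. The function names and the paths passed in are hypothetical,
and a real metadata manager would record the range in its metadata before
acknowledging it.

```c
#include <assert.h>
#include <stdio.h>

/* One bad-block range, assumed to be printed as "<start sector> <length>". */
struct bad_range {
	unsigned long long sector;
	unsigned int len;
};

/*
 * Parse the first "sector length" entry found in buf.
 * Returns 0 on success, -1 if buf does not start with a complete entry.
 */
int parse_first_range(const char *buf, struct bad_range *out)
{
	assert(buf && out);
	if (sscanf(buf, "%llu %u", &out->sector, &out->len) != 2)
		return -1;
	return out->len ? 0 : -1;
}

/*
 * Read the first entry from unack_path and acknowledge it by writing it
 * to ack_path (both paths are supplied by the caller; nothing is assumed
 * about where they live).
 * Returns 0 if one entry was acknowledged, 1 if the list was empty,
 * -1 on error.
 */
int drain_one(const char *unack_path, const char *ack_path,
	      struct bad_range *out)
{
	char line[64];
	FILE *f = fopen(unack_path, "r");
	int ok;

	if (!f)
		return -1;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return 1;		/* nothing left to acknowledge */
	}
	fclose(f);
	if (parse_first_range(line, out))
		return -1;

	/* Record *out in the metadata here, before acknowledging. */

	f = fopen(ack_path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%llu %u\n", out->sector, out->len) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling drain_one() in a loop until it returns 1 walks the whole
unacknowledged list one entry at a time, so the PAGE_SIZE limit on a single
read never matters.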

Still... I should probably fix the code so that it never displays a
partially truncated number, but stops at a whole entry before PAGE_SIZE.
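
Roughly, the idea is to format each entry separately and only copy it into
the page if the whole line still fits. The sketch below is plain user-space C
for illustration, not the kernel's badblocks_show(); struct bb_entry and
show_entries() are made-up names, and the "<sector> <length>" line format is
an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory form of one bad-block entry. */
struct bb_entry {
	unsigned long long sector;
	unsigned int len;
};

/*
 * Print entries into page as "<sector> <length>\n" lines, stopping before
 * the first entry that would not fit in full.  The result is always
 * NUL-terminated and its length stays strictly below page_size, so a
 * reader never sees half a number.
 * Returns the number of bytes written; *shown is the number of entries
 * emitted.
 */
size_t show_entries(char *page, size_t page_size,
		    const struct bb_entry *e, size_t n, size_t *shown)
{
	size_t len = 0;
	size_t i;

	assert(page && page_size > 0 && shown);
	page[0] = '\0';

	for (i = 0; i < n; i++) {
		char line[48];	/* large enough for 20 + 1 + 10 + 1 chars */
		int w = snprintf(line, sizeof(line), "%llu %u\n",
				 e[i].sector, e[i].len);

		if (w < 0 || (size_t)w >= sizeof(line))
			break;
		if (len + (size_t)w >= page_size)
			break;	/* whole line does not fit: stop here */
		memcpy(page + len, line, (size_t)w + 1);
		len += (size_t)w;
	}
	*shown = i;
	return len;
}
```

The caller can compare *shown with n to tell whether the listing is
complete.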

Thanks,
NeilBrown
