linux-btrfs.vger.kernel.org archive mirror
From: Liu Bo <bo.li.liu@oracle.com>
To: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
Cc: Anand Jain <anand.jain@oracle.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v8 2/2] btrfs: check device for critical errors and mark failed
Date: Fri, 6 Oct 2017 16:33:58 -0700
Message-ID: <20171006233357.GB19068@lim.localdomain>
In-Reply-To: <bd328a02-8823-a3a4-35ff-a9b890f00679@gmail.com>

On Thu, Oct 05, 2017 at 07:07:44AM -0400, Austin S. Hemmelgarn wrote:
> On 2017-10-04 16:11, Liu Bo wrote:
> > On Tue, Oct 03, 2017 at 11:59:20PM +0800, Anand Jain wrote:
> > > From: Anand Jain <Anand.Jain@oracle.com>
> > > 
> > > Write and flush errors are critical errors, upon which the device fd
> > > must be closed and marked as failed.
> > > 
> > 
> > Can we defer the job of closing device to umount?
> > 
> > We can go mark the device failed and skip it while doing read/write,
> > and umount can do the cleanup work.
> > 
> > That way we don't need a dedicated thread looping around to detect a
> > rare situation.
> If BTRFS doesn't close the device, then it's 100% guaranteed if it
> reconnects that it will show up under a different device node.  It would
> also mean that the device node stays visible when there is in fact no device
> connected to it, which is a pain from a monitoring perspective.

I see, you're assuming that these errors are due to disks being
disconnected. They could also be caused by bad sectors (although that
is almost impossible with enterprise hard disks) or by other errors
across the stack.

I do agree that cleanup needs to be done if a disk gets disconnected,
but not here; a udev rule is needed to handle such an event.
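
To illustrate the alternative discussed above (mark the device failed,
skip it on read/write, and leave the close to umount), here is a minimal
userspace sketch. It is not kernel code and is not taken from the patch;
struct model_device and the model_* helpers are hypothetical names, and
setting fd to -1 only stands in for actually closing the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a device; not the btrfs_device struct. */
struct model_device {
	int  fd;      /* stays open until unmount, -1 once closed */
	bool failed;  /* set on a write or flush error */
};

/* Record a critical error: only flip the flag, leave the fd open. */
static void model_mark_failed(struct model_device *dev)
{
	assert(dev != NULL);
	dev->failed = true;
}

/* Walk all devices for I/O and return how many were actually used. */
static size_t model_submit_all(struct model_device *devs, size_t n)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].failed)
			continue;   /* skip failed devices */
		used++;             /* real code would issue the I/O here */
	}
	return used;
}

/* Unmount-time cleanup: release every device, failed or not. */
static size_t model_umount(struct model_device *devs, size_t n)
{
	size_t closed = 0;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].fd >= 0) {
			devs[i].fd = -1;  /* stand-in for closing the device */
			closed++;
		}
	}
	return closed;
}
```

In this model no dedicated thread is needed: the failed flag is checked
on the I/O path, and the device is only released at unmount. The cost,
as noted in the quoted reply, is that the device node stays held until
then.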


thanks,
-liubo


Thread overview: 15+ messages
2017-10-03 15:59 [PATCH v8 0/2] [RFC] Introduce device state 'failed' Anand Jain
2017-10-03 15:59 ` [PATCH v8 1/2] btrfs: introduce device dynamic state transition to failed Anand Jain
2017-10-13 18:47   ` Liu Bo
2017-10-16  6:09     ` Anand Jain
2017-10-03 15:59 ` [PATCH v8 2/2] btrfs: check device for critical errors and mark failed Anand Jain
2017-10-04 20:11   ` Liu Bo
2017-10-05 11:07     ` Austin S. Hemmelgarn
2017-10-06 23:33       ` Liu Bo [this message]
2017-10-09 11:58         ` Austin S. Hemmelgarn
2017-10-05 13:56     ` Anand Jain
2017-10-06 23:56       ` Liu Bo
2017-10-08 14:23         ` Anand Jain
2017-10-13 18:46           ` Liu Bo
2017-10-16  6:09             ` Anand Jain
2017-10-05 13:54 ` [PATCH v8.1 2/2] btrfs: mark device failed for write and flush errors Anand Jain
