From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-btrfs-owner@vger.kernel.org>
Received: from mx1.fusionio.com ([66.114.96.30]:53788 "EHLO mx1.fusionio.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756134Ab2JEOur (ORCPT <rfc822;linux-btrfs@vger.kernel.org>);
	Fri, 5 Oct 2012 10:50:47 -0400
Date: Fri, 5 Oct 2012 10:50:44 -0400
From: Chris Mason <chris.mason@fusionio.com>
To: Stefan Behrens <sbehrens@giantdisaster.de>
CC: Chris Mason <clmason@fusionio.com>, Josef Bacik <JBacik@fusionio.com>,
        "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
        <axboe@kernel.dk>
Subject: Re: [PATCH] Btrfs: make filesystem read-only when submitting barrier
 fails
Message-ID: <20121005145044.GB9134@shiny>
References: <1344605915-22526-1-git-send-email-sbehrens@giantdisaster.de>
 <20121005125159.GP2370@localhost.localdomain>
 <20121005130911.GA9134@shiny>
 <506EE919.1030903@giantdisaster.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
In-Reply-To: <506EE919.1030903@giantdisaster.de>
Sender: linux-btrfs-owner@vger.kernel.org
List-ID: <linux-btrfs.vger.kernel.org>

[ Adding Jens to the cc ]

Jens, we have a proposed patch for btrfs to treat EIO errors on cache
flushes as fatal events (forcing the FS to readonly).  It really seems
like the right idea, except for the part where we trust the devices to
only return EIOs on cache flushes when things went horribly wrong.
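
To make the proposed policy concrete, here is a rough userspace sketch of
the decision only; it is not the patch itself.  The enum and function
names are made up for illustration, and treating every error other than
EOPNOTSUPP as fatal is an assumption of the sketch (the discussion here
is only about EIO).

```c
#include <errno.h>

/* Hypothetical names for illustration; not taken from the patch or btrfs. */
enum flush_action {
	FLUSH_OK,		/* flush succeeded, carry on */
	FLUSH_IGNORE,		/* device cannot flush; not treated as an error */
	FLUSH_FORCE_READONLY	/* fatal: stop before writing new super blocks */
};

/*
 * Map the result of a cache flush (0 or a negative errno, kernel style)
 * to a policy decision.
 */
static enum flush_action classify_flush_result(int err)
{
	if (err == 0)
		return FLUSH_OK;
	if (err == -EOPNOTSUPP)
		return FLUSH_IGNORE;
	/* -EIO lands here; so does any other error (assumption of this sketch). */
	return FLUSH_FORCE_READONLY;
}
```

With this mapping, classify_flush_result(-EIO) yields FLUSH_FORCE_READONLY,
which is exactly the case where we would be trusting the device.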

Can you see any reason for EIOs to come back from cache flushes when we
might not want to declare a metadata emergency?

[ more context below ]

-chris

On Fri, Oct 05, 2012 at 08:05:13AM -0600, Stefan Behrens wrote:
> On Fri, 5 Oct 2012 09:09:11 -0400, Chris Mason wrote:
> > On Fri, Oct 05, 2012 at 06:51:59AM -0600, Josef Bacik wrote:
> >>
> >> So we're going from EOPNOTSUPP resulting in barriers just being turned off to
> >> the file system being mounted read-only?  This is not in line with what every
> >> other linux file system does, which isn't necessarily a problem but I'm not sure
> >> it's the kind of change we want to make.  Think about somebody formatting a
> >> cheap usb stick as btrfs and not understanding why they can't write to it.  I'm
> >> fine either way, I just want to make sure that we think about the consequences
> >> of this before we pull it in.  Thanks,
> 
> (Just for the sake of completeness: a few minutes ago Josef agreed on
> IRC that this is not the case; EOPNOTSUPP is not treated as an error.)
> 
> > 
> > In the past I haven't really trusted the drives to return good errors
> > when there are problems with cache flushes.  It might be that drives
> > (and the block layer) are really smart about this now, I know that
> > Christoph thought any EIOs coming up from a barrier really were EIOs.
> > 
> > But I still have my doubts, mostly because I don't think anyone tests
> > these conditions on a regular basis.
> 
> Looking at the risk of this patch, the worst thing that can happen is
> that a flush request results in an EIO although there is no error at
> all. Then the filesystem is switched read-only and the user is not
> amused. All I can say is that I have not seen this so far, which does
> not mean that it cannot happen.
> 
> But I have seen two cases (on IRC and on mailing list) where drives that
> transitioned to offline caused flush errors and write errors. When we
> ignore the flush errors, the super blocks referring to the new tree
> roots are written to those disks that are still online, and the state of
> the filesystem is not correct. The trees refer to data that is not on
> disk (since it is not flushed and the write EIOs can be delayed since
> we're talking of hardware issues like hot unplugged USB drives). In this
> case, the user is not amused either. And additionally, he needs to go
> back to a previous tree root revision, which can mean losing data.

Jens