From mboxrd@z Thu Jan 1 00:00:00 1970
From: Arne Jansen
Subject: Re: [PATCH v5 0/3] Btrfs: add IO error device stats
Date: Fri, 25 May 2012 22:41:34 +0200
Message-ID: <4FBFEE7E.8060008@gmx.net>
References: <1337954770-10086-1-git-send-email-sbehrens@giantdisaster.de> <20120525151854.GA23362@infradead.org> <4FBFC63B.6050403@giantdisaster.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Christoph Hellwig , linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
To: Stefan Behrens
Return-path:
In-Reply-To: <4FBFC63B.6050403@giantdisaster.de>
Sender: linux-btrfs-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On 05/25/12 19:49, Stefan Behrens wrote:
> It would be helpful if already the generic block layer would offer
> device error counters. Then btrfs could read them, add own counters for
> its checksum detected errors, and store everything persistently in the
> filesystem.
>

I take it that you count not only I/O errors, but also corrupted blocks
and errors generated by misdirected writes. This is information that is
not available to the block layer.

> The goal is to replace disks that have an increased error rate with
> spare disks, and the goal is to repair this degenerated RAID state quickly.
>
>
> On 05/25/2012 17:18, Christoph Hellwig wrote:
>> Can you explain why the device error counters should be in a filesystem
>> instead of generic block layer code?
>>
>> On Fri, May 25, 2012 at 04:06:07PM +0200, Stefan Behrens wrote:
> [...]
>>> The goal is to detect when drives start to get an increased error rate,
>>> when drives should be replaced soon. Therefore statistic counters are
>>> added that count IO errors (read, write and flush). Additionally, the
>>> software detected errors like checksum errors and corrupted blocks are
>>> counted.
>>>
>>> An ioctl interface is added to get the device statistic counters.
>>> A second ioctl is added to atomically get and reset these counters.
>>>
>>> The device statistics are written into the device tree with each
>>> transaction commit. Only modified statistics are written.
>>> When a filesystem is mounted, the device statistics for each involved
>>> device are read from the device tree and used to initialize the
>>> counters.
>>>
>>> A patch for the btrfs-progs world will also be sent.
>>>
>>> Stefan Behrens (3):
>>>   Btrfs: add device counters for detected IO and checksum errors
>>>   Btrfs: add ioctl to get and reset the device stats
>>>   Btrfs: read device stats on mount, write modified ones during commit
>>>
>>>  fs/btrfs/ctree.h       |   38 ++++++
>>>  fs/btrfs/disk-io.c     |   20 +++-
>>>  fs/btrfs/extent_io.c   |   18 ++-
>>>  fs/btrfs/ioctl.c       |   26 +++++
>>>  fs/btrfs/ioctl.h       |   33 ++++++
>>>  fs/btrfs/print-tree.c  |    3 +
>>>  fs/btrfs/scrub.c       |   65 ++++++++---
>>>  fs/btrfs/transaction.c |    4 +
>>>  fs/btrfs/volumes.c     |  304 +++++++++++++++++++++++++++++++++++++++++++++++-
>>>  fs/btrfs/volumes.h     |   52 +++++++++
>>>  10 files changed, 539 insertions(+), 24 deletions(-)
>>>
>>> --
>>> 1.7.10.2
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html