From mboxrd@z Thu Jan 1 00:00:00 1970
From: Josef Bacik
Subject: [LSF/MM TOPIC] Working towards better power fail testing
Date: Mon, 8 Dec 2014 17:11:41 -0500
Message-ID: <5486221D.6000006@fb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org

Hello,

We have been doing pretty well at populating xfstests with loads of
tests to catch regressions and validate that we're all working
properly. One thing that has been lacking is a good way to verify file
system integrity after a power failure. This is a core part of what
file systems are supposed to provide, but it is probably the least
tested aspect. We have dm-flakey tests in xfstests to test fsync
correctness, but these tests do not catch the random horrible things
that can go wrong. We are still finding horrible, scary things that go
wrong in Btrfs because they are simply hard to reproduce and test for.

I have been working on an idea to do this better. Some may have seen my
dm-power-fail attempt, and I've got a new incarnation of the idea
thanks to discussions with Zach Brown. Obviously a lot will change in
this area between now and March, but it would be good to have everybody
in the room talking about what they would need to build a good,
deterministic test to make sure we're always giving a consistent file
system and that our fsync() handling is working properly.

Thanks,

Josef