From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: with ECARTIS (v1.0.0; list xfs); Tue, 31 Jul 2007 19:24:28 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l712OKbm026662 for ; Tue, 31 Jul 2007 19:24:23 -0700 Date: Wed, 1 Aug 2007 12:24:18 +1000 From: David Chinner Subject: Re: RFC: log record CRC validation Message-ID: <20070801022418.GR31489@sgi.com> References: <20070725092445.GT12413810@sgi.com> <46A7226D.8080906@sgi.com> <46A8DF7E.4090006@agami.com> <20070726233129.GM12413810@sgi.com> <46A94963.7000103@agami.com> <20070727065930.GT12413810@sgi.com> <46AFD88E.9070403@agami.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <46AFD88E.9070403@agami.com> Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com List-Id: xfs To: Michael Nishimoto Cc: David Chinner , markgw@sgi.com, xfs-dev , xfs-oss On Tue, Jul 31, 2007 at 05:49:18PM -0700, Michael Nishimoto wrote: > David Chinner wrote: > >On Thu, Jul 26, 2007 at 06:24:51PM -0700, Michael Nishimoto wrote: > > > When a CRC error is found, your suggestion is correct. Recovery > > > should backup and process only completely good log records. The code > > > backs up in this same fashion when it encounters a region of > > > missing sector updates because of the async nature of log > > > writes and disk caches. ...... > >IOWs, I think that if we come across a bad CRC in a log record we > >can replay up to that point but we've still got to abort the mount > >and ask the user to run repair.... ..... > You are correct. I didn't need the example -- just a reminder that > CRC errors can occur anywhere, not just in the last 8 log-buffers > worth of data. ;-) > > And yes, repair is necessary now because we are no longer replaying > a transaction which we thought was committed. Ok, I'll make the mount replay up to previous good record and then abort with -EUCLEAN. > I have a request. Can this be made a mkfs.xfs option, so it can be > disabled? It's a definite possibility, because.... > What are your plans for adding CRCs to other metadata objects? Other objects will require some on-disk format change, and that will require a feature bit to be set. Basically, we can add CRCs to individual inodes without a version bump as we have some empty space in the v2 inode core. That will make v2 inodes the default, but we already have a superblock bit for that and all versions of linux support v2 inodes so there is no issue there. This will require userspace tool changes, though, because tools like repair will revert v2 inodes back to v1 format if the are not using project id's or the link count fits into 16 bits..... I haven't looked at great depth into other structures in terms of implementation details. I know that if we use a 16 bit CRC on directories we can get away without a on-disk format change as the xfs_da_blkinfo structure has 16 bits of padding. However, given that directory block size can reach 64k, a CRC16 check is really only capable of single bit error detection. Hence I think we really need CRC32 here which means an on-disk format change. For the other types of btree blocks we will also need on-disk changes as there is no padding we can use to hold a CRC in the headers. Adding CRCs to the SB, AGF, AGI and AFGL is trivial but really needs a feature bit to say they are there. So, yes, this sort of change will require a feature bit and that means it would be a mkfs option. I'd prefer to have all the changes ready to go before releasing them into the wild so that we only need a single feature bit, though there is the possibility of separating the directory changes out into "version 3" directories. Conversion of existing filesystems is harder because of the repacking of structures required. if we are going to convert filesystems, I would expect it to be done offline by xfs_repair (Hi Barry!). Given that it will already need tobe modified to validate CRCs, I think this is probably the best approach to this problem right now. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group