From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: with ECARTIS (v1.0.0; list xfs); Thu, 26 Jul 2007 23:59:37 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l6R6xWbm001845 for ; Thu, 26 Jul 2007 23:59:34 -0700 Date: Fri, 27 Jul 2007 16:59:30 +1000 From: David Chinner Subject: Re: RFC: log record CRC validation Message-ID: <20070727065930.GT12413810@sgi.com> References: <20070725092445.GT12413810@sgi.com> <46A7226D.8080906@sgi.com> <46A8DF7E.4090006@agami.com> <20070726233129.GM12413810@sgi.com> <46A94963.7000103@agami.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <46A94963.7000103@agami.com> Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com List-Id: xfs To: Michael Nishimoto Cc: David Chinner , markgw@sgi.com, xfs-dev , xfs-oss On Thu, Jul 26, 2007 at 06:24:51PM -0700, Michael Nishimoto wrote: > The log checksum code has not been used since the > development phase of xfs. It did work at one point because I > remember using it and then decided to disable it and use just > the current cycle stamping technique. The checksum code was > just advisory, so I could see if it ever occurred during > development. > > When a CRC error is found, your suggestion is correct. Recovery > should backup and process only completely good log records. The code > backs up in this same fashion when it encounters a region of > missing sector updates because of the async nature of log > writes and disk caches. Yes, but that's usually only in the last 8 log-buffers worth of the the log that the hole exists in (i.e. 256k by default). However, if the tail block has a CRC error, we've got to through away the entire log and that, like zeroing a dirty log from xfs_repair, generally results in a corrupted filesystem image. An example of where this could be a problem is reusing a just-freed extent. Before reusing it we force the log to get the transaction on disk and rely on log replay to ensure that the block is freed in the event of a crash. We then go and write over the contents of the block. If that log transaction is not replayed (that freed the extent) then we've overwritten the previous contents of that extent and so the "current" contents of the extent after log replay are wrong. IOWs, I think that if we come across a bad CRC in a log record we can replay up to that point but we've still got to abort the mount and ask the user to run repair.... > At this point, I'm not convinced that xfs needs to do CRCs on > the xfs log because the size of an xfs log is relatively small. Sure, but the same argument can be made about the superblock, or an AGF and a directory block. That doesn't mean that they'll never have an error. Statistically speaking, the log contains that blocks in the filesystem we most frequently do I/O to, so it's the most likely region to see an I/O path induced bit error. If we see one on recovery...... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group