From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from relay.sgi.com (relay3.corp.sgi.com [198.149.34.15])
	by oss.sgi.com (Postfix) with ESMTP id A5E497F37
	for ; Mon, 14 Sep 2015 14:18:16 -0500 (CDT)
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.157.11])
	by relay3.corp.sgi.com (Postfix) with ESMTP id 419ADAC006
	for ; Mon, 14 Sep 2015 12:18:13 -0700 (PDT)
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
	by cuda.sgi.com with ESMTP id yGMTgl0sRQOadKZw
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
	for ; Mon, 14 Sep 2015 12:18:12 -0700 (PDT)
Date: Mon, 14 Sep 2015 15:18:09 -0400
From: Brian Foster
Subject: Re: [PATCH 03/13] xfs_repair: make CRC checking consistent in path verification
Message-ID: <20150914191809.GC34083@bfoster.bfoster>
References: <1441827251-13128-1-git-send-email-sandeen@sandeen.net>
	<1441827251-13128-4-git-send-email-sandeen@sandeen.net>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <1441827251-13128-4-git-send-email-sandeen@sandeen.net>
List-Id: XFS Filesystem from SGI
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: Eric Sandeen
Cc: xfs@oss.sgi.com

On Wed, Sep 09, 2015 at 02:34:01PM -0500, Eric Sandeen wrote:
> verify_da_path and verify_dir2_path both take steps to
> re-compute the CRC of the block if it otherwise looks
> ok and no other changes are needed. They do this inside
> a loop, but the approach differs; verify_da_path expects
> its caller to check the first buffer prior to the loop,
> and verify_dir2_path expects its caller to check the last
> buffer after the loop.
>
> Make this consistent by semi-arbitrarily choosing to make
> verify_da_path (and its caller) match the method used by
> verify_dir2_path, and check the last buffer after the
> loop is done.
>

The code here seems OK, but I don't think the commit log describes what
the code is doing. I could also just have misread it. This code is
recursive and hairy enough as it is...

As I follow it, we init the cursor to the leftmost descendant in
traverse_int_dablock(). I don't see any crc verification in there. We
get into process_leaf_attr_level() and it reads and walks through the
leaf blocks, doing the crc check of each one. Each leaf block we verify
corresponds to an entry in the node block, so verify_da_path() walks
the node entry index along until we slide over to the next node block.
At that point, we verify the crc of the current node buffer, write it
out and replace that level in the cursor with the next one. Eventually
we hit the end of the leaf block chain and call verify_final_da_path().

If I'm following all that correctly, it looks to me like before this
change we would never have verified the crc of the first node block. At
least, I can't find where that might happen. After this change, that
first node block crc is checked immediately before it's written out.
But now that the post-read check is gone, I don't see where the last
node block is crc-checked. Perhaps this should be checked in
verify_final_da_path()..?

Taking a quick look at the dir2 side, it looks like it checks the crc
in traverse_int_dir2block() and bails out on -EFSBADCRC (presumably to
rebuild the whole thing..?). As noted above, it does the pre-write crc
check and I don't see where it would check the final node block(s)
either.

Hmm, this code is hairy enough that it might be worth running through a
debugger or doing a targeted corruption to see if I'm missing
something...
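To make the ordering concern concrete, here is a toy model of the node
block walk as I read it. It is only a sketch: struct toy_block, both
walk_* functions, first_unexamined() and the check_in_final flag are
made up for illustration and are not xfs_repair code. All it tracks is
which sibling blocks ever have their CRC state looked at under each
ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one node block in the sibling chain (not a real xfs type). */
struct toy_block {
	bool crc_examined;	/* did the walk look at this block's CRC state? */
};

/*
 * Pre-patch ordering as described above: when the cursor slides to the
 * next sibling, the CRC state is examined on the block just read.
 * The block the cursor started on is never the "block just read".
 */
void walk_check_after_read(struct toy_block *blk, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i + 1 < n; i++)
		blk[i + 1].crc_examined = true;
}

/*
 * Post-patch ordering: the CRC state is examined on the current block
 * right before it is written out and replaced by the next sibling.
 * check_in_final models a hypothetical extra check at the end of the
 * chain (the suggestion for verify_final_da_path()).
 */
void walk_check_before_write(struct toy_block *blk, size_t n,
			     bool check_in_final)
{
	assert(n > 0);
	for (size_t i = 0; i + 1 < n; i++)
		blk[i].crc_examined = true;
	if (check_in_final)
		blk[n - 1].crc_examined = true;
}

/* Return the index of the first block never examined, or n if none. */
size_t first_unexamined(const struct toy_block *blk, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!blk[i].crc_examined)
			return i;
	return n;
}
```

With a chain of three blocks, the after-read walk leaves block 0
unexamined and the before-write walk leaves block 2 unexamined unless
check_in_final is set. That matches my reading above, assuming the model
captures the real control flow, which is exactly the part I'm not sure
about.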

Brian

> Signed-off-by: Eric Sandeen
> Signed-off-by: Eric Sandeen
> ---
>  repair/attr_repair.c |   24 ++++++++++++++----------
>  1 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/repair/attr_repair.c b/repair/attr_repair.c
> index f29a5bd..aba0782 100644
> --- a/repair/attr_repair.c
> +++ b/repair/attr_repair.c
> @@ -606,6 +606,14 @@ verify_da_path(xfs_mount_t *mp,
>  		ASSERT(cursor->level[this_level].dirty == 0 ||
>  			(cursor->level[this_level].dirty && !no_modify));
>
> +		/*
> +		 * If block looks ok but CRC didn't match, make sure to
> +		 * recompute it.
> +		 */
> +		if (!no_modify &&
> +		    cursor->level[this_level].bp->b_error == -EFSBADCRC)
> +			cursor->level[this_level].dirty = 1;
> +
>  		if (cursor->level[this_level].dirty && !no_modify)
>  			libxfs_writebuf(cursor->level[this_level].bp, 0);
>  		else
> @@ -618,14 +626,6 @@ verify_da_path(xfs_mount_t *mp,
>  		cursor->level[this_level].hashval =
>  					be32_to_cpu(btree[0].hashval);
>  		entry = cursor->level[this_level].index = 0;
> -
> -		/*
> -		 * We want to rewrite the buffer on a CRC error seeing as it
> -		 * contains what appears to be a valid node block, but only if
> -		 * we are fixing errors.
> -		 */
> -		if (bp->b_error == -EFSBADCRC && !no_modify)
> -			cursor->level[this_level].dirty++;
>  	}
>  	/*
>  	 * ditto for block numbers
> @@ -1363,8 +1363,6 @@ process_leaf_attr_level(xfs_mount_t *mp,
>  				da_bno, dev_bno, ino);
>  			goto error_out;
>  		}
> -		if (bp->b_error == -EFSBADCRC)
> -			repair++;
>
>  		leaf = bp->b_addr;
>  		xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &leafhdr, leaf);
> @@ -1419,6 +1417,12 @@ process_leaf_attr_level(xfs_mount_t *mp,
>  		}
>
>  		current_hashval = greatest_hashval;
> +		/*
> +		 * If block looks ok but CRC didn't match, make sure to
> +		 * recompute it.
> +		 */
> +		if (!no_modify && bp->b_error == -EFSBADCRC)
> +			repair++;
>
>  		if (repair && !no_modify)
>  			libxfs_writebuf(bp, 0);
> --
> 1.7.1
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs