From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from relay.sgi.com (relay1.corp.sgi.com [137.38.102.111]) by oss.sgi.com (Postfix) with ESMTP id 7A3F47F47 for ; Mon, 17 Aug 2015 11:47:55 -0500 (CDT) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by relay1.corp.sgi.com (Postfix) with ESMTP id 6A5E68F8040 for ; Mon, 17 Aug 2015 09:47:55 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id l0zkMWplSZAYT6Kk (version=TLSv1 cipher=AES256-SHA bits=256 verify=NO) for ; Mon, 17 Aug 2015 09:47:54 -0700 (PDT) Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id A602D461EB for ; Mon, 17 Aug 2015 16:47:53 +0000 (UTC) Received: from bfoster.bfoster (dhcp-41-174.bos.redhat.com [10.18.41.174]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t7HGlr0F018657 for ; Mon, 17 Aug 2015 12:47:53 -0400 From: Brian Foster Subject: [PATCH 14/13] xfs: swap leaf buffer into path struct atomically during path shift Date: Mon, 17 Aug 2015 12:47:52 -0400 Message-Id: <1439830072-61117-1-git-send-email-bfoster@redhat.com> In-Reply-To: <1439233309-19959-1-git-send-email-bfoster@redhat.com> References: <1439233309-19959-1-git-send-email-bfoster@redhat.com> List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: xfs-bounces@oss.sgi.com Sender: xfs-bounces@oss.sgi.com To: xfs@oss.sgi.com The node directory lookup code uses a state structure that tracks the path of buffers used to search for the hash of a filename through the leaf blocks. When the lookup encounters a block that ends with the requested hash, but the entry has not yet been found, it must shift over to the next block and continue looking for the entry (i.e., duplicate hashes could continue over into the next block). This shift mechanism involves walking back up and down the state structure, replacing buffers at the appropriate btree levels as necessary. When a buffer is replaced, the old buffer is released and the new buffer read into the active slot in the path structure. Because the buffer is read directly into the path slot, a buffer read failure can result in setting a NULL buffer pointer in an active slot. This throws off the state cleanup code in xfs_dir2_node_lookup(), which expects to release a buffer from each active slot. Instead, a BUG occurs due to a NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 00000000000001e8 IP: [] xfs_trans_brelse+0x2a3/0x3c0 [xfs] ... RIP: 0010:[] [] xfs_trans_brelse+0x2a3/0x3c0 [xfs] ... Call Trace: [] xfs_dir2_node_lookup+0xa6/0x2c0 [xfs] [] xfs_dir_lookup+0x1ac/0x1c0 [xfs] [] xfs_lookup+0x91/0x290 [xfs] [] xfs_vn_lookup+0x73/0xb0 [xfs] [] lookup_real+0x1d/0x50 [] path_openat+0x91e/0x1490 [] do_filp_open+0x89/0x100 ... This has been reproduced via a parallel fsstress and filesystem shutdown workload in a loop. The shutdown triggers the read error in the aforementioned codepath and causes the BUG in xfs_dir2_node_lookup(). Update xfs_da3_path_shift() to update the active path slot atomically with respect to the caller when a buffer is replaced. This ensures that the caller always sees the old or new buffer in the slot and prevents the NULL pointer dereference. Signed-off-by: Brian Foster --- This is just another shutdown/error handling issue I've run into with the same testing associated with all of the other fixes. I'm tacking it on to the end of this series... Brian fs/xfs/libxfs/xfs_da_btree.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index 3264d81..04a3765 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -1822,6 +1822,7 @@ xfs_da3_path_shift( struct xfs_da_args *args; struct xfs_da_node_entry *btree; struct xfs_da3_icnode_hdr nodehdr; + struct xfs_buf *bp; xfs_dablk_t blkno = 0; int level; int error; @@ -1865,21 +1866,27 @@ xfs_da3_path_shift( * same depth we were at originally. */ for (blk++, level++; level < path->active; blk++, level++) { + struct xfs_buf **bpp = &blk->bp; + /* - * Release the old block. - * (if it's dirty, trans won't actually let go) + * Read the next child block into a local buffer. */ - if (release) - xfs_trans_brelse(args->trans, blk->bp); + error = xfs_da3_node_read(args->trans, dp, blkno, -1, &bp, + args->whichfork); + if (error) + return error; /* - * Read the next child block. + * Release the old block (if it's dirty, the trans doesn't + * actually let go) and swap the local buffer into the path + * structure. This ensures failure of the above read doesn't set + * a NULL buffer in an active slot in the path. */ + if (release) + xfs_trans_brelse(args->trans, blk->bp); blk->blkno = blkno; - error = xfs_da3_node_read(args->trans, dp, blkno, -1, - &blk->bp, args->whichfork); - if (error) - return error; + *bpp = bp; + info = blk->bp->b_addr; ASSERT(info->magic == cpu_to_be16(XFS_DA_NODE_MAGIC) || info->magic == cpu_to_be16(XFS_DA3_NODE_MAGIC) || -- 2.1.0 _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs