From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (Postfix) with ESMTP id A07B029DFA for ; Thu, 25 Apr 2013 17:42:43 -0500 (CDT) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by relay2.corp.sgi.com (Postfix) with ESMTP id 80A7930407A for ; Thu, 25 Apr 2013 15:42:40 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.151]) by cuda.sgi.com with ESMTP id wWolge7jBLHDJvMj (version=TLSv1 cipher=AES256-SHA bits=256 verify=NO) for ; Thu, 25 Apr 2013 15:42:39 -0700 (PDT) Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 25 Apr 2013 16:42:38 -0600 Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com [9.17.195.228]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id D61011FF003F for ; Thu, 25 Apr 2013 16:36:44 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r3PMflbS113194 for ; Thu, 25 Apr 2013 16:41:47 -0600 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id r3PMiebF010308 for ; Thu, 25 Apr 2013 16:44:40 -0600 Subject: Re: [PATCH] xfs: shutdown filesystem if xfs_perag_get fails From: Chandra Seetharaman In-Reply-To: <20130423204956.GM10481@dastard> References: <20130419204102.736961610@sgi.com> <20130421174107.007313126@sgi.com> <5174603A.8030208@sandeen.net> <51753EDE.6000301@sgi.com> <51754A13.5000808@sandeen.net> <5175532B.3050509@sgi.com> <20130422233033.GK30622@dastard> <51769111.3050103@sgi.com> <1366732475.3762.32402.camel@chandra-dt.ibm.com> <20130423204956.GM10481@dastard> Date: Thu, 25 Apr 2013 17:41:46 -0500 Message-ID: <1366929706.4098.6.camel@chandra-dt.ibm.com> Mime-Version: 1.0 Reply-To: sekharan@us.ibm.com List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: xfs-bounces@oss.sgi.com Sender: xfs-bounces@oss.sgi.com To: Dave Chinner Cc: Eric Sandeen , Mark Tinguely , xfs@oss.sgi.com In which case something along the lines of --- diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 3806088..3fb2fa6 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -203,7 +203,13 @@ xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno) if (pag) { ASSERT(atomic_read(&pag->pag_ref) >= 0); ref = atomic_inc_return(&pag->pag_ref); - } + } else + /* + * xfs_perag_get() is called with invalid agno, + * which cannot happen. This indicates a problem + * in the calling code. + */ + BUG(); rcu_read_unlock(); trace_xfs_perag_get(mp, agno, ref, _RET_IP_); return pag; -------- would be useful ?. Since we have a NULL pag, we will trip somewhere else. At least with this, there is a pointer to the debugger/sysadmin about where/what to look for (may be with more valuable/correct comment than above). On Wed, 2013-04-24 at 06:49 +1000, Dave Chinner wrote: > On Tue, Apr 23, 2013 at 10:54:35AM -0500, Chandra Seetharaman wrote: > > On Tue, 2013-04-23 at 08:48 -0500, Mark Tinguely wrote: > > > On 04/22/13 18:30, Dave Chinner wrote: > > > > On Mon, Apr 22, 2013 at 10:11:39AM -0500, Mark Tinguely wrote: > > > >> #6 [ffff880135603980] _xfs_buf_find at ffffffffa01a7fef [xfs] > > > >> #7 [ffff8801356039f0] xfs_buf_get at ffffffffa01a824a [xfs] > > > >> #8 [ffff880135603a30] xfs_buf_read at ffffffffa01a83a4 [xfs] > > > >> #9 [ffff880135603a60] xlog_recover_inode_pass2 at ffffffffa0193629 [xfs] > > > > > > > > So it's the same problem as this bug fix addresses: > > > > > > > > commit 10616b806d1d7835b1d23b8d75ef638f92cb98b6 > > > > Author: Dave Chinner > > > > Date: Mon Jan 21 23:53:52 2013 +1100 > > > > > > > > xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end > > > > > > > > When _xfs_buf_find is passed an out of range address, it will fail > > > > to find a relevant struct xfs_perag and oops with a null > > > > dereference. This can happen when trying to walk a filesystem with a > > > > metadata inode that has a partially corrupted extent map (i.e. the > > > > block number returned is corrupt, but is otherwise intact) and we > > > > try to read from the corrupted block address. > > > > > > > > In this case, just fail the lookup. If it is readahead being issued, > > > > it will simply not be done, but if it is real read that fails we > > > > will get an error being reported. Ideally this case should result > > > > in an EFSCORRUPTED error being reported, but we cannot return an > > > > error through xfs_buf_read() or xfs_buf_get() so this lookup failure > > > > may result in ENOMEM or EIO errors being reported instead. > > > > > > > > Signed-off-by: Dave Chinner > > > > Reviewed-by: Brian Foster > > > > Reviewed-by: Ben Myers > > > > Signed-off-by: Ben Myers > > > > > > > >> The recovery value is bad and is a problem on its own, but XFS does > > > >> not verify the validity of ag number when doing a xfs_perag_get(). > > I'd be interested to know why the inode in recovery is bad - is this > on a kernel that CRCs the log records? Or a result of some other bug > or hardware corruption? i.e. xfs_perag_get is not the problem here, > it's a corruption of a trusted inode number and we failed to detect > that corruption.... > > > > > Right, that's what the above fix does, but it can't be done on older > > > > kernels because grwofs relies on being able to get buffers beyond > > > > the existing filesystem limits... > > > > > > Thank-you, that make sense. > > > > > > I still do not like assuming xfs_perag_get() will always return a valid > > > perag pointer. > > > > I second that. > > > > Is there any reason we should _not_ check the return value from > > xfs_perag_get() for NULL ? > > Yes. The input AG should already be bounds checked before the perag > is looked up. If we are asking for an invalid AG, then the bug is not > in xfs_perag_get(), it is in the code that is calling it. i.e. error > checking the xfs_perag_get() function is a band-aid for improper > object validation, not a solution to the problem. > > That is, this function was designed to be extremely low overhead and > only to be handed validated data. If it is only handed validated > data, then it is guaranteed to return a valid per-ag structure, > and therefore error checking the return value is not necessary. > > Because xfs_perag_get is not designed to handle untrusted data it is > up to the calling code to first validate the AGNO that is passed to > xfs_perag_get(). If we aren't first validating the object that the > AGNO is derived from, then the calling code has failed to validate > it's inputs sufficiently, and lots of other things can go wrong (not > just the xfs_perag_get() call). > > For example, the above commit is a catchall for bad block numbers > being looked up in extent records. It was a quick fix, not a > targeted fix for the reported problems. For bad block numbers in > extents, we should be doing is validating block numbers when they > are looked up are within the filesystem bounds (eg. inside > xfs_bmapi_read/xfs_bmapi_write) so that a bad block number is caught > at lookup time, not at IO time. We only do that for extents that > point to block 0. > > Cheers, > > Dave. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs