From: Bill Kendall <wkendall@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@oss.sgi.com, Dann Frazier <dannf@debian.org>,
Michael Lueck <mlueck@lueckdatasystems.com>
Subject: Re: xfsdump SGI_FS_BULKSTAT errno = 22, how could this IRIX bug get into Ubuntu 10.04 Lucid between kernels 2.6.32-27 and 2.6.32-26?
Date: Tue, 08 Feb 2011 19:24:45 -0600
Message-ID: <4D51ECDD.1030606@sgi.com>
In-Reply-To: <20110207220421.GD2559@dastard>
On 02/07/2011 04:04 PM, Dave Chinner wrote:
> On Mon, Feb 07, 2011 at 03:42:28PM -0600, Bill Kendall wrote:
>> On 02/07/2011 03:23 PM, Dave Chinner wrote:
>>> On Mon, Feb 07, 2011 at 02:55:36PM -0600, Bill Kendall wrote:
>>>> On 02/04/2011 02:49 PM, Dave Chinner wrote:
>>>>> On Fri, Feb 04, 2011 at 09:12:53AM -0500, Michael Lueck wrote:
>>>>>> Dave Chinner wrote:
>>>>>>> Ok, so xfsdump is seeing a short bulkstat, then an EINVAL returned
>>>>>>> from the next bulkstat. That's not a race condition, and makes me
>>>>>>> think you have some kind of on-disk corruption.
>>>>>>
>>>>>> Very odd that some kind of on-disk corruption is suddenly causing
>>>>>> xfsdump problems starting with Ubuntu 10.04 (Lucid) kernel
>>>>>> 2.6.32-27 and persisting in 2.6.32-28.
>>>>>
>>>>> Not really. The newer kernels have code in them that does more
>>>>> validity checks than previous kernels, so older kernels would have
>>>>> erroneously and silently returned unlinked files to xfsdump and have
>>>>> them backed up. IOWs, you'd never notice such a corruption with
>>>>> xfsdump. On the new kernel, xfsdump gets an EINVAL error to such
>>>>> occurrences, which it should have in the first place.
>>>>>
>>>>>> And there is one other person who confirmed this xfsdump problem
>>>>>> running Lucid with kernel 2.6.32-28. They reported their "me too"
>>>>>> in the Ubuntu bug tracker.
>>>>>>
>>>>>> Could it be that 2.6.32-26 and prior managed to write something to
>>>>>> disk corrupted, and the newer code is tripping on it?
>>>>>
>>>>> That's what I'm trying to find out. Or it could be something as
>>>>> simple as your disk has had an undetected bit error that has flipped
>>>>> a bit in the inode allocation btree.
>>>>>
>>>>
>>>> Hi Dave,
>>>>
>>>> I am able to reproduce this on a system running Ubuntu 10.04
>>>> (2.6.32-28). I took a metadump of the filesystem and moved it to
>>>> a system running 10.10 (2.6.35-25), and was able to successfully
>>>> dump it there. Likewise it dumps fine on 2.6.38-rc1. So this
>>>> suggests an issue with the Ubuntu 10.04 kernel.
>>>
>>> 2.6.35 hasn't had the untrusted inode lookup patches back ported to
>>> it, so it's no surprise that it isn't having problems - it's just
>>> like the older 2.6.32 kernels.
>>
>> I thought it landed in 2.6.35 and then a regression was fixed in
>> 2.6.36. The untrusted inode lookup changes are referenced here:
>> http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.35
>
> My bad, I just checked the regression fix. I have no idea if it got
> back ported to 2.6.35-stable or not - it probably didn't judging by
> your results.....
>
>>> Hmmm, can you find out if there is any specific pattern to the inode
>>> numbers that are returning EINVAL? Maybe the inode allocbt freespace
>>> record checks aren't quite correct in the backport (like the
>>> original bogus alignment assumption I made).
>>
>> I'll take a look.

The failing bulkstats, at least the ones I've checked so far, are
hitting this path in xfs_bulkstat():

	/*
	 * Skip if this inode is free.
	 */
	if (XFS_INOBT_MASK(chunkidx) & irbp->ir_free) {
		lastino = ino;
		continue;
	}
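
For anyone not familiar with that check, here is a minimal user-space sketch of what the test does. It assumes ir_free is a 64-bit mask with one bit per inode in a 64-inode chunk and that XFS_INOBT_MASK(i) is a single bit shifted left by i. The names inofree_t, INOBT_MASK, INODES_PER_CHUNK and count_allocated() are hypothetical stand-ins for illustration, not kernel code, and the sketch only shows the skip logic, not why the backport misbehaves.

```c
#include <stdint.h>

/* Hypothetical user-space stand-ins for the kernel definitions. */
typedef uint64_t inofree_t;			/* one bit per inode in a chunk */
#define INODES_PER_CHUNK	64
#define INOBT_MASK(i)		((inofree_t)1 << (i))	/* assumed to mirror XFS_INOBT_MASK() */

/*
 * Walk one inode chunk the way the quoted loop does: skip every inode
 * whose bit is set in the free mask, and count the ones that remain.
 */
static int count_allocated(inofree_t ir_free)
{
	int chunkidx;
	int allocated = 0;

	for (chunkidx = 0; chunkidx < INODES_PER_CHUNK; chunkidx++) {
		if (INOBT_MASK(chunkidx) & ir_free)
			continue;	/* free inode: nothing to stat */
		allocated++;
	}
	return allocated;
}
```

With this sketch, a mask of 0 means every inode in the chunk is visited, and a mask with all bits set means the whole chunk is skipped.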

The backport of the 4 untrusted inode lookup commits looks okay to
me; however, I think they depend on commit
7dce11dbac54fce777eea0f5fb25b2694ccd7900 (xfs: always use iget in
bulkstat), which was checked in shortly before the untrusted
inode lookup changes. When that commit is added to the Ubuntu
2.6.32-28 kernel, xfsdump runs fine on the 2 filesystems of mine
that were exhibiting the problem.

Bill