From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: Ben Myers <bpm@sgi.com>, Stan Hoeppner <stan@hardwarefreak.com>,
Markus Trippelsdorf <markus@trippelsdorf.de>,
xfs@oss.sgi.com
Subject: Re: [Bisected] Corruption of root fs during git bisect of drm system hang
Date: Sat, 20 Jul 2013 13:18:40 +1000
Message-ID: <20130720031840.GA11674@dastard>
In-Reply-To: <51E9AB80.4000700@sgi.com>
On Fri, Jul 19, 2013 at 04:11:28PM -0500, Mark Tinguely wrote:
> On 07/19/13 07:22, Markus Trippelsdorf wrote:
> >
> >I've bisected this issue to the following commit:
> >
> > commit cca9f93a52d2ead50b5da59ca83d5f469ee4be5f
> > Author: Dave Chinner<dchinner@redhat.com>
> > Date: Thu Jun 27 16:04:49 2013 +1000
> >
> > xfs: don't do IO when creating an new inode
> >
> >Reverting this commit on top of the Linus tree "solves" all problems for
> >me. IOW I no longer loose my KDE and LibreOffice config files during a
> >crash. Log recovery now works fine and xfs_repair shows no issues.
> >
> >So users of 3.11.0-rc1 beware. Only run this version if you have
> >up-to-date backups handy.
> >
>
> I reviewed the above patch and liked it but, I think I recreated the
> above mentioned problem with a simple script:
>
> cp /root/.bash_history /root/.lesshst /root/.pwclientrc
> /root/.viminfo /root/.bash_profile /root/.lesshst.YCJCDz
> /root/.quiltrc /somexfsdir
> sync
> echo 'c' > /proc/sysrq-trigger
> .... reboot, remount ...
> cd /somexfsdir
I've only reproduced the problem *once* with this method - the first
time I tried. Then I mkfs'd the filesystem rather than repairing it
and I haven't been able to reproduce it since. So the problem is
far more subtle than just copying some files, running sync and
crashing the machine - there's some kind of initial or timing
condition that we are missing that triggers it...
The one interesting thing I noticed was that the generation number
in the crash case was non-zero. That's an important piece of
information, and:
> # cat .bash_history
> cat: .bash_history: No such file or directory
>
> xfs_db> inode 131
> xfs_db> p
> core.magic = 0x494e
> core.mode = 0
That's a "free" inode, which is why XFS considers it invalid when
the lookup sees it.
> core.gen = 3707503345
You saw it as well, Mark.
That means it has actually been allocated and written to disk at
some point in time. That is, inodes allocated by mkfs in the root
inode chunk have a generation number of zero. For this inode to have
a non-zero generation number, it must have been written after
allocation - either before the sync or during log recovery.
Unfortunately, without the 'xfs_logprint -t -i <dev>' output from
prior to mounting the filesystem which demonstrates the problem, I
can't tell if the issue is a recovery problem or something that
happened before the crash....
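As a rough sketch of how that output could be captured next time,
something along these lines, run after the reboot but before the
first mount, should do. The helper name capture_log and the paths in
the usage comment are placeholders made up for illustration; only the
xfs_logprint invocation itself comes from this thread.

```shell
# Sketch: save the transactional log dump before log recovery can run.
# capture_log and both paths are hypothetical, not taken from the report.
capture_log() {
    dev="$1"    # block device holding the affected filesystem
    out="$2"    # file to save the dump to

    # Mounting replays the log, so once the device is mounted the
    # pre-recovery state is gone. Refuse to continue in that case.
    if grep -q "^$dev " /proc/mounts 2>/dev/null; then
        echo "error: $dev is mounted; log recovery has already run" >&2
        return 1
    fi

    # -t: transactional view, -i: print inode information
    xfs_logprint -t -i "$dev" > "$out"
}

# Usage (placeholder paths):
#   capture_log /dev/sdX1 /tmp/logprint-before-mount.txt
```

The mounted check is only a simple string match against /proc/mounts,
so it will miss a device mounted under a different name.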
> revert the above commit and the problem goes away.
....
> core.mode = 0100600
Not a free inode...
> core.gen = 0
And, importantly, the generation number is zero, as would be
expected for an inode in the root chunk.
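The two cases above can be told apart mechanically. Here is a minimal
sketch, assuming the raw "name = value" lines that xfs_db prints
(without the "> " quoting) arrive on stdin. The helper name
classify_inode and the verdict strings are made up for illustration.

```shell
# Sketch: summarise an inode from xfs_db "p" output read on stdin.
# Only core.mode and core.gen are examined; everything else is ignored.
classify_inode() {
    awk '
        $1 == "core.mode" { mode = $3 }
        $1 == "core.gen"  { gen  = $3 }
        END {
            # Without both fields there is nothing to conclude.
            if (mode == "" || gen == "") { print "unknown"; exit }
            # A core.mode of 0 means the on-disk inode is free.
            state = (mode + 0 == 0) ? "free" : "allocated"
            # A non-zero generation means the inode was written out
            # at some point after mkfs initialised the chunk.
            hist = (gen + 0 == 0) ? "gen-zero" : "gen-nonzero"
            print state, hist
        }
    '
}

# Usage (assumes read-only access to the unmounted device):
#   xfs_db -r -c "inode 131" -c "p core.mode core.gen" <dev> | classify_inode
```

With the values quoted above, the crash case comes out as "free
gen-nonzero" and the reverted case as "allocated gen-zero".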
FWIW, if you can reproduce this on demand, Mark, it would be worth
seeing if mounting with "-o ikeep" makes the problem go away, as this
optimisation is only used on filesystems that are configured to free
inode chunks...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs