From: Andy Isaacson <adi@hexapodia.org>
To: Theodore Tso <tytso@mit.edu>,
	linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: hard lockup, followed by ext4_lookup: deleted inode referenced: 524788
Date: Tue, 29 Sep 2009 09:12:50 -0700
Message-ID: <20090929161250.GX12922@hexapodia.org>
In-Reply-To: <20090929031308.GB24383@mit.edu>

On Mon, Sep 28, 2009 at 11:13:08PM -0400, Theodore Tso wrote:
> What this indicates to me is that an inode table block was written to
> the wrong location on disk.  In fact, given the large number of inodes
> involved, it looks like large numbers of inode table blocks were
> written to the wrong location on disk.

Aha, sounds like an excellent theory.
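
To make the theory concrete, here is a rough sketch of how an inode
number maps to a block inside its group's inode table, using the usual
ext2/3/4 layout arithmetic.  The geometry values in the usage lines
(8192 inodes per group, 256-byte inodes, 4096-byte blocks) are
assumptions for illustration only; the real values for this filesystem
are in the dumpe2fs superblock summary.  The function name is made up.

```python
def locate_inode(ino, inodes_per_group, inode_size, block_size):
    """Return (group, block_in_table, byte_offset_in_block) for an inode.

    Inode numbers start at 1, so the zero-based index is ino - 1.
    """
    index = ino - 1
    group = index // inodes_per_group        # which block group owns it
    slot = index % inodes_per_group          # position within that group
    byte_off = slot * inode_size             # offset into the inode table
    return group, byte_off // block_size, byte_off % block_size


if __name__ == "__main__":
    # Assumed geometry, NOT read from the damaged filesystem.
    print(locate_inode(524788, inodes_per_group=8192,
                       inode_size=256, block_size=4096))
```

With that assumed geometry one table block holds 16 inodes, so a single
misdirected block would damage 16 consecutive inode numbers at once,
which would fit the long runs of bad inodes.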

> I'm surprised by how many inode table blocks apparently had gotten
> mis-directed.  Almost certainly there must have been some kind of
> hardware failure that must have triggered this.  I'm not sure what
> caused it, but it does seem like your filesystem has been toasted
> fairly badly.

As I said, the machine hung hard while doing a bunch of writes to a USB
thumbdrive and a kernel compile on sda1.  It could be hardware, but I've
been using this laptop as my primary test box for several months and
it's been fairly reliable (as reliable as git-of-the-day is, pretty
much).

I'll run memtest86 and check SMART.

Note that it is running DMAR (the Intel VT-d IOMMU implementation), so
it could be that a DMA got messed up.  Since the logs didn't make it, I
don't know if DMAR reported any DMA protection faults at the time of
failure.  The DMAR on this box has had some issues in the past which
seem to be fixed, but ...

> At this point my advice to you would be to try to recover as much data
> from the disk as you can, and to *not* try to run fsck or mount the

Oh, all the data is well backed-up; this is a seriously bleeding-edge
box.

I've taken a complete image of /dev/sda1 and will be reinstalling it.
The image is from after the kernel remounted / RO.
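
For anyone repeating this, a minimal sketch of copying a device or
image file while recording a SHA-256 digest, so later experiments can
be checked against the untouched copy.  This is not how the image above
was made; the function name and the paths in the comment are
hypothetical.

```python
import hashlib


def copy_with_digest(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path in chunks; return the SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:          # end of file or device
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage (needs read access to the device):
#   print(copy_with_digest("/dev/sda1", "/mnt/backup/sda1.img"))
```

This stops at the first read error, so it only suits a source that
still reads cleanly.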

> disk using dd to a backup hard drive first.  If you're really curious
> we could try to look at the dumpe2fs output and see if we can find the
> pattern of what might have caused so many misdirected writes, but
> there's no guarantee that we would be able to find the definitive root
> cause, and from a recovery perspective, it's probably faster and less
> risky to reinstall your system disk from scratch.

I would like to get as close to root cause as possible.  I have a
filesystem image copied away and I'll be attempting to repro the
failure; this is a test system for a large deployment, so I don't want
any issues lurking. :)

Let me know what debug commands you'd like to run.  dumpe2fs output is
at http://web.hexapodia.org/~adi/tmp/dumpe2fs.out
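
As a starting point for pattern-hunting, a small sketch that pulls each
group's inode table block range out of dumpe2fs text.  It assumes the
per-group section has the usual "Group N:" and "Inode table at A-B"
lines; the exact format can differ between e2fsprogs versions, and the
sample text in the usage lines is invented, not taken from the output
linked above.

```python
import re

GROUP_RE = re.compile(r"^Group (\d+):")
ITABLE_RE = re.compile(r"Inode table at (\d+)-(\d+)")


def inode_table_ranges(dumpe2fs_text):
    """Map group number -> (first_block, last_block) of its inode table."""
    ranges = {}
    group = None
    for line in dumpe2fs_text.splitlines():
        m = GROUP_RE.match(line)
        if m:
            group = int(m.group(1))    # remember the current group header
            continue
        m = ITABLE_RE.search(line)
        if m and group is not None:
            ranges[group] = (int(m.group(1)), int(m.group(2)))
    return ranges


if __name__ == "__main__":
    sample = (
        "Group 0: (Blocks 0-32767)\n"
        "  Inode table at 1057-1568 (+1057)\n"
        "Group 1: (Blocks 32768-65535)\n"
        "  Inode table at 1569-2080\n"
    )
    print(inode_table_ranges(sample))
```

With the ranges in hand, the blocks where bad inode data turned up can
be compared against where the tables should have been written.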

-andy
