public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: karn@ka9q.net
Cc: xfs@oss.sgi.com
Subject: Re: Kernel bug when running xfs_fsr
Date: Fri, 20 May 2011 11:05:38 +1000
Message-ID: <20110520010538.GN32466@dastard>
In-Reply-To: <BANLkTi=YSBY5Zq5ePCLZ2mLY70YEw=Yv7w@mail.gmail.com>

On Thu, May 19, 2011 at 03:35:04PM -0700, Phil Karn wrote:
> I just got the following on my console each time I invoked xfs_fsr on a XFS
> file system. The file system resides on a OCZ SSD that I've been having
> problems with. This morning my system deadlocked while running a program
> that created and deleted many small files on the SSD (a Perl script feeding
> a large number of email messages one at a time to procmail). I suspect bad
> garbage collection algorithms in the SSD; I recovered by booting into single
> user and running wiper.sh on the file system to replenish the drive's pool
> of erased pages. Since then I've been running wiper.sh regularly to ensure a
> sufficient erased page pool in the SSD. I had just run it when I ran
> xfs_fsr.
> 
> So it's possible that my file system data structures are messed up. However,
> the system otherwise seems normal, and I've been routinely tagging my files
> with extended attributes containing their SHA-1 hashes so I can check their
> integrity. So far my checks haven't found any corrupted files.
> 
> Here is the relevant output from my kernel log. Is this a XFS bug, or does
> it simply indicate a corrupted file system due to my earlier crash?
> 
> [29847.045684] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000018

That is a dereference at an offset of 24 bytes (0x18) from the start
of a structure, through a NULL pointer.

> [29847.045690] IP: [<ffffffffa033c11b>] xfs_trans_log_inode+0xb/0x30 [xfs]

Three structures are possible: xfs_inode, xfs_trans and
xfs_inode_log_item:

xfs_trans_log_inode(
        xfs_trans_t     *tp,
        xfs_inode_t     *ip,
        uint            flags)
{
        ASSERT(ip->i_transp == tp);
        ASSERT(ip->i_itemp != NULL);
        ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));

        tp->t_flags |= XFS_TRANS_DIRTY;
        ip->i_itemp->ili_item.li_desc->lid_flags |= XFS_LID_DIRTY;

And the situation is that ip->i_itemp->ili_item.li_desc == NULL:

typedef struct xfs_log_item {
        struct list_head                li_ail;         /* AIL pointers */
        xfs_lsn_t                       li_lsn;         /* last on-disk lsn */
        struct xfs_log_item_desc        *li_desc;       /* ptr to current desc*/
.....

That should not happen - the inode should be linked into the
transaction (tp), and li_desc should never be NULL here.

Are you running with CONFIG_XFS_DEBUG=y? If not, it is probably
worth enabling, as it should catch the problem more precisely, before
a NULL pointer dereference occurs.

> and so on...it repeats a few times because I issued the xfs_fsr command a
> few times.

So it is reproducible? Can you turn on the xfs_swapext tracepoints
and capture their output across a failure, and also run xfs_fsr -v -d
and capture that output? That might show that there is a specific
inode extent swap configuration that triggers this problem and that
I haven't realised exists.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

