public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: kdasu <kdasu.kdev@gmail.com>
To: xfs@oss.sgi.com
Subject: [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37
Date: Fri, 17 Feb 2012 15:00:09 -0800 (PST)	[thread overview]
Message-ID: <33346051.post@talk.nabble.com> (raw)
In-Reply-To: <33346043.post@talk.nabble.com>



On the 2.6.37 kernel, xfs_fs_evict_inode() leads to a deadlock when
freeing multiple realtime extents. On further debugging the root
cause it was determined to be recursive locking of the RT bitmap
inode during evict operation within the same task context.
The same vfs evict sequence is replayed by the xfs log recovery on
mounts on a reboot after the problem happens first time.
This problem exists on kernel v2.6.39 as well.

Call stack:
xfs_ilock   <- simple task deadlock in the xfs_ilock(ip,  XFS_ILOCK_EXCL)
               re-acquired on second iteration when the inode is cached
xfs_iget_cache_hit
xfs_iget
xfs_trans_iget
xfs_rtfree_extent <- Call to xfs_trans_iget()
xfs_bmap_del_extent
xfs_bunmapi  <- while loop based on number of extents to free
xfs_itruncate_finish
xfs_inactive
evict

The deadlock fix has two parts :
1) check if the inode is already locked in xfs_iget.c in the
   xfs_iget_cache_hit() function. Do not acquire the inode lock again
   if ip is already locked with the  XFS_ILOCK_EXCL subclass.
   We use the active transaction structure to detect if the inode is
   already lokced.
2) In addition in xfs_trans_inode.c:xfs_trans_iget() prevent joining
   already active transaction.

The above changes are also needed along with the backport of following
2.6.39 kernel patches to 2.6.37 kernel:

xfs: only lock the rt bitmap inode once per allocation
xfs: fix xfs_get_extsz_hint for a zero extent size hint
xfs: add lockdep annotations for the rt inodes

Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
---
 fs/xfs/xfs_iget.c        |   12 +++++++++++-
 fs/xfs/xfs_trans_inode.c |    2 +-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index 0cdd269..f05bdc2 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -143,6 +143,7 @@ xfs_inode_free(
 static int
 xfs_iget_cache_hit(
        struct xfs_perag        *pag,
+       xfs_trans_t             *tp,
        struct xfs_inode        *ip,
        int                     flags,
        int                     lock_flags) __releases(pag->pag_ici_lock)
@@ -234,6 +235,15 @@ xfs_iget_cache_hit(
                trace_xfs_iget_hit(ip);
        }
 
+       /* check inode already locked  */
+       spin_lock(&ip->i_flags_lock);
+       if (tp &&  ip->i_transp == tp) {
+               if ((ip->i_itemp->ili_lock_flags & lock_flags) &
+                       (XFS_ILOCK_EXCL))
+                       lock_flags = 0;
+       }
+       spin_unlock(&ip->i_flags_lock);
+
        if (lock_flags != 0)
                xfs_ilock(ip, lock_flags);
 
@@ -379,7 +389,7 @@ again:
        ip = radix_tree_lookup(&pag->pag_ici_root, agino);
 
        if (ip) {
-               error = xfs_iget_cache_hit(pag, ip, flags, lock_flags);
+               error = xfs_iget_cache_hit(pag, tp, ip, flags, lock_flags);
                if (error)
                        goto out_error_or_again;
        } else {
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index ccb3453..6f8db93 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -58,7 +58,7 @@ xfs_trans_iget(
        int                     error;
 
        error = xfs_iget(mp, tp, ino, flags, lock_flags, ipp);
-       if (!error && tp) {
+       if (!error && tp && !((*ipp)->i_transp)) {
                xfs_trans_ijoin(tp, *ipp);
                (*ipp)->i_itemp->ili_lock_flags = lock_flags;
        }
-- 
1.7.5.4

-- 
View this message in context: http://old.nabble.com/-PATCH-0-4--xfs%3A-resurrect-realtime-subvolume-support-on-kernel-2.6.37-tp33345988p33346051.html
Sent from the Xfs - General mailing list archive at Nabble.com.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2012-02-17 23:00 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-17 22:46 [PATCH 0/4] xfs: resurrect realtime subvolume support on kernel 2.6.37 kdasu
2012-02-17 22:51 ` [PATCH 1/4] xfs: only lock the rt bitmap inode once per allocation kdasu
2012-02-17 22:55   ` [PATCH 2/4] xfs: fix xfs_get_extsz_hint for a zero extent size hint kdasu
2012-02-17 22:58     ` [PATCH 3/4] xfs: add lockdep annotations for the rt inodes kdasu
2012-02-17 23:00       ` kdasu [this message]
2012-02-19 22:41         ` [PATCH 4/4] xfs: fix deadlock in xfs_rtfree_extent with kernel v2.6.37 Christoph Hellwig
2012-02-21 17:22           ` Kamal Dasu
2012-02-23 16:52             ` [PATCH 4/4] V2 " Kamal Dasu
2012-02-25  9:40               ` Christoph Hellwig
2012-02-25 15:46                 ` Kamal Dasu
2012-02-28  8:36                   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=33346051.post@talk.nabble.com \
    --to=kdasu.kdev@gmail.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox