From: Brian Foster <bfoster@redhat.com>
To: Eryu Guan <eguan@redhat.com>
Cc: linux-xfs@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 0/5] xfs: quota deadlock fixes
Date: Wed, 22 Feb 2017 10:35:25 -0500
Message-ID: <20170222153525.GC53025@bfoster.bfoster>
In-Reply-To: <20170220132535.GA32455@bfoster.bfoster>

On Mon, Feb 20, 2017 at 08:25:35AM -0500, Brian Foster wrote:
> On Mon, Feb 20, 2017 at 11:52:56AM +0800, Eryu Guan wrote:
> > On Fri, Feb 17, 2017 at 12:54:54PM -0500, Brian Foster wrote:
> > > On Fri, Feb 17, 2017 at 02:53:15PM +0800, Eryu Guan wrote:
> > > > On Wed, Feb 15, 2017 at 10:40:42AM -0500, Brian Foster wrote:
> > > > > Hi all,
> > > > >
> > > > > This is a collection of several quota related deadlock fixes for
> > > > > problems that have been reported to the list recently.
> > > > >
> > > > > Patch 1 fixes the low memory quotacheck problem reported by Martin[1].
> > > > > Dave is CC'd as he had comments on this particular thread that started a
> > > > > discussion, but I hadn't heard anything back since my last response.
> > > > >
> > > > > Patch 2 fixes a separate problem I ran into while attempting to
> > > > > reproduce Eryu's xfs/305 hang report[2].
> > > > >
> > > > > Patches 3-5 fix the actual problem reported by Eryu, which is a quotaoff
> > > > > deadlock reproduced by xfs/305.
> > > > >
> > > > > Further details are included in the individual commit log descriptions.
> > > > > Thoughts, reviews, flames appreciated.
> > > > >
> > > > > Eryu,
> > > > >
> > > > > I've run several hundred iterations of this on your reproducer system
> > > > > without reproducing the hang. I have reproduced a reset overnight but
> > > > > still haven't been able to grab a stack trace from that occurrence (I'll
> > > > > try again today/tonight with better console logging). I suspect this is
> > > >
> > > > I hit a NULL pointer dereference while testing your fix, I was running
> > > > xfs/305 for 1000 iterations and host crashed at the 639th run. Not sure
> > > > if it's the same issue you've met here. I posted dmesg log at the end of
> > > > mail. I haven't tried to see if I can reproduce it on stock linus tree
> > > > yet.
> > > >
> > >
> > > Interesting, thanks. I don't know for sure because I didn't hit anything
> > > on my second overnight run, but I wouldn't be surprised if it's the
> > > same, particularly if you hit this again. This does look like an
> > > independent problem to me, though. A kdump might be nice, if possible,
> > > given the difficulty to reproduce...
> >
> > Unfortunately, my second round of 1000 iteration run hit hang too, at
> > the 824th loop. Test configuration is all default, crc enabled XFS with
> > 4k block size, no rmapbt no reflink no finobt no sparse inode.
> >
> > I attached the dmesg log and sysrq-w output. I also left the host in the
> > hang state, you can login and take a look if you have interest.
> >
>
> Hmm, Ok thanks. This one looks more like the original problem.
> Everything is waiting on log reservation, the AIL is spinning on the
> locked quotaoff start log item, and the quotaoff purge sequence appears
> to be spinning on a dquot.
>
> Unfortunately, I can't tell why quotaoff is spinning. stap doesn't seem
> to compile anything on this box after a quick try, so I'll probably have
> to reinstall some of the debug code on top and (hopefully) reproduce.
> I'm guessing it's a similar dquot reference count issue, but it may or
> may not be the same since this one appears significantly harder to
> reproduce than the original...
>

I managed to reproduce with some of my old debug code. That code walks
the inode space in the fs to see whether any inodes still hold
references to the dquot we're unable to purge. In this case, it looks
like we have a group dquot with an elevated reference count, but no
inode appears to have a reference to it. So somehow or another the
reference count appears to be broken...

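For illustration only, here is a rough userspace sketch of the kind of
check described above. It is not the actual debug code and does not use
the kernel's structures: toy_dquot, toy_inode, count_holders() and
unaccounted_refs() are all made-up names, and the real walk would have
to iterate the in-core inode cache under the appropriate locks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (assumptions), not the kernel's dquot/inode structures. */
struct toy_dquot {
	unsigned int id;
	int nrefs;			/* reference count recorded on the dquot */
};

struct toy_inode {
	unsigned long ino;
	struct toy_dquot *udquot;	/* attached user dquot, or NULL */
	struct toy_dquot *gdquot;	/* attached group dquot, or NULL */
};

/* Walk every inode and count how many attachments point at dq. */
int count_holders(const struct toy_inode *inodes, size_t n,
		  const struct toy_dquot *dq)
{
	int holders = 0;

	assert(dq != NULL);
	for (size_t i = 0; i < n; i++) {
		if (inodes[i].udquot == dq)
			holders++;
		if (inodes[i].gdquot == dq)
			holders++;
	}
	return holders;
}

/*
 * References recorded on the dquot that no inode accounts for.
 * A positive result is the "elevated count, no holder" symptom.
 */
int unaccounted_refs(const struct toy_inode *inodes, size_t n,
		     const struct toy_dquot *dq)
{
	return dq->nrefs - count_holders(inodes, n, dq);
}
```

In this toy model, a group dquot with nrefs == 1 and no inode pointing
at it makes unaccounted_refs() return 1, which corresponds to the state
observed in the hang.
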
I'm running again with more tracing to try to see where the refcounting
goes awry. So far I'm unable to reproduce in some ~1200 iterations, but
I'll leave it running. FWIW, I think this is enough to say that this
problem is independent from the one addressed by the last few patches in
this series (in which the dquot was legitimately held by an inode by the
time we attempt the purge).

Brian
> Brian
>
> > Thanks,
> > Eryu