From: Brian Foster <bfoster@redhat.com>
To: Gareth Clay <gclay@pivotal.io>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Many D state processes on XFS, kernel 4.4
Date: Wed, 26 Apr 2017 16:34:54 -0400
Message-ID: <20170426203451.GA44531@bfoster.bfoster>
In-Reply-To: <CAPeSaCC-Z+atujOq9+uj0tgdkWDNPadiQVDB+23tj=DBH-EKrw@mail.gmail.com>

On Wed, Apr 26, 2017 at 05:47:15PM +0100, Gareth Clay wrote:
> Hi,
>
> We're trying to diagnose a problem on an AWS virtual machine with two
> XFS filesystems, each on loop devices. The loop files are sitting on
> an EXT4 filesystem on Amazon EBS. The VM is running lots of Linux
> containers - we're using Overlay FS on XFS to provide the root
> filesystems for these containers.
>
> The problem we're seeing is a lot of processes entering D state, stuck
> in the xlog_grant_head_wait function. We're also seeing xfsaild/loop0
> stuck in D state. We're not able to write to the filesystem at all on
> this device, it seems, without the process hitting D state. Once the
> processes enter D state they never recover, and the list of D state
> processes seems to be growing slowly over time.
>
> The filesystem on loop1 seems fine (we can run ls, touch etc)
>
> Would anyone be able to help us to diagnose the underlying problem please?
>
> Following the problem reporting FAQ we've collected the following
> details from the VM:
>
> uname -a:
> Linux 8dd9526f-00ba-4f7b-aa59-a62ec661c060 4.4.0-72-generic
> #93~14.04.1-Ubuntu SMP Fri Mar 31 15:05:15 UTC 2017 x86_64 x86_64
> x86_64 GNU/Linux
>
> xfs_repair version 3.1.9
>
> AWS VM with 8 CPU cores and EBS storage
>
> And we've also collected output from /proc, xfs_info, dmesg and the
> XFS trace tool in the following files:
>
> https://s3.amazonaws.com/grootfs-logs/dmesg
> https://s3.amazonaws.com/grootfs-logs/meminfo
> https://s3.amazonaws.com/grootfs-logs/mounts
> https://s3.amazonaws.com/grootfs-logs/partitions
> https://s3.amazonaws.com/grootfs-logs/trace_report.txt
> https://s3.amazonaws.com/grootfs-logs/xfs_info
>
It looks like everything is backed up on the log, and the tail of the
log is pinned by some dquot items. The trace output shows that xfsaild
is spinning on flush-locked dquots:
<...>-2737622 [001] 33449671.892834: xfs_ail_flushing: dev 7:0 lip 0x0xffff88012e655e30 lsn 191/61681 type XFS_LI_DQUOT flags IN_AIL
<...>-2737622 [001] 33449671.892868: xfs_ail_flushing: dev 7:0 lip 0x0xffff8800110d7bb0 lsn 191/61681 type XFS_LI_DQUOT flags IN_AIL
<...>-2737622 [001] 33449671.892869: xfs_ail_flushing: dev 7:0 lip 0x0xffff88012e655a80 lsn 191/67083 type XFS_LI_DQUOT flags IN_AIL
<...>-2737622 [001] 33449671.892869: xfs_ail_flushing: dev 7:0 lip 0x0xffff8800110d4810 lsn 191/67296 type XFS_LI_DQUOT flags IN_AIL
<...>-2737622 [001] 33449671.892869: xfs_ail_flushing: dev 7:0 lip 0x0xffff880122210460 lsn 191/67310 type XFS_LI_DQUOT flags IN_AIL
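
To gauge how many distinct items xfsaild keeps retrying, the
xfs_ail_flushing events in the trace report can be tallied. This is a
minimal sketch, assuming every event follows the line layout of the
excerpt above; the function name summarize_ail_flushing is hypothetical
and not part of any XFS tooling:

```python
import re
from collections import Counter

# Field layout copied from the xfs_ail_flushing lines quoted above.
# Assumption: other kernels may print these fields differently.
AIL_RE = re.compile(
    r"xfs_ail_flushing: dev (?P<dev>\d+:\d+) lip (?P<lip>\S+) "
    r"lsn (?P<lsn>\d+/\d+) type (?P<type>\w+) flags (?P<flags>\S+)"
)


def summarize_ail_flushing(lines):
    """Count xfs_ail_flushing events per (dev, type) and per (lip, lsn)."""
    by_type = Counter()
    by_item = Counter()
    for line in lines:
        m = AIL_RE.search(line)
        if m is None:
            continue  # not an xfs_ail_flushing event
        by_type[(m["dev"], m["type"])] += 1
        by_item[(m["lip"], m["lsn"])] += 1
    return by_type, by_item
```

Calling summarize_ail_flushing(open("trace_report.txt")) on a local copy
of the trace report would return both counters. A small set of items,
each seen many times, would be consistent with xfsaild retrying the same
flush-locked dquots.
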
The cause of the flush-locked dquots is not immediately clear. One
possibility is an I/O failure. Do you have any I/O error messages (e.g.,
"metadata I/O error: block ...") in your logs from before you ended up
in this state?

If not, another possibility is an I/O that never completes. Is this
something you can reliably reproduce?

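
One way to check saved logs for such messages is a plain substring
scan. This is a minimal sketch, assuming the kernel log was saved to a
local text file; the function name find_io_errors is hypothetical, and
the search string is only the prefix quoted above, so errors worded
differently would be missed:

```python
def find_io_errors(lines, needle="metadata I/O error"):
    """Return (line_number, text) pairs for lines containing the needle."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if needle in line:
            hits.append((number, line.rstrip("\n")))
    return hits
```

For example, find_io_errors(open("dmesg")) on a local copy of the dmesg
output returns the matching lines with their line numbers. An empty
result only shows that no such message is in that file; it does not
rule out errors logged before the buffer wrapped.
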
Brian
> Thanks for any help or advice you can offer!
>
> Claudia and Gareth