From: Brian Foster <bfoster@redhat.com>
To: Michael Weissenbacher <mw@dermichi.com>, linux-xfs@vger.kernel.org
Subject: Re: xfsaild in D state seems to be blocking all other i/o sporadically
Date: Wed, 19 Apr 2017 08:37:40 -0400
Message-ID: <20170419123740.GC40497@bfoster.bfoster>
In-Reply-To: <20170419121202.2m3pi4ffojpnqyxc@eorzea.usersys.redhat.com>

On Wed, Apr 19, 2017 at 02:12:02PM +0200, Carlos Maiolino wrote:
> Hi,
>
> On Wed, Apr 19, 2017 at 12:58:05PM +0200, Michael Weissenbacher wrote:
> > Hi List!
> > I have a storage server which primarily runs around 15-20 parallel
> > rsyncs, nothing special. Sometimes (3-4 times a day) I notice that all
> > I/O on the file system suddenly comes to a halt and the only process
> > that continues to do any I/O (according to iotop) is the process
> > xfsaild/md127. When this happens, xfsaild only does reads (according to
> > iotop) and is consistently in D state (according to top).
> > Unfortunately this can sometimes stay like this for 5-15 minutes. During
> > this time even a simple "ls" or "touch" blocks and is stuck in D
> > state. All other running processes accessing the fs are of course also
> > stuck in D state. It is an XFS V5 filesystem.
> > Then, as suddenly as it began, everything goes back to normal and
> > I/O continues. The problem is accompanied by several "process blocked
> > for xxx seconds" messages in dmesg and also some dropped connections
> > due to network timeouts.
> >
> > I've tried several things to remedy the problem, including:
> > - changing I/O schedulers (tried noop, deadline and cfq). Deadline
> > seems to be best (the block goes away in less time compared with the
> > others).
> > - removing all mount options (defaults + usrquota, grpquota)
> > - upgrading to the latest 4.11.0-rc kernel (before that I was on 4.9.x)
> >
> > None of the above seemed to make a significant difference to the
> > problem.
> >
> > xfs_info output of the fs in question:
> > meta-data=/dev/md127             isize=512    agcount=33, agsize=268435440 blks
> >          =                       sectsz=4096  attr=2, projid32bit=1
> >          =                       crc=1        finobt=1 spinodes=0 rmapbt=0
> >          =                       reflink=0
> > data     =                       bsize=4096   blocks=8789917696, imaxpct=10
> >          =                       sunit=16     swidth=96 blks
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > log      =internal               bsize=4096   blocks=521728, version=2
> >          =                       sectsz=4096  sunit=1 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> >
>
> This is really not enough to give any idea of what might be happening. It
> looks more like slow storage while xfsaild is flushing the log, but we
> really need more information to give a better idea of what is going on.
> Please look at:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> Especially: storage layout (RAID arrays, LVMs, thin provisioning, etc.), and
> the dmesg output with the traces from the hung tasks.
>
Information about memory usage might be particularly interesting here
as well, e.g., /proc/meminfo and /proc/slabinfo.

Brian

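
A minimal sketch of how the memory information mentioned above could be
captured while a stall is in progress. The helper names (parse_meminfo,
snapshot) and the output file naming are assumptions made for illustration;
they are not part of any XFS tooling. Reading /proc/slabinfo normally
requires root, so the sketch skips any file it cannot read.

```python
import time
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    Values are returned as the kernel prints them (kB for most fields,
    plain counts for fields such as HugePages_Total).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def snapshot(out_dir="."):
    """Copy /proc/meminfo and /proc/slabinfo into timestamped files.

    Hypothetical helper: returns the list of files actually written.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    written = []
    for name in ("meminfo", "slabinfo"):
        src = Path("/proc") / name
        try:
            data = src.read_text()
        except OSError as exc:  # e.g. slabinfo without root privileges
            print(f"skipping {src}: {exc}")
            continue
        dst = Path(out_dir) / f"{name}-{stamp}.txt"
        dst.write_text(data)
        written.append(dst)
    return written
```

Calling snapshot() from a loop (or by hand) while xfsaild is stuck in D
state would leave a pair of files per sample that can be attached to a
reply; parse_meminfo() is only there to make before/after comparison of
individual fields such as Slab or Dirty easier.
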
> Cheers.
>
> > Storage Subsystem: Dell Perc H730P Controller 2GB NVCACHE, 12 6TB Disks,
> > RAID-10, latest Firmware Updates
> >
> > I would be happy to dig out more information if needed. How can I find
> > out if the RAID Controller itself gets stuck? Nothing bad shows up in
> > the hardware and SCSI controller logs.
> >
>
> --
> Carlos