public inbox for linux-xfs@vger.kernel.org
* xfsaild in D state seems to be blocking all other i/o sporadically
From: Michael Weissenbacher @ 2017-04-19 10:58 UTC
  To: linux-xfs

Hi List!
I have a storage server that primarily runs around 15-20 parallel
rsyncs, nothing special. Sometimes (3-4 times a day) I notice that all
I/O on the file system suddenly comes to a halt and the only process
that continues to do any I/O (according to iotop) is
xfsaild/md127. When this happens, xfsaild does only reads (according to
iotop) and is consistently in D state (according to top).
Unfortunately, it can stay like this for 5-15 minutes. During
this time even a simple "ls" or "touch" blocks and is stuck in D
state. All other running processes accessing the fs are of course also
stuck in D state. It is an XFS V5 filesystem.
Then, as suddenly as it began, everything goes back to normal and
I/O continues. The problem is accompanied by several "process blocked
for xxx seconds" messages in dmesg and also some dropped connections
due to network timeouts.
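
A minimal sketch of how the D-state observation above could be captured without top, assuming a Linux /proc filesystem. It reads /proc/<pid>/stat for each process and reports those whose state field is "D" (uninterruptible sleep). The function names are hypothetical, and only thread-group leaders are scanned, not the per-thread entries under /proc/<pid>/task.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    tasks = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return tasks  # no /proc available (for example, not on Linux)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or unparsable line
        if state == "D":
            tasks.append((pid, comm))
    return tasks


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Run in a loop with a timestamp during a stall, this would show which tasks (xfsaild/md127 among them, if the description above holds) are blocked and for how long.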

I've tried several things to remedy the problem, including:
  - changing I/O schedulers (tried noop, deadline and cfq). Deadline
seems to be best (the stall clears in less time than with the
others).
  - removing all mount options (defaults + usrquota, grpquota)
  - upgrading to the latest 4.11.0-rc kernel (before that I was on 4.9.x)

None of the above seems to have made a significant difference to the
problem.

xfs_info output of the fs in question:
meta-data=/dev/md127             isize=512    agcount=33, agsize=268435440 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=8789917696, imaxpct=10
         =                       sunit=16     swidth=96 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
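
For readers who want the geometry in familiar units, the following sketch converts the block counts in the xfs_info output above into sizes. All constants are copied from that output; the function name and dictionary keys are made up for illustration. It only does arithmetic on the reported numbers and makes no claim about how XFS chose them.

```python
# Values copied from the xfs_info output above.
BLOCK_SIZE = 4096            # data bsize, in bytes
DATA_BLOCKS = 8789917696     # data blocks
AG_COUNT = 33                # agcount
AG_SIZE_BLOCKS = 268435440   # agsize, in blocks
LOG_BLOCKS = 521728          # internal log blocks
SUNIT_BLOCKS = 16            # stripe unit, in blocks
SWIDTH_BLOCKS = 96           # stripe width, in blocks


def geometry():
    """Convert the reported block counts into sizes and derived ratios."""
    full_ags, tail_blocks = divmod(DATA_BLOCKS, AG_SIZE_BLOCKS)
    return {
        "data_tib": DATA_BLOCKS * BLOCK_SIZE / 2**40,
        "ag_bytes": AG_SIZE_BLOCKS * BLOCK_SIZE,
        "full_ags": full_ags,              # allocation groups of full size
        "tail_ag_blocks": tail_blocks,     # blocks in the last, shorter AG
        "log_mib": LOG_BLOCKS * BLOCK_SIZE / 2**20,
        "stripe_unit_kib": SUNIT_BLOCKS * BLOCK_SIZE // 1024,
        "stripe_width_kib": SWIDTH_BLOCKS * BLOCK_SIZE // 1024,
        "stripe_units_per_width": SWIDTH_BLOCKS // SUNIT_BLOCKS,
    }


if __name__ == "__main__":
    for key, value in geometry().items():
        print(f"{key}: {value}")
```

The numbers work out to a filesystem of roughly 32.7 TiB, 32 full allocation groups of just under 1 TiB each plus one shorter group (matching agcount=33), an internal log of 2038 MiB, and a 64 KiB stripe unit with a 384 KiB stripe width, i.e. six stripe units per stripe.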

Storage subsystem: Dell PERC H730P controller with 2 GB NV cache,
12 x 6 TB disks, RAID-10, latest firmware updates

I would be happy to dig out more information if needed. How can I find
out if the RAID controller itself gets stuck? Nothing bad shows up in
the hardware and SCSI controller logs.
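
One possible way to approach the question above, offered as a rough heuristic rather than a definitive test: sample /proc/diskstats twice for the block device exported by the controller and check whether requests are in flight while the completed-I/O counters do not advance. The field positions follow the kernel's documented /proc/diskstats layout; the function names and the device name "sda" are assumptions for illustration.

```python
import time


def parse_diskstats_line(line):
    """Pick the counters of interest out of one /proc/diskstats line."""
    fields = line.split()
    return {
        "name": fields[2],
        "reads_completed": int(fields[3]),
        "writes_completed": int(fields[7]),
        "in_flight": int(fields[11]),  # I/Os currently in progress
    }


def sample(device, path="/proc/diskstats"):
    """Return the parsed counters for `device`, or None if not listed."""
    with open(path) as f:
        for line in f:
            stats = parse_diskstats_line(line)
            if stats["name"] == device:
                return stats
    return None


def looks_stalled(before, after):
    """Heuristic: requests are queued but none completed between samples."""
    done_before = before["reads_completed"] + before["writes_completed"]
    done_after = after["reads_completed"] + after["writes_completed"]
    return after["in_flight"] > 0 and done_after == done_before


if __name__ == "__main__":
    DEVICE = "sda"  # assumption: the disk the controller presents to the OS
    try:
        first = sample(DEVICE)
        time.sleep(5)
        second = sample(DEVICE)
    except OSError:
        first = second = None  # no /proc/diskstats on this system
    if first and second:
        print(DEVICE, "stalled" if looks_stalled(first, second) else "making progress")
```

If the device keeps completing reads during a stall, as the iotop observation suggests, the controller is presumably still servicing requests and the delay would lie higher up the stack; that inference is an assumption, not something this check proves.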

Thanks,
Michael


Thread overview: 19+ messages
2017-04-19 10:58 xfsaild in D state seems to be blocking all other i/o sporadically Michael Weissenbacher
2017-04-19 12:12 ` Carlos Maiolino
2017-04-19 12:37   ` Brian Foster
2017-04-19 12:40   ` Michael Weissenbacher
2017-04-19 13:01     ` Michael Weissenbacher
2017-04-19 14:04       ` Carlos Maiolino
2017-04-19 14:20         ` Carlos Maiolino
2017-04-19 16:40           ` Michael Weissenbacher
2017-04-19 16:36         ` Michael Weissenbacher
2017-04-19 18:08           ` Brian Foster
2017-04-19 20:10             ` Michael Weissenbacher
2017-04-19 20:55               ` Darrick J. Wong
2017-04-19 21:47                 ` Michael Weissenbacher
2017-04-19 23:48                   ` Dave Chinner
2017-04-20  7:11                     ` Michael Weissenbacher
2017-04-20 23:16                       ` Dave Chinner
2017-04-21  7:43                         ` Michael Weissenbacher
2017-04-21  9:18                           ` Shan Hai
2017-04-22  8:38                             ` Michael Weissenbacher
