public inbox for linux-xfs@vger.kernel.org
From: Brian Foster <bfoster@redhat.com>
To: Michael Weissenbacher <mw@dermichi.com>, linux-xfs@vger.kernel.org
Subject: Re: xfsaild in D state seems to be blocking all other i/o sporadically
Date: Wed, 19 Apr 2017 08:37:40 -0400	[thread overview]
Message-ID: <20170419123740.GC40497@bfoster.bfoster> (raw)
In-Reply-To: <20170419121202.2m3pi4ffojpnqyxc@eorzea.usersys.redhat.com>

On Wed, Apr 19, 2017 at 02:12:02PM +0200, Carlos Maiolino wrote:
> Hi,
> 
> On Wed, Apr 19, 2017 at 12:58:05PM +0200, Michael Weissenbacher wrote:
> > Hi List!
> > I have a storage server which primarily does around 15-20 parallel
> > rsyncs, nothing special. Sometimes (3-4 times a day) I notice that all
> > I/O on the filesystem suddenly comes to a halt and the only process
> > that continues to do any I/O (according to iotop) is xfsaild/md127.
> > When this happens, xfsaild only does reads (according to iotop) and is
> > consistently in D state (according to top).
> > Unfortunately this can sometimes stay like this for 5-15 minutes. During
> > this time even a simple "ls" or "touch" blocks and gets stuck in D
> > state. All other running processes accessing the fs are of course also
> > stuck in D state. It is an XFS v5 filesystem.
> > Then, as suddenly as it began, everything goes back to normal and
> > I/O continues. The problem is accompanied by several "process blocked
> > for xxx seconds" messages in dmesg and also some dropped connections
> > due to network timeouts.
> > 
> > I've tried several things to remedy the problem, including:
> >   - changing I/O schedulers (tried noop, deadline and cfq); deadline
> > seems to work best (the stall clears sooner than with the others).
> >   - removing all mount options (defaults + usrquota, grpquota)
> >   - upgrading to the latest 4.11.0-rc kernel (before that I was on 4.9.x)
> > 
> > None of the above made a significant difference.
> > 
> > xfs_info output of the fs in question:
> > meta-data=/dev/md127             isize=512    agcount=33, agsize=268435440 blks
> >          =                       sectsz=4096  attr=2, projid32bit=1
> >          =                       crc=1        finobt=1 spinodes=0 rmapbt=0
> >          =                       reflink=0
> > data     =                       bsize=4096   blocks=8789917696, imaxpct=10
> >          =                       sunit=16     swidth=96 blks
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > log      =internal               bsize=4096   blocks=521728, version=2
> >          =                       sectsz=4096  sunit=1 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> >
> 
> This is really not enough to give any idea of what might be happening,
> although it looks more like slow storage while xfsaild is flushing the
> log. We really need more information to give a better idea of what is
> going on; please look at:
> 
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> 
> Especially the storage layout (RAID arrays, LVM, thin provisioning, etc.)
> and the dmesg output with the traces from the hung tasks.
> 
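[The hung-task traces requested above can be captured with the sysrq 'w'
trigger, which dumps the stacks of all uninterruptible (D state) tasks into
the kernel log. A sketch; both steps need root, sysrq may be restricted
(kernel.sysrq sysctl), and the output path is an arbitrary example:]

```shell
# Ask the kernel to dump stacks of all blocked (D state) tasks:
echo w > /proc/sysrq-trigger 2>/dev/null || true
# Save the kernel log, then pull out the hung-task blocks for the report:
dmesg > /tmp/dmesg-hung.txt 2>/dev/null || true
grep -A 20 'blocked for more than' /tmp/dmesg-hung.txt || true
```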

Information about memory usage might be particularly interesting here
as well, e.g. /proc/meminfo and /proc/slabinfo.
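[A minimal sketch of that sampling, run while the stall is in progress.
The counters are standard /proc/meminfo fields; which ones matter most to
a writeback stall is an assumption. /proc/slabinfo is typically root-only,
so the Slab summary lines in meminfo are used as a fallback:]

```shell
# Snapshot the memory counters most relevant to dirty data / writeback /
# slab pressure; repeat (e.g. under watch(1)) during the hang.
grep -E '^(MemFree|Dirty|Writeback|Slab|SReclaimable|SUnreclaim):' /proc/meminfo
```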

Brian

> Cheers.
> 
> > Storage Subsystem: Dell Perc H730P Controller 2GB NVCACHE, 12 6TB Disks,
> > RAID-10, latest Firmware Updates
> > 
> > I would be happy to dig out more information if needed. How can i find
> > out if the RAID Controller itself gets stuck? Nothing bad shows up in
> > the hardware and SCSI controller logs.
> > 
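[On the question of whether the controller itself gets stuck: one
low-level check, independent of the controller's own logs, is to sample
io_ticks (field 13 of /proc/diskstats, milliseconds spent doing I/O) for
the member disks twice; if it does not advance during a stall, requests
are stuck below the filesystem. A sketch; the sd* device pattern is an
assumption about this setup, adjust to match the PERC's disks:]

```shell
# Print "device io_ticks" for the member disks, twice, a couple of
# seconds apart; unchanged values mean the devices made no I/O progress.
awk '$3 ~ /^sd[a-z]+$/ { print $3, $13 }' /proc/diskstats
sleep 2
awk '$3 ~ /^sd[a-z]+$/ { print $3, $13 }' /proc/diskstats
```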
> 
> -- 
> Carlos
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thread overview: 19+ messages
2017-04-19 10:58 xfsaild in D state seems to be blocking all other i/o sporadically Michael Weissenbacher
2017-04-19 12:12 ` Carlos Maiolino
2017-04-19 12:37   ` Brian Foster [this message]
2017-04-19 12:40   ` Michael Weissenbacher
2017-04-19 13:01     ` Michael Weissenbacher
2017-04-19 14:04       ` Carlos Maiolino
2017-04-19 14:20         ` Carlos Maiolino
2017-04-19 16:40           ` Michael Weissenbacher
2017-04-19 16:36         ` Michael Weissenbacher
2017-04-19 18:08           ` Brian Foster
2017-04-19 20:10             ` Michael Weissenbacher
2017-04-19 20:55               ` Darrick J. Wong
2017-04-19 21:47                 ` Michael Weissenbacher
2017-04-19 23:48                   ` Dave Chinner
2017-04-20  7:11                     ` Michael Weissenbacher
2017-04-20 23:16                       ` Dave Chinner
2017-04-21  7:43                         ` Michael Weissenbacher
2017-04-21  9:18                           ` Shan Hai
2017-04-22  8:38                             ` Michael Weissenbacher
