public inbox for linux-xfs@vger.kernel.org
From: Brian Candler <B.Candler@pobox.com>
To: Stan Hoeppner <stan@hardwarefreak.com>
Cc: xfs@oss.sgi.com
Subject: Re: Storage server, hung tasks and tracebacks
Date: Tue, 15 May 2012 15:02:37 +0100
Message-ID: <20120515140237.GA3630@nsrc.org>
In-Reply-To: <4FA4C321.2070105@hardwarefreak.com>

Update:

After a week away, I am continuing to try to narrow down the problem of this
system with hanging I/O.

I can fairly reliably repeat the problem on a system with 24 disks, and I've
embarked on trying some different configs to see what's the simplest way I
can make this die.

During this, I found something of interest: I happened to leave an 'iostat
5' process running, and that hung too.  ps showed it in 'D+' state
(uninterruptible sleep), and it was unkillable.

root        34  0.6  0.0      0     0 ?        D    11:29   1:18 [kswapd0]
root      1258  0.0  0.0  15976   532 ?        Ds   11:29   0:00 /usr/sbin/irqbalance
root      1421  0.0  0.0      0     0 ?        D    12:49   0:01 [xfsaild/md127]
snmp      1430  0.0  0.0  48608  3440 ?        D    11:29   0:00 /usr/sbin/snmpd -Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid
xxxx      1614  1.1  0.0 378860  3812 pts/1    D+   12:50   1:15 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1669  1.2  0.0 378860  3816 pts/2    D+   12:50   1:21 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1727  0.5  0.0 383424   692 pts/3    Dl+  12:51   0:37 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1782  1.2  0.0 378860  3824 pts/4    D+   12:51   1:20 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1954  0.0  0.0   5912   544 pts/0    D+   12:58   0:00 iostat 5
root      2642  0.2  0.0      0     0 ?        D    13:25   0:09 [kworker/0:1]
root      3233  0.0  0.0   5044   168 ?        Ds   13:50   0:00 /usr/sbin/sshd -D -R
xxxx      4648  0.0  0.0   8104   936 pts/6    S+   14:41   0:00 grep --color=auto  D
root     29491  0.0  0.0      0     0 ?        D    12:45   0:00 [kworker/1:2]
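
For anyone wanting to pull the same list out of a saved 'ps aux' capture
without the grep matching itself, here is a minimal sketch. It assumes the
usual 'ps aux' column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME
COMMAND); the function name uninterruptible is made up for this example.

```python
def uninterruptible(ps_output):
    """Return (pid, stat, command) for every task whose STAT starts with 'D'."""
    hung = []
    for line in ps_output.splitlines():
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header and blank or malformed lines
        if fields[7].startswith("D"):
            hung.append((int(fields[1]), fields[7], fields[10]))
    return hung


# Three lines copied from the listing above.
SAMPLE = """\
root        34  0.6  0.0      0     0 ?        D    11:29   1:18 [kswapd0]
xxxx      1954  0.0  0.0   5912   544 pts/0    D+   12:58   0:00 iostat 5
xxxx      4648  0.0  0.0   8104   936 pts/6    S+   14:41   0:00 grep --color=auto  D
"""

for pid, stat, command in uninterruptible(SAMPLE):
    print(pid, stat, command)
```

Filtering on the STAT column rather than on the whole line means the grep
process itself (state S+) is not reported.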

I wonder whether iostat actually communicates with the device driver at all.
If not, then presumably it's reading some kernel data structure.  Maybe
someone or something is holding a lock on that structure.

At the same time, I notice that 'cat /proc/diskstats' still works, and
starting a new 'iostat 5' process works too.
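
Since /proc/diskstats is still readable, it could be sampled twice to see
which devices have requests outstanding but are completing nothing. A rough
sketch follows, assuming the per-device counters use the layout described in
the kernel's Documentation/iostats.txt (11 counters after major, minor and
name). The names parse_diskstats and stalled are made up here, the sample
figures are hypothetical, and the "stalled" test is only a heuristic.

```python
# Names for the 11 counters that follow "major minor name" on each line
# (layout assumed from the kernel's Documentation/iostats.txt).
FIELDS = (
    "reads", "reads_merged", "sectors_read", "ms_reading",
    "writes", "writes_merged", "sectors_written", "ms_writing",
    "in_flight", "ms_io", "weighted_ms_io",
)


def parse_diskstats(text):
    """Map device name -> dict of the first 11 counters on its line."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 + len(FIELDS):
            continue  # blank or short line
        values = [int(v) for v in parts[3:3 + len(FIELDS)]]
        stats[parts[2]] = dict(zip(FIELDS, values))
    return stats


def stalled(before, after):
    """Devices with requests in flight but no reads or writes completed
    between two samples. A heuristic only, not proof of a hang."""
    names = []
    for name, now in after.items():
        then = before.get(name)
        if then is None:
            continue  # device appeared between the two samples
        no_progress = (now["reads"] == then["reads"]
                       and now["writes"] == then["writes"])
        if now["in_flight"] > 0 and no_progress:
            names.append(name)
    return sorted(names)


# Hypothetical figures, NOT taken from the affected machine.
BEFORE = """\
   8       0 sda 100 0 800 50 200 0 1600 90 3 120 400
   9     127 md127 10 0 80 0 20 0 160 0 0 0 0
"""
AFTER = """\
   8       0 sda 100 0 800 50 200 0 1600 90 3 5120 15400
   9     127 md127 12 0 96 0 20 0 160 0 0 0 0
"""

print(stalled(parse_diskstats(BEFORE), parse_diskstats(AFTER)))
```

In real use the two strings would be the contents of /proc/diskstats read a
few seconds apart.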

After issuing 'halt -p', the list of tasks in D state looks like this:

root        34  0.6  0.0      0     0 ?        D    11:29   1:18 [kswapd0]
root      1258  0.0  0.0  15976   532 ?        Ds   11:29   0:00 /usr/sbin/irqbalance
root      1421  0.0  0.0      0     0 ?        D    12:49   0:01 [xfsaild/md127]
snmp      1430  0.0  0.0  48608  3440 ?        D    11:29   0:00 /usr/sbin/snmpd -Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid
xxxx      1614  1.0  0.0 378860  3812 pts/1    D+   12:50   1:15 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1669  1.1  0.0 378860  3816 pts/2    D+   12:50   1:21 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1727  0.5  0.0 383424   692 pts/3    Dl+  12:51   0:37 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1782  1.1  0.0 378860  3824 pts/4    D+   12:51   1:20 bonnie++ -d /disk/scratch/test -s 16384k -n 98:800k:500k:1000
xxxx      1954  0.0  0.0   5912   544 pts/0    D+   12:58   0:00 iostat 5
root      2642  0.1  0.0      0     0 ?        D    13:25   0:09 [kworker/0:1]
root      3233  0.0  0.0   5044   168 ?        Ds   13:50   0:00 /usr/sbin/sshd -D -R
root      4753  0.0  0.0  15056   928 ?        D    14:42   0:00 umount /run/rpc_pipefs
root      4828  0.0  0.0   4296   348 ?        D    14:42   0:00 sync
root      4834  0.0  0.0   8100   624 pts/6    R+   14:50   0:00 grep --color=auto  D
root     29491  0.0  0.0      0     0 ?        D    12:45   0:00 [kworker/1:2]

Even unmounting rpc_pipefs hangs, as does sync.  This suggests there's some
sort of global lock involved.
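
One way to test the global-lock idea would be to see where each blocked task
is sleeping. A minimal sketch, assuming the kernel exposes /proc/<pid>/wchan
and that reading it is permitted for the tasks in question; wait_channels and
its proc_root parameter are names made up for this example.

```python
import os
from collections import Counter


def wait_channels(pids, proc_root="/proc"):
    """Count how many of the given PIDs are sleeping in each wait channel."""
    counts = Counter()
    for pid in pids:
        path = os.path.join(proc_root, str(pid), "wchan")
        try:
            with open(path) as f:
                # An empty file is recorded as "0", meaning no wait channel.
                counts[f.read().strip() or "0"] += 1
        except OSError:
            continue  # task has exited, or the entry is not readable
    return counts


# Example call on the current process only; on the hung system the PIDs of
# the D-state tasks would be passed instead.
print(wait_channels([os.getpid()]))
```

If most of the D-state tasks turned out to share one wait channel, that would
at least be consistent with a single contended lock.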

Anyway, I just wonder if this jogs a memory in anyone, as to why iostat
would hang in an unkillable way.

Regards,

Brian.


