[XFS] Any process to a particular XFS device hung in D state forever.

public inbox for linux-xfs@vger.kernel.org

* [XFS] Any process to a particular XFS device hung in D state forever.
@ 2016-04-19  9:56 Hugo Kuo
  2016-04-19 11:30 ` Brian Foster
  0 siblings, 1 reply; 8+ messages in thread
From: Hugo Kuo @ 2016-04-19  9:56 UTC (permalink / raw)
  To: xfs; +Cc: Darrell Bishop


Hi XFS team,

We have hit a problem frequently over the past three weeks. Our daemons
store data, with associated xattrs, on XFS partitions.

The disk appears unresponsive: every process accessing it is stuck in D
state and can't be killed at all.

   - It happens on several disks, seemingly at random.
   - A reboot seems to solve the problem temporarily.
   - All disks are multipath devices.


At first I suspected disk corruption, but smartctl doesn't show any sign
of a bad disk, and a reboot makes the problem go away.


   - Any process accessing this disk blocks, even a simple `ls`. Kernel log:
   <https://gist.github.com/HugoKuo/f87748786b26ea04fd9e1d86d9538293>
   - I tested the disk by reading raw blocks with `dd`. It works fine,
   without any error in dmesg.
   - The `xfs_repair -n` output for a problematic mount point:
   <https://gist.github.com/HugoKuo/76f65bdc0b860ca6ed5e786f8c43da0e> . It
   is still running.
   - Kernel : Linux node9 2.6.32-573.8.1.el6.x86_64 #1 SMP Tue Nov 10
   18:01:38 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
   - OS : CentOS release 6.5 (Final)
   - XFS : xfsprogs.x86_64         3.1.1-14.el6


There's an interesting behaviour with the `ls` command.

* This completes within 1 second and writes the result to the
test.d864 file: $ ls /srv/node/d864/tmp > test.d864
* This hangs: $ ls /srv/node/d864/tmp

[image: Inline image 1]

I suspect there's something wrong with imap. Is there a known bug?

Thanks // Hugo


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-19  9:56 [XFS] Any process to a particular XFS device hung in D state forever Hugo Kuo
@ 2016-04-19 11:30 ` Brian Foster
  2016-04-19 13:24   ` Hugo Kuo
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Foster @ 2016-04-19 11:30 UTC (permalink / raw)
  To: Hugo Kuo; +Cc: Darrell Bishop, xfs

On Tue, Apr 19, 2016 at 05:56:19PM +0800, Hugo Kuo wrote:
> Hi XFS team,
> 
> We encountered a problem frequently in past three weeks. Our daemons store
> data to XFS partition associate with xattr.
> 
> Disk seems not responding since all processes to this disk in D state and
> can't be killed at all.
> 
>    - It happens on several disks. I feel it's randomly.
>    - Reboot seems solve the problem temporarily.
>    - All disks are multipath devices.
> 
> 
> I suspected that's an issue from disk corrupted at beginning. But smartctl
> doesn't show any clue about disk bad. And reboot makes the problem gone
> away.
> 
> 
>    - Any process to this disk is blocked. Even a simple $ls . Kernel log
>    <https://gist.github.com/HugoKuo/f87748786b26ea04fd9e1d86d9538293>

Looks like it's waiting on an AGF buffer. The buffer could be held by
something else, but we don't have enough information from that one
trace. Could you get all of the blocked tasks when in this state (e.g.,
"echo w > /proc/sysrq-trigger")?

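For reference, a minimal sketch of one way to list the tasks currently in
D state alongside the sysrq dump requested above. The function name
filter_d_state is hypothetical (it is not from this thread), and the sketch
assumes a procps-style `ps`; it only filters `ps` output and does not touch
the kernel.

```shell
#!/bin/sh
# Keep only tasks whose state starts with "D" (uninterruptible sleep).
# Expects `ps -eo pid,stat,wchan:32,comm` output on stdin, header included.
# Prints: PID, wait channel, command name (first word only).
filter_d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3, $4 }'
}

# Typical use (left commented so sourcing this block has no side effects):
#   ps -eo pid,stat,wchan:32,comm | filter_d_state
#   echo w > /proc/sysrq-trigger    # as root; the traces then appear in dmesg
```

The `ps` listing only shows which tasks are blocked and roughly where; the
sysrq-w output is still what provides the full kernel stack of each one.
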
>    - I tested the disk by read bytes on block via $dd . It works fine
>    without any error in dmesg.
>    - The `xfs_repair -n` output of a problematic mount point [xfs_repair -n]
>    <https://gist.github.com/HugoKuo/76f65bdc0b860ca6ed5e786f8c43da0e> . It
>    is still processing.

I presume this was run after a forced reboot..? If so, was the
filesystem remounted first to replay the log? (xfs_repair -n doesn't
detect/warn about a dirty log, iirc.) If the log was dirty, then repair
is a bit less interesting simply because some corruption is to be
expected in that scenario.

>    - Kernel : Linux node9 2.6.32-573.8.1.el6.x86_64 #1 SMP Tue Nov 10
>    18:01:38 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>    - OS : CentOS release 6.5 (Final)
>    - XFS : xfsprogs.x86_64         3.1.1-14.el6
> 
> 
> There's an interesting behaviour of $ls command.
> 
> * This is completed in 1sec. Very quick and give me the result in the
> test.d864 file $ls /srv/node/d864/tmp > test.d864
> * This is hanging $ls /srv/node/d864/tmp
> 

I'm not following you here. Are you missing an attachment (test.d864)?

Brian

* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-19 11:30 ` Brian Foster
@ 2016-04-19 13:24   ` Hugo Kuo
  2016-04-19 19:34     ` Brian Foster
  0 siblings, 1 reply; 8+ messages in thread
From: Hugo Kuo @ 2016-04-19 13:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: Darrell Bishop, xfs


Hi Brian,

Here's a gist that includes the sysrq-trigger output and an strace of one
of the hanging `ls` commands. This is from another problematic disk (d817)
on the same server.

https://gist.github.com/HugoKuo/8eb8208bbb7a7f562a6c9a3eafa8f37f

It looks like the hanging `ls` is stuck getting an extended attribute of a
file on this disk. The full output can be found in the link above.

lstat("/srv/node/d864/tmp/tmpIRYFaW", {st_mode=S_IFREG|0600, st_size=0,
...}) = 0
capget(0x20080522, 0, NULL) = -1 EFAULT (Bad address)
getxattr("/srv/node/d864/tmp/tmpIRYFaW", "security.capability"


As for the xfs_repair output at
https://gist.github.com/HugoKuo/76f65bdc0b860ca6ed5e786f8c43da0e : you
asked whether the node had been force-rebooted. The answer is no. I have
not rebooted this server yet. I force-unmounted it via `umount -l <dev>`
and then ran xfs_repair.

$ls /srv/node/d864/tmp > test.d864
$ls /srv/node/d864/tmp


Here's the contents of test.d864
https://gist.github.com/HugoKuo/25f93cd6daf5b0666a2ab85defd63a56

Thanks // Hugo


* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-19 13:24   ` Hugo Kuo
@ 2016-04-19 19:34     ` Brian Foster
  2016-04-20  5:49       ` Hugo Kuo
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Foster @ 2016-04-19 19:34 UTC (permalink / raw)
  To: Hugo Kuo; +Cc: Darrell Bishop, xfs

On Tue, Apr 19, 2016 at 09:24:55PM +0800, Hugo Kuo wrote:
> Hi Brain,
> 
> Here's the a gist include sysrq-trigger and strace of one of the hanging
> $ls result. This is from another problematic disk (d817) on the same server.
> 
> https://gist.github.com/HugoKuo/8eb8208bbb7a7f562a6c9a3eafa8f37f
> 
> It looks like the hanging $ls is stuck on getting extend attribute of a
> file on this disk. The full output can be found in the link above.
> 
> lstat("/srv/node/d864/tmp/tmpIRYFaW", {st_mode=S_IFREG|0600, st_size=0,
> ...}) = 0
> capget(0x20080522, 0, NULL) = -1 EFAULT (Bad address)
> getxattr("/srv/node/d864/tmp/tmpIRYFaW", "security.capability"
> 

So there's definitely some traces waiting on AGF locks and whatnot, but
also many traces that appear to be waiting on I/O. For example:

kernel: swift-object- D 0000000000000008     0  2096   1605 0x00000000
kernel: ffff8877cc2378b8 0000000000000082 ffff8877cc237818 ffff887ff016eb68
kernel: ffff883fd4ab6b28 0000000000000046 ffff883fd4bd9400 00000001e7ea49d0
kernel: ffff8877cc237848 ffffffff812735d1 ffff885fa2e4a5f8 ffff8877cc237fd8
kernel: Call Trace:
kernel: [<ffffffff812735d1>] ? __blk_run_queue+0x31/0x40
kernel: [<ffffffff81539455>] schedule_timeout+0x215/0x2e0
kernel: [<ffffffff812757c9>] ? blk_peek_request+0x189/0x210
kernel: [<ffffffff8126d9b3>] ? elv_queue_empty+0x33/0x40
kernel: [<ffffffffa00040a0>] ? dm_request_fn+0x240/0x340 [dm_mod]
kernel: [<ffffffff815390d3>] wait_for_common+0x123/0x180
kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
kernel: [<ffffffffa0001036>] ? dm_unplug_all+0x36/0x50 [dm_mod]
kernel: [<ffffffffa0415b56>] ? _xfs_buf_read+0x46/0x60 [xfs]
kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
kernel: [<ffffffff815391ed>] wait_for_completion+0x1d/0x20
kernel: [<ffffffffa041503b>] xfs_buf_iowait+0x9b/0x100 [xfs]
kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
kernel: [<ffffffffa0415b56>] _xfs_buf_read+0x46/0x60 [xfs]
kernel: [<ffffffffa0415c1b>] xfs_buf_read+0xab/0x100 [xfs]
...

Are all of these swift processes running against independent storage, or
one big array? Also, can you tell (e.g., with iotop) whether progress is
being made here, albeit very slowly, or if the storage is indeed locked
up..?
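
As a rough complement to iotop, here is a sketch that compares two samples
of /sys/block/<dev>/stat to see whether any request completed in between.
The function name io_progress and the device name dm-3 are hypothetical;
the field positions (1 = reads completed, 5 = writes completed, 9 = I/Os
currently in flight) follow the kernel's block stat layout.

```shell
#!/bin/sh
# Print "completed=<n> in_flight=<n>" from two samples of a block stat file.
# $1 = earlier sample line, $2 = later sample line.
io_progress() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { prev = $1 + $5 }
        NR == 2 { printf "completed=%d in_flight=%d\n", ($1 + $5) - prev, $9 }'
}

# Typical use (dm-3 is a placeholder for the multipath device):
#   a=$(cat /sys/block/dm-3/stat); sleep 10; b=$(cat /sys/block/dm-3/stat)
#   io_progress "$a" "$b"
```

A result with completed=0 and a non-zero in_flight that never changes would
be consistent with stuck requests rather than slow ones, though it does not
say which layer is holding them.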

In any event, given the I/O hangs, the fact that you're on an old distro
kernel and you have things like multipath enabled, it might be
worthwhile to see if you can rule out any multipath issues.

> 
> As for the xfs_repair output in link
> https://gist.github.com/HugoKuo/76f65bdc0b860ca6ed5e786f8c43da0e . Your
> question is if the node been force rebooted. The answer is NO.   I* didn't
> reboot* this server yet. I force unmounted it via *$umount -l <dev>* . Then
> run the xfs_repair.
>  

'umount -l' doesn't necessarily force anything. It just lazily unmounts
the fs from the namespace and cleans up the mount once all references
are dropped. I suspect the fs is still mounted internally.

Brian

* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-19 19:34     ` Brian Foster
@ 2016-04-20  5:49       ` Hugo Kuo
  2016-04-20 11:24         ` Brian Foster
  0 siblings, 1 reply; 8+ messages in thread
From: Hugo Kuo @ 2016-04-20  5:49 UTC (permalink / raw)
  To: Brian Foster; +Cc: Darrell Bishop, xfs


Hi XFS team,


Here's the lsof output, grouped, for the open files on the problematic
disks. The full log of xfs_repair -n is included in this gist as well. At
the end of its output, xfs_repair recommends contacting the xfs mailing
list.

https://gist.github.com/HugoKuo/95613d7864aa0a1343615642b3309451

Perhaps I should go ahead and reboot the machine and run xfs_repair
again.  Please find my answers inline.


On Wed, Apr 20, 2016 at 3:34 AM, Brian Foster <bfoster@redhat.com> wrote:

>
> So there's definitely some traces waiting on AGF locks and whatnot, but
> also many traces that appear to be waiting on I/O. For example:
>

Yes, those I/O waits are the original problem of this thread. It looks
like the disk was locked up. All of these I/Os are waiting on the same disk
(a multipath entry).


>
> kernel: swift-object- D 0000000000000008     0  2096   1605 0x00000000
> kernel: ffff8877cc2378b8 0000000000000082 ffff8877cc237818 ffff887ff016eb68
> kernel: ffff883fd4ab6b28 0000000000000046 ffff883fd4bd9400 00000001e7ea49d0
> kernel: ffff8877cc237848 ffffffff812735d1 ffff885fa2e4a5f8 ffff8877cc237fd8
> kernel: Call Trace:
> kernel: [<ffffffff812735d1>] ? __blk_run_queue+0x31/0x40
> kernel: [<ffffffff81539455>] schedule_timeout+0x215/0x2e0
> kernel: [<ffffffff812757c9>] ? blk_peek_request+0x189/0x210
> kernel: [<ffffffff8126d9b3>] ? elv_queue_empty+0x33/0x40
> kernel: [<ffffffffa00040a0>] ? dm_request_fn+0x240/0x340 [dm_mod]
> kernel: [<ffffffff815390d3>] wait_for_common+0x123/0x180
> kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
> kernel: [<ffffffffa0001036>] ? dm_unplug_all+0x36/0x50 [dm_mod]
> kernel: [<ffffffffa0415b56>] ? _xfs_buf_read+0x46/0x60 [xfs]
> kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
> kernel: [<ffffffff815391ed>] wait_for_completion+0x1d/0x20
> kernel: [<ffffffffa041503b>] xfs_buf_iowait+0x9b/0x100 [xfs]
> kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
> kernel: [<ffffffffa0415b56>] _xfs_buf_read+0x46/0x60 [xfs]
> kernel: [<ffffffffa0415c1b>] xfs_buf_read+0xab/0x100 [xfs]
>
>
> Are all of these swift processes running against independent storage, or
> one big array? Also, can you tell (e.g., with iotop) whether progress is
> being made here, albiet very slowly, or if the storage is indeed locked
> up..?
>
> There're 240+ swift processes in running.
All stuck swift processes were attempting to access the same disk.  I can
confirm it's indeed locked up rather than slow: monitoring I/O via iotop
shows zero activity on the problematic mount point.


> In any event, given the I/O hangs, the fact that you're on an old distro
> kernel and you have things like multipath enabled, it might be
> worthwhile to see if you can rule out any multipath issues.
>
>
Upgrading the kernel on CentOS 6.5 may not be an option for the time being,
but it is definitely worth trying on one of the nodes later. As for
multipath, yes, I did suspect some mysterious problem with multipath + XFS
under a certain load. But it looked more like an XFS, inode-related issue,
hence I started investigating from the XFS side. If there's no way to move
forward in XFS, I might break up the multipath and observe the result for
a while.


>
> 'umount -l' doesn't necessarily force anything. It just lazily unmounts
> the fs from the namespace and cleans up the mount once all references
> are dropped. I suspect the fs is still mounted internally.
>
> Brian
>
>
Thanks // Hugo


* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-20  5:49       ` Hugo Kuo
@ 2016-04-20 11:24         ` Brian Foster
  2016-04-21  5:54           ` Hugo Kuo
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Foster @ 2016-04-20 11:24 UTC (permalink / raw)
  To: Hugo Kuo; +Cc: Darrell Bishop, xfs

On Wed, Apr 20, 2016 at 01:49:49PM +0800, Hugo Kuo wrote:
> Hi XFS team,
> 
> 
> Here's the lsof output of the grouped result of any openfile happens on
> problematic disks. The full log of xfs_repair -n is included in this gist
> as well. The xfs_repair recommend to contact xfs mailing list in the end of
> the command.
> 
> https://gist.github.com/HugoKuo/95613d7864aa0a1343615642b3309451
> 
> Perhaps I should go ahead to reboot the machine and run the xfs_repair
> again.  Please find my answers inlines.
> 

Yes, repair is crashing in this case. Best to try xfs_repair after
you've rebooted and mounted/umounted the fs to replay the log. If it's
still crashing at that point, we'll probably want a metadata image of
the fs, if possible (though there's a good chance a newer xfsprogs has
the problem fixed).
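
The sequence described above can be sketched as a dry run that only prints
the commands for one device. The function name repair_plan, the device and
mount point in the usage line, and the .metadump file name are all
hypothetical; xfs_metadump is suggested here as one way to produce a
metadata image, which is an assumption about what would be requested.

```shell
#!/bin/sh
# Print (do not run) the post-reboot check sequence for one device.
# $1 = block device, $2 = mount point; both supplied by the caller.
repair_plan() {
    dev=$1; mnt=$2
    printf 'mount %s %s\n' "$dev" "$mnt"      # mounting replays a dirty log
    printf 'umount %s\n' "$mnt"               # regular unmount, not "umount -l"
    printf 'xfs_repair -n %s\n' "$dev"        # no-modify check on the clean fs
    printf 'xfs_metadump %s %s.metadump\n' "$dev" "$(basename "$dev")"
}

# Typical use (placeholder names):
#   repair_plan /dev/mapper/mpatha /srv/node/d864
```

Printing the plan first makes it easy to review the device names before
running anything against a production disk.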

> 
> On Wed, Apr 20, 2016 at 3:34 AM, Brian Foster <bfoster@redhat.com> wrote:
> 
> >
> > So there's definitely some traces waiting on AGF locks and whatnot, but
> > also many traces that appear to be waiting on I/O. For example:
> >
> 
> Yes, those I/O waiting is the original problem of this thread. It looks
> like the disk was locked. All these I/O waiting for same disk (a multipath
> entry).
> 
> 
> >
> > kernel: swift-object- D 0000000000000008     0  2096   1605 0x00000000
> > kernel: ffff8877cc2378b8 0000000000000082 ffff8877cc237818 ffff887ff016eb68
> > kernel: ffff883fd4ab6b28 0000000000000046 ffff883fd4bd9400 00000001e7ea49d0
> > kernel: ffff8877cc237848 ffffffff812735d1 ffff885fa2e4a5f8 ffff8877cc237fd8
> > kernel: Call Trace:
> > kernel: [<ffffffff812735d1>] ? __blk_run_queue+0x31/0x40
> > kernel: [<ffffffff81539455>] schedule_timeout+0x215/0x2e0
> > kernel: [<ffffffff812757c9>] ? blk_peek_request+0x189/0x210
> > kernel: [<ffffffff8126d9b3>] ? elv_queue_empty+0x33/0x40
> > kernel: [<ffffffffa00040a0>] ? dm_request_fn+0x240/0x340 [dm_mod]
> > kernel: [<ffffffff815390d3>] wait_for_common+0x123/0x180
> > kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
> > kernel: [<ffffffffa0001036>] ? dm_unplug_all+0x36/0x50 [dm_mod]
> > kernel: [<ffffffffa0415b56>] ? _xfs_buf_read+0x46/0x60 [xfs]
> > kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
> > kernel: [<ffffffff815391ed>] wait_for_completion+0x1d/0x20
> > kernel: [<ffffffffa041503b>] xfs_buf_iowait+0x9b/0x100 [xfs]
> > kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
> > kernel: [<ffffffffa0415b56>] _xfs_buf_read+0x46/0x60 [xfs]
> > kernel: [<ffffffffa0415c1b>] xfs_buf_read+0xab/0x100 [xfs]
> >
> >
> > Are all of these swift processes running against independent storage, or
> > one big array? Also, can you tell (e.g., with iotop) whether progress is
> > being made here, albiet very slowly, or if the storage is indeed locked
> > up..?
> >
> > There're 240+ swift processes in running.
> All stuck swift processes were attempting to access same disk.  I can
> confirm it's indeed locked rather than slowly. By monitoring io via iotop.
> There's 0 activity one the problematic mount point.
> 
> 
> > In any event, given the I/O hangs, the fact that you're on an old distro
> > kernel and you have things like multipath enabled, it might be
> > worthwhile to see if you can rule out any multipath issues.
> >
> >
> To upgrade the kernel for CentOS6.5 may not the option for the time being
> but it definitely worth to give it try by picking up one of nodes for
> testing later. As for the multipath, yes I did suspect some mystery problem
> with multipath + XFS under a certain loading. But it's more like a XFS and
> inode related hence I start to investigate from XFS. If there's no chance
> to move forward in XFS, I might break the multipath and observe the result
> for awhile.
> 

It's hard to pinpoint something to the fs when there's a bunch of hung
I/Os. You probably want to track down the source of those problems
first.

Brian

* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-20 11:24         ` Brian Foster
@ 2016-04-21  5:54           ` Hugo Kuo
  2016-04-21 12:40             ` Brian Foster
  0 siblings, 1 reply; 8+ messages in thread
From: Hugo Kuo @ 2016-04-21  5:54 UTC (permalink / raw)
  To: Brian Foster; +Cc: Darrell Bishop, xfs


Hi Brian,

Here's the result of xfs_repair on the same disk after rebooting:
https://gist.github.com/HugoKuo/e1d683d9653e66a80dfcfcbee4294fe8
It looks normal.

We rebooted the server and there has been no hanging process in the past
12 hours. I'll keep an eye on the server.
I know it would be helpful to run trace-cmd for xfs, but there are too many
xfs operations happening on the server. It's crazy, about 100MB of data per
second, so I stopped trace-cmd. As you said, we need to find the source of
the problem, and trace-cmd would be a nice option. Is there a way to flush
the recorded data if nothing has happened in the past few hours?

trace-cmd record -e xfs\*



Regards // Hugo





* Re: [XFS] Any process to a particular XFS device hung in D state forever.
  2016-04-21  5:54           ` Hugo Kuo
@ 2016-04-21 12:40             ` Brian Foster
  0 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2016-04-21 12:40 UTC (permalink / raw)
  To: Hugo Kuo; +Cc: Darrell Bishop, xfs

On Thu, Apr 21, 2016 at 01:54:44PM +0800, Hugo Kuo wrote:
> Hi Brian,
> 
> Here's the result of xfs_repair on the same disk after rebooting.
> https://gist.github.com/HugoKuo/e1d683d9653e66a80dfcfcbee4294fe8
> It's looks normal.
> 
> We rebooted the server and no hanging process in past 12hrs. I'll keep eyes
> on the server.
> I know it's helpful to to trace-cmd for xfs. But there's too may xfs
> operations are happening in the server. It's crazy like 100MB data per
> second. I stopped the trace-cmd. As you said, we need to find out the
> source of the problem. trace-cmd would be a nice option. Is there a way to
> flush recored data if nothing happens in the past hours ?
> 

I'm not sure XFS trace data will help you, at least until we've narrowed
down to something that looks like an XFS problem. I wonder if multipath
has any sort of tracing support..? It doesn't appear so on a quick look,
but dm.c has some trace_block_bio*()/trace_block_rq*() tracepoints that
might be useful enough to show whether requests are actually completing.

To answer your question, 'trace-cmd record' writes to a local trace.dat
file so you don't have to worry about pulling events from the kernel
yourself (i.e., as with 'trace-cmd start'). Just make sure you sync or
'xfs_io -c fsync trace.dat' before you force a hard reset, if necessary.
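
For reference, the fsync step can also be done from a script. This is a
minimal sketch, assuming Linux (where fsync on a read-only descriptor is
accepted); the helper name fsync_file is made up for this example.

```python
import os


def fsync_file(path):
    """Ask the kernel to flush the file's data and metadata to stable storage."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # blocks until the device reports the write as done
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Only flush if a trace file exists in the current directory.
    if os.path.exists("trace.dat"):
        fsync_file("trace.dat")
```

Note that on a device whose I/O is hung, the fsync itself may block, so
write trace.dat to a disk that is not affected.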

Brian

> trace-cmd record -e xfs\*
> 
> 
> 
> Regards // Hugo
> 
> 
> 
> 
> On Wed, Apr 20, 2016 at 7:24 PM, Brian Foster <bfoster@redhat.com> wrote:
> 
> > On Wed, Apr 20, 2016 at 01:49:49PM +0800, Hugo Kuo wrote:
> > > Hi XFS team,
> > >
> > >
> > > Here's the lsof output, grouped by open files on the problematic
> > > disks. The full log of xfs_repair -n is included in this gist as
> > > well. At the end of its output, xfs_repair recommends contacting the
> > > xfs mailing list.
> > >
> > > https://gist.github.com/HugoKuo/95613d7864aa0a1343615642b3309451
> > >
> > > Perhaps I should go ahead and reboot the machine and run xfs_repair
> > > again.  Please find my answers inline.
> > >
> >
> > Yes, repair is crashing in this case. Best to try xfs_repair after
> > you've rebooted and mounted/umounted the fs to replay the log. If it's
> > still crashing at that point, we'll probably want a metadata image of
> > the fs, if possible (though there's a good chance a newer xfsprogs has
> > the problem fixed).
> >
> > >
> > > On Wed, Apr 20, 2016 at 3:34 AM, Brian Foster <bfoster@redhat.com> wrote:
> > >
> > > >
> > > > So there's definitely some traces waiting on AGF locks and whatnot, but
> > > > also many traces that appear to be waiting on I/O. For example:
> > > >
> > >
> > > Yes, those I/O waits are the original problem of this thread. It looks
> > > like the disk was locked. All of these I/Os are waiting on the same
> > > disk (a multipath entry).
> > >
> > >
> > > >
> > > > kernel: swift-object- D 0000000000000008     0  2096   1605 0x00000000
> > > > kernel: ffff8877cc2378b8 0000000000000082 ffff8877cc237818 ffff887ff016eb68
> > > > kernel: ffff883fd4ab6b28 0000000000000046 ffff883fd4bd9400 00000001e7ea49d0
> > > > kernel: ffff8877cc237848 ffffffff812735d1 ffff885fa2e4a5f8 ffff8877cc237fd8
> > > > kernel: Call Trace:
> > > > kernel: [<ffffffff812735d1>] ? __blk_run_queue+0x31/0x40
> > > > kernel: [<ffffffff81539455>] schedule_timeout+0x215/0x2e0
> > > > kernel: [<ffffffff812757c9>] ? blk_peek_request+0x189/0x210
> > > > kernel: [<ffffffff8126d9b3>] ? elv_queue_empty+0x33/0x40
> > > > kernel: [<ffffffffa00040a0>] ? dm_request_fn+0x240/0x340 [dm_mod]
> > > > kernel: [<ffffffff815390d3>] wait_for_common+0x123/0x180
> > > > kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
> > > > kernel: [<ffffffffa0001036>] ? dm_unplug_all+0x36/0x50 [dm_mod]
> > > > kernel: [<ffffffffa0415b56>] ? _xfs_buf_read+0x46/0x60 [xfs]
> > > > kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
> > > > kernel: [<ffffffff815391ed>] wait_for_completion+0x1d/0x20
> > > > kernel: [<ffffffffa041503b>] xfs_buf_iowait+0x9b/0x100 [xfs]
> > > > kernel: [<ffffffffa040b417>] ? xfs_trans_read_buf+0x197/0x410 [xfs]
> > > > kernel: [<ffffffffa0415b56>] _xfs_buf_read+0x46/0x60 [xfs]
> > > > kernel: [<ffffffffa0415c1b>] xfs_buf_read+0xab/0x100 [xfs]
> > > >
> > > >
> > > > Are all of these swift processes running against independent storage,
> > > > or one big array? Also, can you tell (e.g., with iotop) whether
> > > > progress is being made here, albeit very slowly, or if the storage is
> > > > indeed locked up..?
> > > >
> > > There are 240+ swift processes running.
> > > All stuck swift processes were attempting to access the same disk. I
> > > can confirm, by monitoring I/O via iotop, that it's indeed locked up
> > > rather than slow. There's 0 activity on the problematic mount point.
> > >
> > >
> > > > In any event, given the I/O hangs, the fact that you're on an old
> > > > distro kernel and you have things like multipath enabled, it might be
> > > > worthwhile to see if you can rule out any multipath issues.
> > > >
> > > >
> > > Upgrading the kernel on CentOS 6.5 may not be an option for the time
> > > being, but it's definitely worth a try on one of the nodes for testing
> > > later. As for multipath, yes, I did suspect some mysterious problem
> > > with multipath + XFS under a certain load. But it looked more like an
> > > XFS and inode related issue, hence I started investigating from the
> > > XFS side. If there's no way to move forward in XFS, I might break the
> > > multipath and observe the result for a while.
> > >
> >
> > It's hard to pinpoint something to the fs when there's a bunch of hung
> > I/Os. You probably want to track down the source of those problems
> > first.
> >
> > Brian
> >
> > >
> > > >
> > > > 'umount -l' doesn't necessarily force anything. It just lazily unmounts
> > > > the fs from the namespace and cleans up the mount once all references
> > > > are dropped. I suspect the fs is still mounted internally.
> > > >
> > > > Brian
> > > >
> > > >
> > > Thanks // Hugo
> >


end of thread

Thread overview: 8+ messages
2016-04-19  9:56 [XFS] Any process to a particular XFS device hung in D state forever Hugo Kuo
2016-04-19 11:30 ` Brian Foster
2016-04-19 13:24   ` Hugo Kuo
2016-04-19 19:34     ` Brian Foster
2016-04-20  5:49       ` Hugo Kuo
2016-04-20 11:24         ` Brian Foster
2016-04-21  5:54           ` Hugo Kuo
2016-04-21 12:40             ` Brian Foster