* need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
@ 2013-04-09 12:53 符永涛
2013-04-09 13:03 ` 符永涛
2013-04-09 15:06 ` Eric Sandeen
0 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 12:53 UTC (permalink / raw)
To: xfs
Dear XFS experts,
I would sincerely appreciate your help. In our production environment we
run GlusterFS on top of XFS on Dell x720D servers (RAID 6), and the XFS
filesystem shuts down on some of the servers roughly every two weeks.
Can you give me a direction on how to debug this issue and how to avoid
it? Thank you very much!
uname -a
Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
#1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
The shutdown log is the same every time, as follows:
1038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
xfs_inotobp() returned error 22.
1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
returned error 22
1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
xfs_do_force_shutdown(0x1) called from line 1184 of file
fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
Shutting down filesystem
1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
filesystem and rectify the problem(s)
1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
--
符永涛
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 12:53 need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22 符永涛
@ 2013-04-09 13:03 ` 符永涛
2013-04-09 13:05 ` 符永涛
2013-04-09 22:16 ` Michael L. Semon
2013-04-09 15:06 ` Eric Sandeen
1 sibling, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 13:03 UTC (permalink / raw)
To: xfs, yongtaofu
BTW
xfs_info /dev/sdb
meta-data=/dev/sdb isize=256 agcount=28, agsize=268435440 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=7324303360, imaxpct=5
= sunit=16 swidth=160 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=16 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
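As a quick sanity check, the geometry above is internally consistent. A small sketch with values copied from the xfs_info output; note the spindle count is an inference from sunit/swidth, not something xfs_info reports directly:

```python
# Geometry reported by xfs_info above (data section).
BSIZE = 4096              # bytes per filesystem block
BLOCKS = 7324303360       # data blocks
SUNIT, SWIDTH = 16, 160   # RAID stripe unit/width, in blocks

print(round(BLOCKS * BSIZE / 2**40, 1), "TiB")   # ~27.3 TiB filesystem
print(SUNIT * BSIZE // 1024, "KiB stripe unit")  # 64 KiB
print(SWIDTH // SUNIT, "data spindles")          # 10, plausible for RAID 6
```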
2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> Dear xfs experts,
> I really need your help sincerely!!! In our production enviroment we
> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
> system crash on some of the server frequently about every two weeks.
> Can you help to give me a direction about how to debug this issue and
> how to avoid it? Thank you very very much!
>
> uname -a
> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> Every time the crash log is same, as following
>
> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> xfs_inotobp() returned error 22.
> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> returned error 22
> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
> xfs_do_force_shutdown(0x1) called from line 1184 of file
> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
> Shutting down filesystem
> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
> filesystem and rectify the problem(s)
> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 13:03 ` 符永涛
@ 2013-04-09 13:05 ` 符永涛
2013-04-09 14:52 ` Ben Myers
2013-04-09 22:16 ` Michael L. Semon
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-09 13:05 UTC (permalink / raw)
To: xfs, yongtaofu
Also, I want to know why all the servers fail with the same stack.
Thank you, I really need your help.
2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> BTW
> xfs_info /dev/sdb
> meta-data=/dev/sdb isize=256 agcount=28, agsize=268435440
> blks
> = sectsz=512 attr=2
> data = bsize=4096 blocks=7324303360, imaxpct=5
> = sunit=16 swidth=160 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal bsize=4096 blocks=521728, version=2
> = sectsz=512 sunit=16 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
> 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
>> Dear xfs experts,
>> I really need your help sincerely!!! In our production enviroment we
>> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
>> system crash on some of the server frequently about every two weeks.
>> Can you help to give me a direction about how to debug this issue and
>> how to avoid it? Thank you very very much!
>>
>> uname -a
>> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
>> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Every time the crash log is same, as following
>>
>> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> xfs_inotobp() returned error 22.
>> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> returned error 22
>> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
>> xfs_do_force_shutdown(0x1) called from line 1184 of file
>> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
>> Shutting down filesystem
>> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
>> filesystem and rectify the problem(s)
>> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>>
>> --
>> 符永涛
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 13:05 ` 符永涛
@ 2013-04-09 14:52 ` Ben Myers
2013-04-09 15:00 ` 符永涛
2013-04-09 15:07 ` 符永涛
0 siblings, 2 replies; 60+ messages in thread
From: Ben Myers @ 2013-04-09 14:52 UTC (permalink / raw)
To: 符永涛; +Cc: xfs
Hey Yongtaofu,
On Tue, Apr 09, 2013 at 09:05:32PM +0800, 符永涛 wrote:
> Also I want to know why all the server, all crash with the same crash stack?
> Thank you, really need your help.
What you've posted so far looks like evidence of a forced shutdown and not a
crash. Is there a crash in addition to this forced shutdown? If so, can you
post the stack for that too?
>
> 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> > BTW
> > xfs_info /dev/sdb
> > meta-data=/dev/sdb isize=256 agcount=28, agsize=268435440
> > blks
> > = sectsz=512 attr=2
> > data = bsize=4096 blocks=7324303360, imaxpct=5
> > = sunit=16 swidth=160 blks
> > naming =version 2 bsize=4096 ascii-ci=0
> > log =internal bsize=4096 blocks=521728, version=2
> > = sectsz=512 sunit=16 blks, lazy-count=1
> > realtime =none extsz=4096 blocks=0, rtextents=0
> >
> > 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> >> Dear xfs experts,
> >> I really need your help sincerely!!! In our production enviroment we
> >> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
> >> system crash on some of the server frequently about every two weeks.
> >> Can you help to give me a direction about how to debug this issue and
> >> how to avoid it? Thank you very very much!
> >>
> >> uname -a
> >> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> >> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> Every time the crash log is same, as following
An initial guess is that somehow it is looking up a bad inode number, e.g. it
is beyond the end of the filesystem and xfs_dilocate returns EINVAL.
You could 'xfs_repair -n' to see what it finds (without modifying the
filesystem) as a first step.
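That guess can be illustrated with a rough sketch of the range check that would make xfs_dilocate return EINVAL (22). This is not the kernel code; it's a hypothetical model, with the bit widths derived from the xfs_info output in this thread (isize=256, bsize=4096, agcount=28, agsize=268435440 blocks):

```python
# An XFS inode number encodes (AG number | block within AG | inode
# within block); a number whose AG or block lands outside the
# filesystem is rejected with EINVAL.
AGCOUNT = 28
AGSIZE_BLOCKS = 268435440
INODES_PER_BLOCK = 4096 // 256                 # 16 inodes per block
INOPBLOG = INODES_PER_BLOCK.bit_length() - 1   # 4 bits
AGBLKLOG = (AGSIZE_BLOCKS - 1).bit_length()    # 28 bits

def ino_is_valid(ino: int) -> bool:
    agno = ino >> (AGBLKLOG + INOPBLOG)                # AG number
    agbno = (ino >> INOPBLOG) & ((1 << AGBLKLOG) - 1)  # block within AG
    return agno < AGCOUNT and agbno < AGSIZE_BLOCKS

print(ino_is_valid(4046848))   # an inode reported later in this thread: True
print(ino_is_valid(1 << 40))   # beyond the end of the filesystem: False
```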
> >> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> >> xfs_inotobp() returned error 22.
Were there any lines of output before this? In some codebases there are prints
in xfs_inotobp that would help show what happened.
> >> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> >> returned error 22
> >> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
> >> xfs_do_force_shutdown(0x1) called from line 1184 of file
> >> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
> >> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
> >> Shutting down filesystem
> >> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
> >> filesystem and rectify the problem(s)
> >> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> >> returned.
> >> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> >> returned.
> >> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> >> returned.
> >> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> >> returned.
The error 5 (EIO) messages look scary, but they are a consequence of the
forced shutdown; don't worry about them.
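For reference, the numeric codes map to standard errno names; a one-liner to decode them, assuming ordinary Linux errno values (which is what XFS returns here):

```python
# Translate the numeric error codes from the kernel log.
import errno
import os

for code in (22, 5):
    print(code, errno.errorcode[code], os.strerror(code))
# 22 -> EINVAL, the xfs_inotobp/xfs_ifree failure
# 5  -> EIO, emitted by xfs_log_force after the shutdown
```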
Thanks,
Ben
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 14:52 ` Ben Myers
@ 2013-04-09 15:00 ` 符永涛
2013-04-09 15:07 ` 符永涛
1 sibling, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:00 UTC (permalink / raw)
To: Ben Myers; +Cc: xfs
No crash, just the forced shutdown. Thank you for replying.
2013/4/9, Ben Myers <bpm@sgi.com>:
> Hey Yongtaofu,
>
> On Tue, Apr 09, 2013 at 09:05:32PM +0800, 符永涛 wrote:
>> Also I want to know why all the server, all crash with the same crash
>> stack?
>> Thank you, really need your help.
>
> What you've posted so far looks like evidence of a forced shutdown and not
> a
> crash. Is there a crash in addition to this forced shutdown? If so, can
> you
> post the stack for that too?
>
>>
>> 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
>> > BTW
>> > xfs_info /dev/sdb
>> > meta-data=/dev/sdb isize=256 agcount=28,
>> > agsize=268435440
>> > blks
>> > = sectsz=512 attr=2
>> > data = bsize=4096 blocks=7324303360,
>> > imaxpct=5
>> > = sunit=16 swidth=160 blks
>> > naming =version 2 bsize=4096 ascii-ci=0
>> > log =internal bsize=4096 blocks=521728, version=2
>> > = sectsz=512 sunit=16 blks,
>> > lazy-count=1
>> > realtime =none extsz=4096 blocks=0, rtextents=0
>> >
>> > 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
>> >> Dear xfs experts,
>> >> I really need your help sincerely!!! In our production enviroment we
>> >> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
>> >> system crash on some of the server frequently about every two weeks.
>> >> Can you help to give me a direction about how to debug this issue and
>> >> how to avoid it? Thank you very very much!
>> >>
>> >> uname -a
>> >> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
>> >> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>> >>
>> >> Every time the crash log is same, as following
>
> An initial guess is that somehow it is looking up a bad inode number, e.g.
> it
> is beyond the end of the filesystem and xfs_dilocate returns EINVAL.
>
> You could 'xfs_repair -n' to see what it finds (without modifying the
> filesystem) as a first step.
>
>> >> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> >> xfs_inotobp() returned error 22.
>
> Were there any lines of output before this? In some codebases there are
> prints
> in xfs_inotobp that would help show what happened.
>
>> >> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> >> returned error 22
>> >> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
>> >> xfs_do_force_shutdown(0x1) called from line 1184 of file
>> >> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>> >> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
>> >> Shutting down filesystem
>> >> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
>> >> filesystem and rectify the problem(s)
>> >> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> >> returned.
>> >> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> >> returned.
>> >> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> >> returned.
>> >> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> >> returned.
>
> The error 5 (EIO) look scary but they are due to the forced shutdown, don't
> worry about them.
>
> Thanks,
> Ben
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 12:53 need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22 符永涛
2013-04-09 13:03 ` 符永涛
@ 2013-04-09 15:06 ` Eric Sandeen
2013-04-09 15:18 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-09 15:06 UTC (permalink / raw)
To: 符永涛; +Cc: xfs
On 4/9/13 7:53 AM, 符永涛 wrote:
> Dear xfs experts,
> I really need your help sincerely!!! In our production enviroment we
> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
> system crash on some of the server frequently about every two weeks.
> Can you help to give me a direction about how to debug this issue and
> how to avoid it? Thank you very very much!
So this happens reliably, but infrequently? (only every 2 weeks or so?)
Can you provoke it any more often?
> uname -a
> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
That's a RHEL6 kernel; I'm assuming that this is a RHEL clone w/o RH support?
I agree with Ben that I'd like to see xfs_repair output.
Since the fs has shut down, you should unmount, remount, and unmount
again to replay the dirty log. Then do xfs_repair -n, and provide the output
if it discovers any errors.
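That sequence can be sketched as a script. This is a hedged outline, not a tested procedure: /dev/sdb and /mnt/brick are placeholder names, and the run helper defaults to a dry run that only prints each command:

```shell
# Placeholder device and mount point -- substitute the real ones.
DEV=/dev/sdb
MNT=/mnt/brick

# DRY_RUN=1 (the default here) just prints each step instead of running it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi; }

run umount "$MNT"            # fs has shut down; unmount it
run mount "$DEV" "$MNT"      # mounting replays the dirty log
run umount "$MNT"            # unmount again so repair sees a clean log
run xfs_repair -n "$DEV"     # -n: report problems, modify nothing
```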
Thanks,
-Eric
> Every time the crash log is same, as following
>
> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> xfs_inotobp() returned error 22.
> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> returned error 22
> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
> xfs_do_force_shutdown(0x1) called from line 1184 of file
> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
> Shutting down filesystem
> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
> filesystem and rectify the problem(s)
> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 14:52 ` Ben Myers
2013-04-09 15:00 ` 符永涛
@ 2013-04-09 15:07 ` 符永涛
2013-04-09 15:10 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:07 UTC (permalink / raw)
To: Ben Myers; +Cc: xfs
Before the XFS forced shutdown happens there seems to be no useful log in
/var/log/messages:
Apr 9 10:38:08 cqdx smbd[4597]: Unable to connect to CUPS server
localhost:631 - Connection refused
Apr 9 10:38:08 cqdx smbd[3394]: [2013/04/09 10:38:08.944125, 0]
printing/print_cups.c:468(cups_async_callback)
Apr 9 10:38:08 cqdx smbd[3394]: failed to retrieve printer list:
NT_STATUS_UNSUCCESSFUL
Apr 9 10:51:09 cqdx smbd[5205]: [2013/04/09 10:51:09.723610, 0]
printing/print_cups.c:109(cups_connect)
Apr 9 10:51:09 cqdx smbd[5205]: Unable to connect to CUPS server
localhost:631 - Connection refused
Apr 9 10:51:09 cqdx smbd[3394]: [2013/04/09 10:51:09.724132, 0]
printing/print_cups.c:468(cups_async_callback)
Apr 9 10:51:09 cqdx smbd[3394]: failed to retrieve printer list:
NT_STATUS_UNSUCCESSFUL
Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
returned error 22.
Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
error 22
Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
0xffffffffa02ee20a
Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down
filesystem
Apr 9 11:01:30 cqdx kernel: XFS (sdb): Please umount the filesystem and
rectify the problem(s)
Apr 9 11:01:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 9 11:02:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 9 11:02:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 9 11:03:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 9 11:03:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 9 11:03:57 cqdx init: tty (/dev/tty1) main process (3427) killed by
TERM signal
Apr 9 11:03:57 cqdx init: tty (/dev/tty2) main process (3429) killed by
TERM signal
2013/4/9 Ben Myers <bpm@sgi.com>
> Hey Yongtaofu,
>
> On Tue, Apr 09, 2013 at 09:05:32PM +0800, 符永涛 wrote:
> > Also I want to know why all the server, all crash with the same crash
> stack?
> > Thank you, really need your help.
>
> What you've posted so far looks like evidence of a forced shutdown and not
> a
> crash. Is there a crash in addition to this forced shutdown? If so, can
> you
> post the stack for that too?
>
> >
> > 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> > > BTW
> > > xfs_info /dev/sdb
> > > meta-data=/dev/sdb isize=256 agcount=28,
> agsize=268435440
> > > blks
> > > = sectsz=512 attr=2
> > > data = bsize=4096 blocks=7324303360,
> imaxpct=5
> > > = sunit=16 swidth=160 blks
> > > naming =version 2 bsize=4096 ascii-ci=0
> > > log =internal bsize=4096 blocks=521728, version=2
> > > = sectsz=512 sunit=16 blks,
> lazy-count=1
> > > realtime =none extsz=4096 blocks=0, rtextents=0
> > >
> > > 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> > >> Dear xfs experts,
> > >> I really need your help sincerely!!! In our production enviroment we
> > >> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
> > >> system crash on some of the server frequently about every two weeks.
> > >> Can you help to give me a direction about how to debug this issue and
> > >> how to avoid it? Thank you very very much!
> > >>
> > >> uname -a
> > >> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> > >> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> > >>
> > >> Every time the crash log is same, as following
>
> An initial guess is that somehow it is looking up a bad inode number, e.g.
> it
> is beyond the end of the filesystem and xfs_dilocate returns EINVAL.
>
> You could 'xfs_repair -n' to see what it finds (without modifying the
> filesystem) as a first step.
>
> > >> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > >> xfs_inotobp() returned error 22.
>
> Were there any lines of output before this? In some codebases there are
> prints
> in xfs_inotobp that would help show what happened.
>
> > >> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> > >> returned error 22
> > >> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
> > >> xfs_do_force_shutdown(0x1) called from line 1184 of file
> > >> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
> > >> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
> > >> Shutting down filesystem
> > >> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
> > >> filesystem and rectify the problem(s)
> > >> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > >> returned.
> > >> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > >> returned.
> > >> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > >> returned.
> > >> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > >> returned.
>
> The error 5 (EIO) look scary but they are due to the forced shutdown, don't
> worry about them.
>
> Thanks,
> Ben
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:07 ` 符永涛
@ 2013-04-09 15:10 ` 符永涛
2013-04-10 10:10 ` Emmanuel Florac
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:10 UTC (permalink / raw)
To: Ben Myers; +Cc: xfs
Today 3 of our servers were hit by the XFS shutdown. The logs are
identical.
2013/4/9 符永涛 <yongtaofu@gmail.com>
> before xfs force shutdown happens there seems no useful log in
> /var/log/messages
>
> Apr 9 10:38:08 cqdx smbd[4597]: Unable to connect to CUPS server
> localhost:631 - Connection refused
> Apr 9 10:38:08 cqdx smbd[3394]: [2013/04/09 10:38:08.944125, 0]
> printing/print_cups.c:468(cups_async_callback)
> Apr 9 10:38:08 cqdx smbd[3394]: failed to retrieve printer list:
> NT_STATUS_UNSUCCESSFUL
> Apr 9 10:51:09 cqdx smbd[5205]: [2013/04/09 10:51:09.723610, 0]
> printing/print_cups.c:109(cups_connect)
> Apr 9 10:51:09 cqdx smbd[5205]: Unable to connect to CUPS server
> localhost:631 - Connection refused
> Apr 9 10:51:09 cqdx smbd[3394]: [2013/04/09 10:51:09.724132, 0]
> printing/print_cups.c:468(cups_async_callback)
> Apr 9 10:51:09 cqdx smbd[3394]: failed to retrieve printer list:
> NT_STATUS_UNSUCCESSFUL
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22.
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> 0xffffffffa02ee20a
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 9 11:01:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:02:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:02:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:03:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:03:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:03:57 cqdx init: tty (/dev/tty1) main process (3427) killed by
> TERM signal
> Apr 9 11:03:57 cqdx init: tty (/dev/tty2) main process (3429) killed by
> TERM signal
>
>
>
> 2013/4/9 Ben Myers <bpm@sgi.com>
>
>> Hey Yongtaofu,
>>
>> On Tue, Apr 09, 2013 at 09:05:32PM +0800, 符永涛 wrote:
>> > Also I want to know why all the server, all crash with the same crash
>> stack?
>> > Thank you, really need your help.
>>
>> What you've posted so far looks like evidence of a forced shutdown and
>> not a
>> crash. Is there a crash in addition to this forced shutdown? If so, can
>> you
>> post the stack for that too?
>>
>> >
>> > 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
>> > > BTW
>> > > xfs_info /dev/sdb
>> > > meta-data=/dev/sdb isize=256 agcount=28,
>> agsize=268435440
>> > > blks
>> > > = sectsz=512 attr=2
>> > > data = bsize=4096 blocks=7324303360,
>> imaxpct=5
>> > > = sunit=16 swidth=160 blks
>> > > naming =version 2 bsize=4096 ascii-ci=0
>> > > log =internal bsize=4096 blocks=521728, version=2
>> > > = sectsz=512 sunit=16 blks,
>> lazy-count=1
>> > > realtime =none extsz=4096 blocks=0, rtextents=0
>> > >
>> > > 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
>> > >> Dear xfs experts,
>> > >> I really need your help sincerely!!! In our production enviroment we
>> > >> run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
>> > >> system crash on some of the server frequently about every two weeks.
>> > >> Can you help to give me a direction about how to debug this issue and
>> > >> how to avoid it? Thank you very very much!
>> > >>
>> > >> uname -a
>> > >> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
>> > >> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>> > >>
>> > >> Every time the crash log is same, as following
>>
>> An initial guess is that somehow it is looking up a bad inode number,
>> e.g. it
>> is beyond the end of the filesystem and xfs_dilocate returns EINVAL.
>>
>> You could 'xfs_repair -n' to see what it finds (without modifying the
>> filesystem) as a first step.
>>
>> > >> 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> > >> xfs_inotobp() returned error 22.
>>
>> Were there any lines of output before this? In some codebases there are
>> prints
>> in xfs_inotobp that would help show what happened.
>>
>> > >> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> > >> returned error 22
>> > >> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
>> > >> xfs_do_force_shutdown(0x1) called from line 1184 of file
>> > >> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>> > >> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
>> > >> Shutting down filesystem
>> > >> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
>> > >> filesystem and rectify the problem(s)
>> > >> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>> > >> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>> > >> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>> > >> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>>
>> The error 5 (EIO) look scary but they are due to the forced shutdown,
>> don't
>> worry about them.
>>
>> Thanks,
>> Ben
>>
>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:06 ` Eric Sandeen
@ 2013-04-09 15:18 ` 符永涛
2013-04-09 15:23 ` Eric Sandeen
` (2 more replies)
0 siblings, 3 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:18 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
The servers are back in service now, so it's hard to run xfs_repair on
them. This happens repeatedly; below is the xfs_repair log from when it
happened on another server several days ago.
-sh-4.1$ sudo xfs_repair -n /dev/glustervg/glusterlv
Phase 1 - find and verify superblock…
Phase 2 - using internal log
- scan filesystem freespace and inode maps…
agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
- found root inode chunk
Phase 3 - for each AG…
- scan (but don't clear) agi unlinked lists…
- process known inodes and perform inode discovery…
- agno = 0
7f084d34e700: Badness in key lookup (length)
bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 247776, len 16384 bytes) key=(bno 247776, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 425984, len 16384 bytes) key=(bno 425984, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 469280, len 16384 bytes) key=(bno 469280, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 729856, len 16384 bytes) key=(bno 729856, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 813072, len 16384 bytes) key=(bno 813072, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 870176, len 16384 bytes) key=(bno 870176, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 901056, len 16384 bytes) key=(bno 901056, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 1011104, len 16384 bytes) key=(bno 1011104, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 1046336, len 16384 bytes) key=(bno 1046336, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 1163424, len 16384 bytes) key=(bno 1163424, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 1170240, len 16384 bytes) key=(bno 1170240, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 1702160, len 16384 bytes) key=(bno 1702160, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2004096, len 16384 bytes) key=(bno 2004096, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2020496, len 16384 bytes) key=(bno 2020496, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2023408, len 16384 bytes) key=(bno 2023408, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2054464, len 16384 bytes) key=(bno 2054464, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2113232, len 16384 bytes) key=(bno 2113232, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2453472, len 16384 bytes) key=(bno 2453472, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 2949760, len 16384 bytes) key=(bno 2949760, len 8192 bytes)
7f084d34e700: Badness in key lookup (length)
bp=(bno 3118912, len 16384 bytes) key=(bno 3118912, len 8192 bytes)
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- process newly discovered inodes...
Phase 4 - check for duplicate blocks…
- setting up duplicate extent list…
- check for inodes claiming duplicate blocks…
- agno = 0
- agno = 1
- agno = 3
- agno = 9
- agno = 12
- agno = 14
- agno = 5
- agno = 19
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 4
- agno = 2
- agno = 17
- agno = 6
- agno = 8
- agno = 16
- agno = 11
- agno = 10
- agno = 18
- agno = 13
- agno = 15
- agno = 20
- agno = 22
- agno = 21
- agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity…
- traversing filesystem …
- traversal finished …
- moving disconnected inodes to lost+found …
disconnected inode 6235944, would move to lost+found
Phase 7 - verify link counts…
would have reset inode 6235944 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting..
Step 2:
the repair log
sh-4.1$ sudo xfs_repair /dev/glustervg/glusterlv
Phase 1 - find and verify superblock…
Phase 2 - using internal log
- zero log…
- scan filesystem freespace and inode maps…
agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
- found root inode chunk
Phase 3 - for each AG…
- scan and clear agi unlinked lists…
- process known inodes and perform inode discovery…
- agno = 0
7f8220be6700: Badness in key lookup (length)
bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 247776, len 16384 bytes) key=(bno 247776, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 425984, len 16384 bytes) key=(bno 425984, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 469280, len 16384 bytes) key=(bno 469280, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 729856, len 16384 bytes) key=(bno 729856, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 813072, len 16384 bytes) key=(bno 813072, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 870176, len 16384 bytes) key=(bno 870176, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 901056, len 16384 bytes) key=(bno 901056, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 1011104, len 16384 bytes) key=(bno 1011104, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 1046336, len 16384 bytes) key=(bno 1046336, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 1163424, len 16384 bytes) key=(bno 1163424, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 1170240, len 16384 bytes) key=(bno 1170240, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 1702160, len 16384 bytes) key=(bno 1702160, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2004096, len 16384 bytes) key=(bno 2004096, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2020496, len 16384 bytes) key=(bno 2020496, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2023408, len 16384 bytes) key=(bno 2023408, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2054464, len 16384 bytes) key=(bno 2054464, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2113232, len 16384 bytes) key=(bno 2113232, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2453472, len 16384 bytes) key=(bno 2453472, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 2949760, len 16384 bytes) key=(bno 2949760, len 8192 bytes)
7f8220be6700: Badness in key lookup (length)
bp=(bno 3118912, len 16384 bytes) key=(bno 3118912, len 8192 bytes)
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- process newly discovered inodes...
Phase 4 - check for duplicate blocks…
- setting up duplicate extent list…
- check for inodes claiming duplicate blocks…
- agno = 0
- agno = 4
- agno = 2
- agno = 3
- agno = 7
- agno = 18
- agno = 28
- agno = 6
- agno = 5
- agno = 1
- agno = 26
- agno = 8
- agno = 14
- agno = 17
- agno = 16
- agno = 10
- agno = 20
- agno = 13
- agno = 15
- agno = 11
- agno = 19
- agno = 22
- agno = 21
- agno = 23
- agno = 9
- agno = 12
- agno = 24
- agno = 27
- agno = 25
- agno = 29
- agno = 30
Phase 5 - rebuild AG headers and trees…
- reset superblock…
Phase 6 - check inode connectivity…
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem …
- traversal finished …
- moving disconnected inodes to lost+found …
disconnected inode 6235944, moving to lost+found
Phase 7 - verify and correct link counts…
done
sh-4.1$ .
2013/4/9 Eric Sandeen <sandeen@sandeen.net>
> On 4/9/13 7:53 AM, 符永涛 wrote:
> > Dear xfs experts,
> > I really need your help! In our production environment we run
> > glusterfs on top of XFS on Dell x720D (RAID 6) servers, and the XFS
> > filesystem crashes on some of the servers frequently, about every two
> > weeks. Can you give me a direction on how to debug this issue and how
> > to avoid it? Thank you very much!
>
> So this happens reliably, but infrequently? (only every 2 weeks or so?)
>
> Can you provoke it any more often?
>
> > uname -a
> > Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> > #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>
> That's a RHEL6 kernel; I'm assuming that this is a RHEL clone w/o RH
> support?
>
> I agree with Ben that I'd like to see xfs_repair output.
>
> Since the fs has shut down, you should unmount, remount, and unmount
> again to replay the dirty log. Then do xfs_repair -n, and provide the
> output
> if it discovers any errors.
>
> Thanks,
> -Eric
>
> > Every time the crash log is same, as following
> >
> > 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > xfs_inotobp() returned error 22.
> > 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> > returned error 22
> > 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
> > xfs_do_force_shutdown(0x1) called from line 1184 of file
> > fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
> > 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
> > Shutting down filesystem
> > 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
> > filesystem and rectify the problem(s)
> > 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> > 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> > 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> > 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> >
>
>
--
符永涛
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:18 ` 符永涛
@ 2013-04-09 15:23 ` Eric Sandeen
2013-04-09 15:25 ` 符永涛
2013-04-09 15:23 ` 符永涛
2013-04-09 17:10 ` Eric Sandeen
2 siblings, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-09 15:23 UTC (permalink / raw)
To: 符永涛; +Cc: xfs
On 4/9/13 10:18 AM, 符永涛 wrote:
> The servers are back in service now, and it's hard to run xfs_repair.
> It always happens; below is the xfs_repair log from when it happened
> on another server several days ago.
> -sh-4.1$ sudo xfs_repair -n /dev/glustervg/glusterlv
Ok; just for what it's worth, if you don't mount/umount first,
you may be running repair -n with a dirty log, and it will
find "more" corruption than it should, since the log is not
replayed.
I see that you included a non "-n" repair log as well, was
that after a mount/umount? Just to be sure...
Thanks,
-Eric
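For reference, the sequence Eric describes can be sketched like this (a non-authoritative sketch: the device path is the one from this thread, but the mount point /mnt/brick is hypothetical; use the real one from fstab, and run as root):

```shell
# Replay the dirty XFS log first, then inspect read-only.
# /mnt/brick is a hypothetical mount point for /dev/glustervg/glusterlv.
mount /dev/glustervg/glusterlv /mnt/brick   # mounting replays the log
umount /mnt/brick                           # leaves the filesystem clean
xfs_repair -n /dev/glustervg/glusterlv      # -n: report problems only, change nothing
```

Only after the log has been replayed does `xfs_repair -n` report the real on-disk state; on a dirty log it flags transient "corruption" that replay would have resolved.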
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:18 ` 符永涛
2013-04-09 15:23 ` Eric Sandeen
@ 2013-04-09 15:23 ` 符永涛
2013-04-09 15:44 ` Eric Sandeen
2013-04-09 17:10 ` Eric Sandeen
2 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:23 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
So my question is: why does this XFS shutdown keep happening, always with
the same errors (xfs_iunlink_remove: xfs_inotobp() returned error 22;
xfs_inactive: xfs_ifree returned error 22; xfs_do_force_shutdown)?
The server load is high; is that related?
free -m
             total       used       free     shared    buffers     cached
Mem:        129016     128067        948          0         10     119905
-/+ buffers/cache:       8150     120865
Swap:         4093          0       4093
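A side note that may help when reading these logs: "error 22" in the kernel messages is the errno value EINVAL ("Invalid argument"), as Eric's reply below also notes. A quick way to confirm the mapping (assumes python3 is available):

```shell
# "error 22" in the XFS kernel messages is errno EINVAL ("Invalid argument").
python3 -c 'import errno, os; print(errno.errorcode[22], "-", os.strerror(22))'
# -> EINVAL - Invalid argument
```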
2013/4/9 符永涛 <yongtaofu@gmail.com>
> The servers are back in service now, and it's hard to run xfs_repair.
> It always happens; below is the xfs_repair log from when it happened on
> another server several days ago.
> -sh-4.1$ sudo xfs_repair -n /dev/glustervg/glusterlv
> Phase 1 - find and verify superblock…
> Phase 2 - using internal log
> - scan filesystem freespace and inode maps…
> agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
> agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
> agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
> agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
> agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
> agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
> agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
> agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
> agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
> agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
> agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
> agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
> agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
> agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
> agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
> agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
> agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
> agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
> agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
> agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
> agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
> - found root inode chunk
> Phase 3 - for each AG…
> - scan (but don't clear) agi unlinked lists…
> - process known inodes and perform inode discovery…
> - agno = 0
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 247776, len 16384 bytes) key=(bno 247776, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 425984, len 16384 bytes) key=(bno 425984, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 469280, len 16384 bytes) key=(bno 469280, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 729856, len 16384 bytes) key=(bno 729856, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 813072, len 16384 bytes) key=(bno 813072, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 870176, len 16384 bytes) key=(bno 870176, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 901056, len 16384 bytes) key=(bno 901056, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 1011104, len 16384 bytes) key=(bno 1011104, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 1046336, len 16384 bytes) key=(bno 1046336, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 1163424, len 16384 bytes) key=(bno 1163424, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 1170240, len 16384 bytes) key=(bno 1170240, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 1702160, len 16384 bytes) key=(bno 1702160, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2004096, len 16384 bytes) key=(bno 2004096, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2020496, len 16384 bytes) key=(bno 2020496, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2023408, len 16384 bytes) key=(bno 2023408, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2054464, len 16384 bytes) key=(bno 2054464, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2113232, len 16384 bytes) key=(bno 2113232, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2453472, len 16384 bytes) key=(bno 2453472, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 2949760, len 16384 bytes) key=(bno 2949760, len 8192 bytes)
> 7f084d34e700: Badness in key lookup (length)
> bp=(bno 3118912, len 16384 bytes) key=(bno 3118912, len 8192 bytes)
> - agno = 1
> - agno = 2
> - agno = 3
> - agno = 4
> - agno = 5
> - agno = 6
> - agno = 7
> - agno = 8
> - agno = 9
> - agno = 10
> - agno = 11
> - agno = 12
> - agno = 13
> - agno = 14
> - agno = 15
> - agno = 16
> - agno = 17
> - agno = 18
> - agno = 19
> - agno = 20
> - agno = 21
> - agno = 22
> - agno = 23
> - agno = 24
> - agno = 25
> - agno = 26
> - agno = 27
> - agno = 28
> - agno = 29
> - agno = 30
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks…
> - setting up duplicate extent list…
> - check for inodes claiming duplicate blocks…
> - agno = 0
> - agno = 1
> - agno = 3
> - agno = 9
> - agno = 12
> - agno = 14
> - agno = 5
> - agno = 19
> - agno = 23
> - agno = 24
> - agno = 25
> - agno = 26
> - agno = 27
> - agno = 28
> - agno = 29
> - agno = 30
> - agno = 4
> - agno = 2
> - agno = 17
> - agno = 6
> - agno = 8
> - agno = 16
> - agno = 11
> - agno = 10
> - agno = 18
> - agno = 13
> - agno = 15
> - agno = 20
> - agno = 22
> - agno = 21
> - agno = 7
> No modify flag set, skipping phase 5
> Phase 6 - check inode connectivity…
> - traversing filesystem …
> - traversal finished …
> - moving disconnected inodes to lost+found …
> disconnected inode 6235944, would move to lost+found
> Phase 7 - verify link counts…
> would have reset inode 6235944 nlinks from 0 to 1
> No modify flag set, skipping filesystem flush and exiting..
>
>
>
> Step 2:
> the repair log
>
> sh-4.1$ sudo xfs_repair /dev/glustervg/glusterlv
> Phase 1 - find and verify superblock…
> Phase 2 - using internal log
> - zero log…
> - scan filesystem freespace and inode maps…
> agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
> agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
> agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
> agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
> agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
> agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
> agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
> agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
> agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
> agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
> agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
> agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
> agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
> agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
> agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
> agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
> agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
> agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
> agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
> agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
> agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
> - found root inode chunk
> Phase 3 - for each AG…
> - scan and clear agi unlinked lists…
> - process known inodes and perform inode discovery…
> - agno = 0
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 247776, len 16384 bytes) key=(bno 247776, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 425984, len 16384 bytes) key=(bno 425984, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 469280, len 16384 bytes) key=(bno 469280, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 729856, len 16384 bytes) key=(bno 729856, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 813072, len 16384 bytes) key=(bno 813072, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 870176, len 16384 bytes) key=(bno 870176, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 901056, len 16384 bytes) key=(bno 901056, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 1011104, len 16384 bytes) key=(bno 1011104, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 1046336, len 16384 bytes) key=(bno 1046336, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 1163424, len 16384 bytes) key=(bno 1163424, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 1170240, len 16384 bytes) key=(bno 1170240, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 1702160, len 16384 bytes) key=(bno 1702160, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2004096, len 16384 bytes) key=(bno 2004096, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2020496, len 16384 bytes) key=(bno 2020496, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2023408, len 16384 bytes) key=(bno 2023408, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2054464, len 16384 bytes) key=(bno 2054464, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2113232, len 16384 bytes) key=(bno 2113232, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2453472, len 16384 bytes) key=(bno 2453472, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 2949760, len 16384 bytes) key=(bno 2949760, len 8192 bytes)
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 3118912, len 16384 bytes) key=(bno 3118912, len 8192 bytes)
> - agno = 1
> - agno = 2
> - agno = 3
> - agno = 4
> - agno = 5
> - agno = 6
> - agno = 7
> - agno = 8
> - agno = 9
> - agno = 10
> - agno = 11
> - agno = 12
> - agno = 13
> - agno = 14
> - agno = 15
> - agno = 16
> - agno = 17
> - agno = 18
> - agno = 19
> - agno = 20
> - agno = 21
> - agno = 22
> - agno = 23
> - agno = 24
> - agno = 25
> - agno = 26
> - agno = 27
> - agno = 28
> - agno = 29
> - agno = 30
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks…
> - setting up duplicate extent list…
> - check for inodes claiming duplicate blocks…
> - agno = 0
> - agno = 4
> - agno = 2
> - agno = 3
> - agno = 7
> - agno = 18
> - agno = 28
> - agno = 6
> - agno = 5
> - agno = 1
> - agno = 26
> - agno = 8
> - agno = 14
> - agno = 17
> - agno = 16
> - agno = 10
> - agno = 20
> - agno = 13
> - agno = 15
> - agno = 11
> - agno = 19
> - agno = 22
> - agno = 21
> - agno = 23
> - agno = 9
> - agno = 12
> - agno = 24
> - agno = 27
> - agno = 25
> - agno = 29
> - agno = 30
> Phase 5 - rebuild AG headers and trees…
> - reset superblock…
> Phase 6 - check inode connectivity…
> - resetting contents of realtime bitmap and summary inodes
> - traversing filesystem …
> - traversal finished …
> - moving disconnected inodes to lost+found …
> disconnected inode 6235944, moving to lost+found
> Phase 7 - verify and correct link counts…
> done
> sh-4.1$ .
>
>
> 2013/4/9 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/9/13 7:53 AM, 符永涛 wrote:
>> > Dear xfs experts,
>> > I really need your help sincerely!!! In our production enviroment we
>> > run glusterfs over top of xfs on Dell x720D(Raid 6). And the xfs file
>> > system crash on some of the server frequently about every two weeks.
>> > Can you help to give me a direction about how to debug this issue and
>> > how to avoid it? Thank you very very much!
>>
>> So this happens reliably, but infrequently? (only every 2 weeks or so?)
>>
>> Can you provoke it any more often?
>>
>> > uname -a
>> > Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
>> > #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>>
>> That's a RHEL6 kernel; I'm assuming that this is a RHEL clone w/o RH
>> support?
>>
>> I agree with Ben that I'd like to see xfs_repair output.
>>
>> Since the fs has shut down, you should unmount, remount, and unmount
>> again to replay the dirty log. Then do xfs_repair -n, and provide the
>> output
>> if it discovers any errors.
>>
>> Thanks,
>> -Eric
>>
>> > Every time the crash log is same, as following
>> >
>> > 038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> > xfs_inotobp() returned error 22.
>> > 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> > returned error 22
>> > 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
>> > xfs_do_force_shutdown(0x1) called from line 1184 of file
>> > fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>> > 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
>> > Shutting down filesystem
>> > 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
>> > filesystem and rectify the problem(s)
>> > 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> > 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> > 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> > 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> returned.
>> >
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:23 ` Eric Sandeen
@ 2013-04-09 15:25 ` 符永涛
0 siblings, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:25 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs
Yes, it was executed after a mount/umount: the filesystem is configured in
fstab, so when I rebooted the system it was mounted automatically, and then
I unmounted it to run xfs_repair.
2013/4/9 Eric Sandeen <sandeen@sandeen.net>
> On 4/9/13 10:18 AM, 符永涛 wrote:
> > The servers are back in service now, and it's hard to run xfs_repair.
> > It always happens; below is the xfs_repair log from when it happened on
> > another server several days ago.
>
> > -sh-4.1$ sudo xfs_repair -n /dev/glustervg/glusterlv
>
> Ok; just for what it's worth, if you don't mount/umount first,
> you may be running repair -n with a dirty log, and it will
> find "more" corruption than it should, since the log is not
> replayed.
>
> I see that you included a non "-n" repair log as well, was
> that after a mount/umount? Just to be sure...
>
> Thanks,
> -Eric
>
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:23 ` 符永涛
@ 2013-04-09 15:44 ` Eric Sandeen
2013-04-09 15:48 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-09 15:44 UTC (permalink / raw)
To: 符永涛; +Cc: xfs
On 4/9/13 10:23 AM, 符永涛 wrote:
> So my question is: why does this XFS shutdown keep happening, always with
> the same errors (xfs_iunlink_remove: xfs_inotobp() returned error 22;
> xfs_inactive: xfs_ifree returned error 22; xfs_do_force_shutdown)?
> The server load is high; is that related?
We don't know yet. :)
I'd like to know what was passed to xfs_inotobp when it shut down,
and what caused the EINVAL (22) return. Perhaps some tracing
or systemtap scripts could do it.
If you only hit this every few weeks, though, it's going to be
difficult to catch.
-Eric
> free -m
> total used free shared buffers cached
> Mem: 129016 128067 948 0 10 119905
> -/+ buffers/cache: 8150 120865
> Swap: 4093 0 4093
>
>
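The kind of probe Eric suggests might look roughly like this (a sketch, not a tested script: it assumes systemtap plus kernel/xfs-module debuginfo are installed, that the argument name $ino matches the 2.6.32-era xfs_inotobp() signature, and that the installed systemtap supports @entry; on older systemtap the argument would have to be captured in a .call probe instead):

```shell
# Hypothetical SystemTap one-liner: log the inode number and return value
# whenever xfs_inotobp() returns non-zero, to see what triggers EINVAL (22).
# Requires root, systemtap, and debuginfo for the running kernel/xfs module.
stap -e '
probe module("xfs").function("xfs_inotobp").return {
    if ($return != 0)
        printf("xfs_inotobp ino=%d returned %d\n", @entry($ino), $return)
}'
```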
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:44 ` Eric Sandeen
@ 2013-04-09 15:48 ` 符永涛
2013-04-09 15:49 ` 符永涛
2013-04-09 15:58 ` Brian Foster
0 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:48 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs@oss.sgi.com
I'll work on reproducing the issue; can you share a tracing script with
me? Thank you.
2013/4/9 Eric Sandeen <sandeen@sandeen.net>
> On 4/9/13 10:23 AM, 符永涛 wrote:
> > So my question is: why does this XFS shutdown keep happening, always with
> > the same errors (xfs_iunlink_remove: xfs_inotobp() returned error 22;
> > xfs_inactive: xfs_ifree returned error 22; xfs_do_force_shutdown)?
> > The server load is high; is that related?
>
> We don't know yet. :)
>
> I'd like to know what was passed to xfs_inotobp when it shut down,
> and what caused the EINVAL (22) return. Perhaps some tracing
> or systemtap scripts could do it.
>
> If you only hit this every few weeks, though, it's going to be
> difficult to catch.
>
> -Eric
>
> > free -m
> > total used free shared buffers cached
> > Mem: 129016 128067 948 0 10 119905
> > -/+ buffers/cache: 8150 120865
> > Swap: 4093 0 4093
> >
> >
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:48 ` 符永涛
@ 2013-04-09 15:49 ` 符永涛
2013-04-09 15:58 ` Brian Foster
1 sibling, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-09 15:49 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs@oss.sgi.com
Or some useful debug tips?
2013/4/9 符永涛 <yongtaofu@gmail.com>
> I'll work on reproducing the issue; can you share a tracing script with
> me? Thank you.
>
>
> 2013/4/9 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/9/13 10:23 AM, 符永涛 wrote:
>> > So my question is: why does this XFS shutdown keep happening, always with
>> > the same errors (xfs_iunlink_remove: xfs_inotobp() returned error 22;
>> > xfs_inactive: xfs_ifree returned error 22; xfs_do_force_shutdown)?
>> > The server load is high; is that related?
>>
>> We don't know yet. :)
>>
>> I'd like to know what was passed to xfs_inotobp when it shut down,
>> and what caused the EINVAL (22) return. Perhaps some tracing
>> or systemtap scripts could do it.
>>
>> If you only hit this every few weeks, though, it's going to be
>> difficult to catch.
>>
>> -Eric
>>
>> > free -m
>> > total used free shared buffers
>> cached
>> > Mem: 129016 128067 948 0 10
>> 119905
>> > -/+ buffers/cache: 8150 120865
>> > Swap: 4093 0 4093
>> >
>> >
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:48 ` 符永涛
2013-04-09 15:49 ` 符永涛
@ 2013-04-09 15:58 ` Brian Foster
1 sibling, 0 replies; 60+ messages in thread
From: Brian Foster @ 2013-04-09 15:58 UTC (permalink / raw)
To: 符永涛; +Cc: Eric Sandeen, xfs@oss.sgi.com
On 04/09/2013 11:48 AM, 符永涛 wrote:
> I'll work on reproducing the issue; can you share a tracing script with
> me? Thank you.
>
Hi,
We're interested in tracking this problem down and are working on a
script that might help us gather more information. We still have some
research to do there, so please be patient! We'll share it when ready.
In the meantime, I think it would be helpful to take Eric's advice if
this occurs again:
- umount/remount to replay the XFS log.
- umount and collect a metadump of the filesystem (xfs_metadump). We'd
be interested to see this if you're willing/able to collect and share it
(note that this covers the fs prior to repair).
- capture the xfs_repair output and share that as well.
Thanks.
Brian
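The collection steps Brian lists can be sketched as follows (a sketch under stated assumptions: the device path is the one from this thread, the mount point and output paths are hypothetical, and xfs_metadump obfuscates filenames by default so the dump is safe to share):

```shell
# Hypothetical collection sequence for the next occurrence.
mount /dev/glustervg/glusterlv /mnt/brick        # replay the dirty log
umount /mnt/brick
xfs_metadump -g /dev/glustervg/glusterlv /tmp/gluster.metadump  # -g: show progress
xz /tmp/gluster.metadump                         # compress before sharing
xfs_repair /dev/glustervg/glusterlv | tee /tmp/xfs_repair.log   # keep the repair output
```

The metadump captures only metadata (no file contents), which is what makes it shareable for this kind of debugging.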
>
> 2013/4/9 Eric Sandeen <sandeen@sandeen.net <mailto:sandeen@sandeen.net>>
>
> On 4/9/13 10:23 AM, 符永涛 wrote:
> > So my question is: why does this XFS shutdown keep happening, always with
> > the same errors (xfs_iunlink_remove: xfs_inotobp() returned error 22;
> > xfs_inactive: xfs_ifree returned error 22; xfs_do_force_shutdown)?
> > The server load is high; is that related?
>
> We don't know yet. :)
>
> I'd like to know what was passed to xfs_inotobp when it shut down,
> and what caused the EINVAL (22) return. Perhaps some tracing
> or systemtap scripts could do it.
>
> If you only hit this every few weeks, though, it's going to be
> difficult to catch.
>
> -Eric
>
> > free -m
> > total used free shared buffers
> cached
> > Mem: 129016 128067 948 0 10
> 119905
> > -/+ buffers/cache: 8150 120865
> > Swap: 4093 0 4093
> >
> >
>
>
>
>
> --
> 符永涛
>
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
>
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:18 ` 符永涛
2013-04-09 15:23 ` Eric Sandeen
2013-04-09 15:23 ` 符永涛
@ 2013-04-09 17:10 ` Eric Sandeen
2013-04-10 5:34 ` 符永涛
2 siblings, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-09 17:10 UTC (permalink / raw)
To: 符永涛; +Cc: xfs
On 4/9/13 10:18 AM, 符永涛 wrote:
> The servers are back in service now, and it's hard to run xfs_repair. Below is the xfs_repair log from when it happened on another server several days ago.
...
> Step 2:
> the xfs_repair log:
>
> sh-4.1$ sudo xfs_repair /dev/glustervg/glusterlv
> Phase 1 - find and verify superblock…
> Phase 2 - using internal log
> - zero log…
> - scan filesystem freespace and inode maps…
> agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
> agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
> agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
> agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
> agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
> agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
> agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
> agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
> agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
> agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
> agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
> agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
> agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
> agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
> agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
> agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
> agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
> agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
> agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
> agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
> agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
If this machine is still around in similar state, can you do a
# find /path/to/mount -inum $INODE_NUMBER
for the inode numbers above, and see what files they are?
That might give us a clue about what operations were happening
to them. Dumping the gluster xattrs on those files
might also be interesting. Just guesses here, but it'd be a
little more data.
(if this is an old repair, maybe doing the same for your most
recent incident would be best)
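For the whole list of inodes, a small loop works; this is a sketch, with a placeholder mountpoint and an abridged inode list:

```shell
mnt=/path/to/mount   # placeholder: point this at the affected filesystem
# Abridged list from the repair output above; paste in the full set.
for ino in 4046848 2340485 2326854 1802120 495566; do
    find "$mnt" -xdev -inum "$ino" 2>/dev/null || true
done
```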
Thanks,
-Eric
> - found root inode chunk
> Phase 3 - for each AG…
> - scan and clear agi unlinked lists…
> - process known inodes and perform inode discovery…
> - agno = 0
> 7f8220be6700: Badness in key lookup (length)
> bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
(FWIW the above warnings look like an xfs_repair bug, not related)
-Eric
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 13:03 ` 符永涛
2013-04-09 13:05 ` 符永涛
@ 2013-04-09 22:16 ` Michael L. Semon
2013-04-09 22:18 ` Eric Sandeen
1 sibling, 1 reply; 60+ messages in thread
From: Michael L. Semon @ 2013-04-09 22:16 UTC (permalink / raw)
To: 符永涛; +Cc: xfs@oss.sgi.com
A meager non-expert user question with full ignorance of glusterfs: Why
are you having I/O errors once every two weeks?
This looks like XFS behavior I've seen under 2 conditions: 1) when I test
XFS on the device-mapper flakey object, using XFS without an external
journal, and 2) when I try to press my hard-drive connectors against the
motherboard while the PC is still running. Your error message looks more
like the result of (2) than of (1).
XFS behavior on flakey is not the best, and I wish it would recover in such
situations. In Case (2), I'm fairly sure that the PC is confused on a
hardware level because the drive light does not go out. Then again, seeing
the behavior of other file systems that fight through the errors, maybe
it's for the best. If you're fighting I/O errors, there is no winner, and
it's best to get rid of the I/O error.
OK, I'm off the soapbox and will quietly wait for a RAID expert like Dave
or Stan to jump in and make me feel like a complete amateur...
Michael
On Tue, Apr 9, 2013 at 9:03 AM, 符永涛 <yongtaofu@gmail.com> wrote:
> BTW
> xfs_info /dev/sdb
> meta-data=/dev/sdb isize=256 agcount=28, agsize=268435440
> blks
> = sectsz=512 attr=2
> data = bsize=4096 blocks=7324303360, imaxpct=5
> = sunit=16 swidth=160 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal bsize=4096 blocks=521728, version=2
> = sectsz=512 sunit=16 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
>
> 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
> > Dear xfs experts,
> > I really need your help! In our production environment we
> > run glusterfs on top of xfs on Dell x720D (RAID 6), and the xfs file
> > system crashes on some of the servers about every two weeks.
> > Can you give me a direction on how to debug this issue and
> > how to avoid it? Thank you very much!
> >
> > uname -a
> > Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> > #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> >
> > Every time the crash log is same, as following
> >
> > 1038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > xfs_inotobp() returned error 22.
> > 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> > returned error 22
> > 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
> > xfs_do_force_shutdown(0x1) called from line 1184 of file
> > fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
> > 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
> > Shutting down filesystem
> > 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
> > filesystem and rectify the problem(s)
> > 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> >
> > --
> > 符永涛
> >
>
>
> --
> 符永涛
>
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 22:16 ` Michael L. Semon
@ 2013-04-09 22:18 ` Eric Sandeen
2013-04-09 22:48 ` Ben Myers
0 siblings, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-09 22:18 UTC (permalink / raw)
To: Michael L. Semon; +Cc: 符永涛, xfs@oss.sgi.com
On 4/9/13 5:16 PM, Michael L. Semon wrote:
> A meager non-expert user question with full ignorance of glusterfs:
> Why are you having I/O errors once every two weeks?
It's runtime errors or corruption, followed by fs shutdown, which then
results in IO errors, because all IOs are rejected on the shutdown FS.
But that's not always immediately obvious from the stream of resulting
"I/O Error" messages ;)
-Eric
> This looks like XFS behavior I've seen under 2 conditions: 1) when I
> test XFS on the device-mapper flakey object, using XFS without an
> external journal, and 2) when I try to press my hard-drive connectors
> against the motherboard while the PC is still running. Your error
> message looks more like the result of (2) than of (1).
>
> XFS behavior on flakey is not the best, and I wish it would recover
> in such situations. In Case (2), I'm fairly sure that the PC is
> confused on a hardware level because the drive light does not go out.
> Then again, seeing the behavior of other file systems that fight
> through the errors, maybe it's for the best. If you're fighting I/O
> errors, there is no winner, and it's best to get rid of the I/O
> error.
>
> OK, I'm off the soapbox and will quietly wait for a RAID expert like
> Dave or Stan to jump in and make me feel like a complete amateur...
>
> MIchael
>
> On Tue, Apr 9, 2013 at 9:03 AM, 符永涛 <yongtaofu@gmail.com> wrote:
>
> BTW
> xfs_info /dev/sdb
> meta-data=/dev/sdb               isize=256    agcount=28, agsize=268435440 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=7324303360, imaxpct=5
>          =                       sunit=16     swidth=160 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=521728, version=2
>          =                       sectsz=512   sunit=16 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> 2013/4/9, 符永涛 <yongtaofu@gmail.com>:
>> Dear xfs experts, I really need your help! In our
>> production environment we run glusterfs on top of xfs on Dell
>> x720D (RAID 6), and the xfs file system crashes on some of the
>> servers about every two weeks. Can you give me a
>> direction on how to debug this issue and how to avoid it? Thank
>> you very much!
>>
>> uname -a Linux cqdx.miaoyan.cluster1.node11.qiyi.domain
>> 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64
>> x86_64 x86_64 GNU/Linux
>>
>> Every time the crash log is same, as following
>>
>> 1038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
>> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
>> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
>> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
>> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>
>> -- 符永涛
>>
>
>
> -- 符永涛
>
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 22:18 ` Eric Sandeen
@ 2013-04-09 22:48 ` Ben Myers
2013-04-09 23:30 ` Dave Chinner
0 siblings, 1 reply; 60+ messages in thread
From: Ben Myers @ 2013-04-09 22:48 UTC (permalink / raw)
To: Eric Sandeen, Michael L. Semon, 符永涛; +Cc: xfs@oss.sgi.com
Hey,
On Tue, Apr 09, 2013 at 05:18:27PM -0500, Eric Sandeen wrote:
> On 4/9/13 5:16 PM, Michael L. Semon wrote:
> > A meager non-expert user question with full ignorance of glusterfs:
> > Why are you having I/O errors once every two weeks?
>
> It's runtime errors or corruption, followed by fs shutdown, which then
> results in IO errors, because all IOs are rejected on the shutdown FS.
>
> But that's not always immediately obvious from the stream of resulting
> "I/O Error" messages ;)
The IO errors are maybe a bit excessive and scary. I can understand why some
people might misinterpret those messages and assume it's a hardware problem.
-Ben
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 22:48 ` Ben Myers
@ 2013-04-09 23:30 ` Dave Chinner
0 siblings, 0 replies; 60+ messages in thread
From: Dave Chinner @ 2013-04-09 23:30 UTC (permalink / raw)
To: Ben Myers
Cc: 符永涛, Michael L. Semon, Eric Sandeen,
xfs@oss.sgi.com
On Tue, Apr 09, 2013 at 05:48:52PM -0500, Ben Myers wrote:
> Hey,
>
> On Tue, Apr 09, 2013 at 05:18:27PM -0500, Eric Sandeen wrote:
> > On 4/9/13 5:16 PM, Michael L. Semon wrote:
> > > A meager non-expert user question with full ignorance of glusterfs:
> > > Why are you having I/O errors once every two weeks?
> >
> > It's runtime errors or corruption, followed by fs shutdown, which then
> > results in IO errors, because all IOs are rejected on the shutdown FS.
> >
> > But that's not always immediately obvious from the stream of resulting
> > "I/O Error" messages ;)
>
> The IO errors are maybe a bit excessive and scary.
That's entirely the point. If we stay silent we get complaints about
not telling people that there's something wrong. If the error
messages are not excessive and scary, then people don't report them
and so we never hear about problems that are occurring.
> I can understand why some
> people might misinterpret those messages and assume it's a hardware problem.
Quite frankly, the biggest problem we have *always* had is that
people don't bother to read their log files when something has gone
wrong or selectively quote the logs when reporting the bug. This
is the primary reason for the "how to report a bug" FAQ entry asking
for the *full logs* to be posted in a bug report.
Removing error messages because they are "noisy" is not the answer.
Verbose error messages (especially corruption reports) are there
mainly for the benefit of the developers, not the user. The user
needs to know when a corruption has occurred, but we need to
understand the what, how and why of the issue.
It's far better to scare users by dumping all the relevant info into
the log when an error occurs than to be sitting around scratching
our heads going "WTF?" like we are right now because there isn't
enough information in the logs to have even a basic clue of what is
going wrong...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 17:10 ` Eric Sandeen
@ 2013-04-10 5:34 ` 符永涛
2013-04-10 5:36 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-10 5:34 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs@oss.sgi.com
Here's the file info in lost+found:
[ lost+found]# pwd
/mnt/xfsd/lost+found
[ lost+found]# ls -l
total 4
---------T 1 root root 0 Feb 28 15:42 3097
---------T 1 root root 0 Feb 28 15:16 6169
[root@10.15.136.67 lost+found]# sudo getfattr -m . -d -e hex 6169
[root@10.15.136.67 lost+found]# sudo getfattr -m . -d -e hex 3097
# file: 3097
trusted.afr.ec-data-client-2=0x000000000000000000000000
trusted.afr.ec-data-client-3=0x000000000000000000000000
trusted.afr.ec-data1-client-2=0x000000000000000000000000
trusted.afr.ec-data1-client-3=0x000000000000000000000000
trusted.gfid=0x2bb701d327c44bb0af78d69e89f192a4
trusted.glusterfs.dht.linkto=0x65632d64617461312d7265706c69636174652d3400
trusted.glusterfs.quota.b8e8b3ef-0268-40af-93b6-257c4c7ef17a.contri=0x0000000004249000
They appear to be link files for the glusterfs dht xlator.
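For what it's worth, the linkto value is just a hex-encoded, NUL-terminated gluster volume name; decoding it (a quick sketch) backs up the link-file guess:

```shell
hex=65632d64617461312d7265706c69636174652d3400   # trusted.glusterfs.dht.linkto
python3 -c "import sys; print(bytes.fromhex(sys.argv[1]).rstrip(b'\x00').decode())" "$hex"
# prints: ec-data1-replicate-4
```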
Thank you.
2013/4/10 Eric Sandeen <sandeen@sandeen.net>
> On 4/9/13 10:18 AM, 符永涛 wrote:
> > The servers are back in service now, and it's hard to run xfs_repair.
> > Below is the xfs_repair log from when it happened on another
> > server several days ago.
>
> ...
>
> > Step 2:
> > the xfs_repair log:
> >
> > sh-4.1$ sudo xfs_repair /dev/glustervg/glusterlv
> > Phase 1 - find and verify superblock…
> > Phase 2 - using internal log
> > - zero log…
> > - scan filesystem freespace and inode maps…
> > agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
> > agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
> > agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
> > agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
> > agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
> > agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
> > agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
> > agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
> > agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
> > agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
> > agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
> > agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
> > agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
> > agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
> > agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
> > agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
> > agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
> > agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
> > agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
> > agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
> > agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
>
> If this machine is still around in similar state, can you do a
>
> # find /path/to/mount -inum $INODE_NUMBER
>
> for the inode numbers above, and see what files they are?
> That might give us a clue about what operations were happening
> to them. Dumping the gluster xattrs on those files
> might also be interesting. Just guesses here, but it'd be a
> little more data.
>
> (if this is an old repair, maybe doing the same for your most
> recent incident would be best)
>
> Thanks,
> -Eric
>
> > - found root inode chunk
> > Phase 3 - for each AG…
> > - scan and clear agi unlinked lists…
> > - process known inodes and perform inode discovery…
> > - agno = 0
> > 7f8220be6700: Badness in key lookup (length)
> > bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
>
> (FWIW the above warnings look like an xfs_repair bug, not related)
>
> -Eric
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-10 5:34 ` 符永涛
@ 2013-04-10 5:36 ` 符永涛
0 siblings, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-10 5:36 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs@oss.sgi.com
[/mnt/xfsd/lost+found]# pwd
/mnt/xfsd/lost+found
[ /mnt/xfsd/lost+found]# ls -l
total 0
-rw-r--r-- 1 root root 0 Feb 1 19:18 6235944
[ /mnt/xfsd/lost+found]# sudo getfattr -m . -d -e hex 6235944
[ /mnt/xfsd/lost+found]#
2013/4/10 符永涛 <yongtaofu@gmail.com>
> Here's the file info in lost+found:
>
> [ lost+found]# pwd
> /mnt/xfsd/lost+found
> [ lost+found]# ls -l
> total 4
> ---------T 1 root root 0 Feb 28 15:42 3097
> ---------T 1 root root 0 Feb 28 15:16 6169
> [root@10.15.136.67 lost+found]# sudo getfattr -m . -d -e hex 6169
> [root@10.15.136.67 lost+found]# sudo getfattr -m . -d -e hex 3097
> # file: 3097
> trusted.afr.ec-data-client-2=0x000000000000000000000000
> trusted.afr.ec-data-client-3=0x000000000000000000000000
> trusted.afr.ec-data1-client-2=0x000000000000000000000000
> trusted.afr.ec-data1-client-3=0x000000000000000000000000
> trusted.gfid=0x2bb701d327c44bb0af78d69e89f192a4
> trusted.glusterfs.dht.linkto=0x65632d64617461312d7265706c69636174652d3400
>
> trusted.glusterfs.quota.b8e8b3ef-0268-40af-93b6-257c4c7ef17a.contri=0x0000000004249000
>
>
> It seems they're some link files for glusterfs dht xlator.
>
> Thank you.
>
>
> 2013/4/10 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/9/13 10:18 AM, 符永涛 wrote:
>> > The servers are back in service now, and it's hard to run xfs_repair.
>> > Below is the xfs_repair log from when it happened on another
>> > server several days ago.
>>
>> ...
>>
>> > Step 2:
>> > the xfs_repair log:
>> >
>> > sh-4.1$ sudo xfs_repair /dev/glustervg/glusterlv
>> > Phase 1 - find and verify superblock…
>> > Phase 2 - using internal log
>> > - zero log…
>> > - scan filesystem freespace and inode maps…
>> > agi unlinked bucket 0 is 4046848 in ag 0 (inode=4046848)
>> > agi unlinked bucket 5 is 2340485 in ag 0 (inode=2340485)
>> > agi unlinked bucket 6 is 2326854 in ag 0 (inode=2326854)
>> > agi unlinked bucket 8 is 1802120 in ag 0 (inode=1802120)
>> > agi unlinked bucket 14 is 495566 in ag 0 (inode=495566)
>> > agi unlinked bucket 16 is 5899536 in ag 0 (inode=5899536)
>> > agi unlinked bucket 19 is 4008211 in ag 0 (inode=4008211)
>> > agi unlinked bucket 21 is 4906965 in ag 0 (inode=4906965)
>> > agi unlinked bucket 23 is 2022231 in ag 0 (inode=2022231)
>> > agi unlinked bucket 24 is 1626200 in ag 0 (inode=1626200)
>> > agi unlinked bucket 25 is 938585 in ag 0 (inode=938585)
>> > agi unlinked bucket 30 is 4226526 in ag 0 (inode=4226526)
>> > agi unlinked bucket 34 is 4108962 in ag 0 (inode=4108962)
>> > agi unlinked bucket 37 is 1740389 in ag 0 (inode=1740389)
>> > agi unlinked bucket 39 is 247399 in ag 0 (inode=247399)
>> > agi unlinked bucket 40 is 6237864 in ag 0 (inode=6237864)
>> > agi unlinked bucket 43 is 3404331 in ag 0 (inode=3404331)
>> > agi unlinked bucket 45 is 2092717 in ag 0 (inode=2092717)
>> > agi unlinked bucket 48 is 4041008 in ag 0 (inode=4041008)
>> > agi unlinked bucket 50 is 1459762 in ag 0 (inode=1459762)
>> > agi unlinked bucket 56 is 852024 in ag 0 (inode=852024)
>>
>> If this machine is still around in similar state, can you do a
>>
>> # find /path/to/mount -inum $INODE_NUMBER
>>
>> for the inode numbers above, and see what files they are?
>> That might give us a clue about what operations were happening
>> to them. Dumping the gluster xattrs on those files
>> might also be interesting. Just guesses here, but it'd be a
>> little more data.
>>
>> (if this is an old repair, maybe doing the same for your most
>> recent incident would be best)
>>
>> Thanks,
>> -Eric
>>
>> > - found root inode chunk
>> > Phase 3 - for each AG…
>> > - scan and clear agi unlinked lists…
>> > - process known inodes and perform inode discovery…
>> > - agno = 0
>> > 7f8220be6700: Badness in key lookup (length)
>> > bp=(bno 123696, len 16384 bytes) key=(bno 123696, len 8192 bytes)
>>
>> (FWIW the above warnings look like an xfs_repair bug, not related)
>>
>> -Eric
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-09 15:10 ` 符永涛
@ 2013-04-10 10:10 ` Emmanuel Florac
2013-04-10 12:52 ` Dave Chinner
2013-04-10 13:52 ` 符永涛
0 siblings, 2 replies; 60+ messages in thread
From: Emmanuel Florac @ 2013-04-10 10:10 UTC (permalink / raw)
To: 符永涛; +Cc: Ben Myers, xfs
Le Tue, 9 Apr 2013 23:10:03 +0800
符永涛 <yongtaofu@gmail.com> écrivait:
> > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
> > Shutting down filesystem
This. I/O error detected. That means that at some point the underlying
device (disk, RAID array, SAN volume) couldn't be reached. So this
could very well be a case of a flakey drive, array, cable or SCSI
driver.
What's the storage setup here?
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <eflorac@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-10 10:10 ` Emmanuel Florac
@ 2013-04-10 12:52 ` Dave Chinner
2013-04-10 13:52 ` 符永涛
1 sibling, 0 replies; 60+ messages in thread
From: Dave Chinner @ 2013-04-10 12:52 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Ben Myers, 符永涛, xfs
On Wed, Apr 10, 2013 at 12:10:25PM +0200, Emmanuel Florac wrote:
> Le Tue, 9 Apr 2013 23:10:03 +0800
> 符永涛 <yongtaofu@gmail.com> écrivait:
>
> > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
> > > Shutting down filesystem
>
> This. I/O error detected. That means that at some point the underlying
> device (disk, RAID array, SAN volume) couldn't be reached. So this
> could very well be a case of a flakey drive, array, cable or SCSI
> driver.
You can't take that one line of output out of context and then say
it's a hardware problem - that's a generic IO-error-causes-shutdown
message, not an EIO from the storage stack.
The EINVAL error that is reported before this is the cause of the
shutdown, and that is from a corrupted unlinked list. EINVAL
indicates that we are falling off the end of the unlinked list
without finding the inode that we are trying to remove from the
unlinked list. Debug kernels will assert fail at this point.
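To make that concrete, here is a toy shell model (deliberately not the kernel code): each AGI bucket heads a chain of inodes linked through di_next_unlinked, removal walks the chain, and a chain that no longer contains the target ends in EINVAL (22), the error 22 seen in these logs. The inode numbers are hypothetical:

```shell
bucket="4046848 2340485 2326854"   # hypothetical chain hanging off one AGI bucket
remove_unlinked() {
    target=$1
    for ino in $bucket; do
        if [ "$ino" = "$target" ]; then
            return 0               # found it: unlink from the chain, success
        fi
    done
    return 22                      # fell off the end of the chain: EINVAL
}
remove_unlinked 2340485 && echo "removed"      # prints: removed
remove_unlinked 999999 || echo "error $?"      # prints: error 22
```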
What causes that problem is still unknown. Nobody has been able to
isolate a reproducer, so progress is slow. If someone can give me
a script that reproduces it directly on XFS (i.e. no gluster), then
it won't take long to find the bug....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-10 10:10 ` Emmanuel Florac
2013-04-10 12:52 ` Dave Chinner
@ 2013-04-10 13:52 ` 符永涛
2013-04-11 19:11 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-10 13:52 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Ben Myers, xfs@oss.sgi.com
The storage info is as following:
RAID-6
SATA HDD
Controller: PERC H710P Mini (Embedded)
Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
255 heads, 63 sectors/track, 3647334 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
sd 0:2:1:0: [sdb] Write Protect is off
sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sd 0:2:1:0: [sdb] Attached SCSI disk
*-storage
description: RAID bus controller
product: MegaRAID SAS 2208 [Thunderbolt]
vendor: LSI Logic / Symbios Logic
physical id: 0
bus info: pci@0000:02:00.0
logical name: scsi0
version: 01
width: 64 bits
clock: 33MHz
capabilities: storage pm pciexpress vpd msi msix bus_master cap_list
rom
configuration: driver=megaraid_sas latency=0
resources: irq:42 ioport:fc00(size=256) memory:dd7fc000-dd7fffff
memory:dd780000-dd7bffff memory:dc800000-dc81ffff(prefetchable)
*-disk:0
description: SCSI Disk
product: PERC H710P
vendor: DELL
physical id: 2.0.0
bus info: scsi@0:2.0.0
logical name: /dev/sda
version: 3.13
serial: 0049d6ce1d9f2035180096fde490f648
size: 558GiB (599GB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 signature=000aa336
*-disk:1
description: SCSI Disk
product: PERC H710P
vendor: DELL
physical id: 2.1.0
bus info: scsi@0:2.1.0
logical name: /dev/sdb
logical name: /mnt/xfsd
version: 3.13
serial: 003366f71da22035180096fde490f648
size: 27TiB (30TB)
configuration: ansiversion=5 mount.fstype=xfs
mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
state=mounted
Thank you.
2013/4/10 Emmanuel Florac <eflorac@intellique.com>
> Le Tue, 9 Apr 2013 23:10:03 +0800
> 符永涛 <yongtaofu@gmail.com> écrivait:
>
> > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
> > > Shutting down filesystem
>
> This. I/O error detected. That means that at some point the underlying
> device (disk, RAID array, SAN volume) couldn't be reached. So this
> could very well be a case of a flakey drive, array, cable or SCSI
> driver.
>
> What's the storage setup here?
>
> --
> ------------------------------------------------------------------------
> Emmanuel Florac | Direction technique
> | Intellique
> | <eflorac@intellique.com>
> | +33 1 78 94 84 02
> ------------------------------------------------------------------------
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-10 13:52 ` 符永涛
@ 2013-04-11 19:11 ` 符永涛
2013-04-11 19:55 ` 符永涛
2013-04-11 23:26 ` Brian Foster
0 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-11 19:11 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Ben Myers, xfs@oss.sgi.com
It happened again tonight on one of our servers. How can we debug the root
cause? Thank you.
Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
returned error 22.
Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
error 22
Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
0xffffffffa02ee20a
Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down
filesystem
Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
rectify the problem(s)
Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
xfs_repair -n
Phase 7 - verify link counts...
would have reset inode 20021 nlinks from 0 to 1
would have reset inode 20789 nlinks from 0 to 1
would have reset inode 35125 nlinks from 0 to 1
would have reset inode 35637 nlinks from 0 to 1
would have reset inode 36149 nlinks from 0 to 1
would have reset inode 38197 nlinks from 0 to 1
would have reset inode 39477 nlinks from 0 to 1
would have reset inode 54069 nlinks from 0 to 1
would have reset inode 62261 nlinks from 0 to 1
would have reset inode 63029 nlinks from 0 to 1
would have reset inode 72501 nlinks from 0 to 1
would have reset inode 79925 nlinks from 0 to 1
would have reset inode 81205 nlinks from 0 to 1
would have reset inode 84789 nlinks from 0 to 1
would have reset inode 87861 nlinks from 0 to 1
would have reset inode 90663 nlinks from 0 to 1
would have reset inode 91189 nlinks from 0 to 1
would have reset inode 95541 nlinks from 0 to 1
would have reset inode 98101 nlinks from 0 to 1
would have reset inode 101173 nlinks from 0 to 1
would have reset inode 113205 nlinks from 0 to 1
would have reset inode 114741 nlinks from 0 to 1
would have reset inode 126261 nlinks from 0 to 1
would have reset inode 140597 nlinks from 0 to 1
would have reset inode 144693 nlinks from 0 to 1
would have reset inode 147765 nlinks from 0 to 1
would have reset inode 152885 nlinks from 0 to 1
would have reset inode 161333 nlinks from 0 to 1
would have reset inode 161845 nlinks from 0 to 1
would have reset inode 167477 nlinks from 0 to 1
would have reset inode 172341 nlinks from 0 to 1
would have reset inode 191797 nlinks from 0 to 1
would have reset inode 204853 nlinks from 0 to 1
would have reset inode 205365 nlinks from 0 to 1
would have reset inode 215349 nlinks from 0 to 1
would have reset inode 215861 nlinks from 0 to 1
would have reset inode 216373 nlinks from 0 to 1
would have reset inode 217397 nlinks from 0 to 1
would have reset inode 224309 nlinks from 0 to 1
would have reset inode 225589 nlinks from 0 to 1
would have reset inode 234549 nlinks from 0 to 1
would have reset inode 234805 nlinks from 0 to 1
would have reset inode 249653 nlinks from 0 to 1
would have reset inode 250677 nlinks from 0 to 1
would have reset inode 252469 nlinks from 0 to 1
would have reset inode 261429 nlinks from 0 to 1
would have reset inode 265013 nlinks from 0 to 1
would have reset inode 266805 nlinks from 0 to 1
would have reset inode 267317 nlinks from 0 to 1
would have reset inode 268853 nlinks from 0 to 1
would have reset inode 272437 nlinks from 0 to 1
would have reset inode 273205 nlinks from 0 to 1
would have reset inode 274229 nlinks from 0 to 1
would have reset inode 278325 nlinks from 0 to 1
would have reset inode 278837 nlinks from 0 to 1
would have reset inode 281397 nlinks from 0 to 1
would have reset inode 292661 nlinks from 0 to 1
would have reset inode 300853 nlinks from 0 to 1
would have reset inode 302901 nlinks from 0 to 1
would have reset inode 305205 nlinks from 0 to 1
would have reset inode 314165 nlinks from 0 to 1
would have reset inode 315189 nlinks from 0 to 1
would have reset inode 320309 nlinks from 0 to 1
would have reset inode 324917 nlinks from 0 to 1
would have reset inode 328245 nlinks from 0 to 1
would have reset inode 335925 nlinks from 0 to 1
would have reset inode 339253 nlinks from 0 to 1
would have reset inode 339765 nlinks from 0 to 1
would have reset inode 348213 nlinks from 0 to 1
would have reset inode 360501 nlinks from 0 to 1
would have reset inode 362037 nlinks from 0 to 1
would have reset inode 366389 nlinks from 0 to 1
would have reset inode 385845 nlinks from 0 to 1
would have reset inode 390709 nlinks from 0 to 1
would have reset inode 409141 nlinks from 0 to 1
would have reset inode 413237 nlinks from 0 to 1
would have reset inode 414773 nlinks from 0 to 1
would have reset inode 417845 nlinks from 0 to 1
would have reset inode 436021 nlinks from 0 to 1
would have reset inode 439349 nlinks from 0 to 1
would have reset inode 447029 nlinks from 0 to 1
would have reset inode 491317 nlinks from 0 to 1
would have reset inode 494133 nlinks from 0 to 1
would have reset inode 495413 nlinks from 0 to 1
would have reset inode 501301 nlinks from 0 to 1
would have reset inode 506421 nlinks from 0 to 1
would have reset inode 508469 nlinks from 0 to 1
would have reset inode 508981 nlinks from 0 to 1
would have reset inode 511797 nlinks from 0 to 1
would have reset inode 513077 nlinks from 0 to 1
would have reset inode 517941 nlinks from 0 to 1
would have reset inode 521013 nlinks from 0 to 1
would have reset inode 522805 nlinks from 0 to 1
would have reset inode 523317 nlinks from 0 to 1
would have reset inode 525621 nlinks from 0 to 1
would have reset inode 527925 nlinks from 0 to 1
would have reset inode 535605 nlinks from 0 to 1
would have reset inode 541749 nlinks from 0 to 1
would have reset inode 573493 nlinks from 0 to 1
would have reset inode 578613 nlinks from 0 to 1
would have reset inode 583029 nlinks from 0 to 1
would have reset inode 585525 nlinks from 0 to 1
would have reset inode 586293 nlinks from 0 to 1
would have reset inode 586805 nlinks from 0 to 1
would have reset inode 591413 nlinks from 0 to 1
would have reset inode 594485 nlinks from 0 to 1
would have reset inode 596277 nlinks from 0 to 1
would have reset inode 603189 nlinks from 0 to 1
would have reset inode 613429 nlinks from 0 to 1
would have reset inode 617781 nlinks from 0 to 1
would have reset inode 621877 nlinks from 0 to 1
would have reset inode 623925 nlinks from 0 to 1
would have reset inode 625205 nlinks from 0 to 1
would have reset inode 626741 nlinks from 0 to 1
would have reset inode 639541 nlinks from 0 to 1
would have reset inode 640053 nlinks from 0 to 1
would have reset inode 640565 nlinks from 0 to 1
would have reset inode 645173 nlinks from 0 to 1
would have reset inode 652853 nlinks from 0 to 1
would have reset inode 656181 nlinks from 0 to 1
would have reset inode 659253 nlinks from 0 to 1
would have reset inode 663605 nlinks from 0 to 1
would have reset inode 667445 nlinks from 0 to 1
would have reset inode 680757 nlinks from 0 to 1
would have reset inode 691253 nlinks from 0 to 1
would have reset inode 691765 nlinks from 0 to 1
would have reset inode 697653 nlinks from 0 to 1
would have reset inode 700469 nlinks from 0 to 1
would have reset inode 707893 nlinks from 0 to 1
would have reset inode 716853 nlinks from 0 to 1
would have reset inode 722229 nlinks from 0 to 1
would have reset inode 722741 nlinks from 0 to 1
would have reset inode 723765 nlinks from 0 to 1
would have reset inode 731957 nlinks from 0 to 1
would have reset inode 742965 nlinks from 0 to 1
would have reset inode 743477 nlinks from 0 to 1
would have reset inode 745781 nlinks from 0 to 1
would have reset inode 746293 nlinks from 0 to 1
would have reset inode 774453 nlinks from 0 to 1
would have reset inode 778805 nlinks from 0 to 1
would have reset inode 785013 nlinks from 0 to 1
would have reset inode 785973 nlinks from 0 to 1
would have reset inode 791349 nlinks from 0 to 1
would have reset inode 796981 nlinks from 0 to 1
would have reset inode 803381 nlinks from 0 to 1
would have reset inode 806965 nlinks from 0 to 1
would have reset inode 811798 nlinks from 0 to 1
would have reset inode 812310 nlinks from 0 to 1
would have reset inode 813078 nlinks from 0 to 1
would have reset inode 813607 nlinks from 0 to 1
would have reset inode 814183 nlinks from 0 to 1
would have reset inode 822069 nlinks from 0 to 1
would have reset inode 828469 nlinks from 0 to 1
would have reset inode 830005 nlinks from 0 to 1
would have reset inode 832053 nlinks from 0 to 1
would have reset inode 832565 nlinks from 0 to 1
would have reset inode 836661 nlinks from 0 to 1
would have reset inode 841013 nlinks from 0 to 1
would have reset inode 841525 nlinks from 0 to 1
would have reset inode 845365 nlinks from 0 to 1
would have reset inode 846133 nlinks from 0 to 1
would have reset inode 847157 nlinks from 0 to 1
would have reset inode 852533 nlinks from 0 to 1
would have reset inode 857141 nlinks from 0 to 1
would have reset inode 863271 nlinks from 0 to 1
would have reset inode 866855 nlinks from 0 to 1
would have reset inode 887861 nlinks from 0 to 1
would have reset inode 891701 nlinks from 0 to 1
would have reset inode 894773 nlinks from 0 to 1
would have reset inode 900149 nlinks from 0 to 1
would have reset inode 902197 nlinks from 0 to 1
would have reset inode 906293 nlinks from 0 to 1
would have reset inode 906805 nlinks from 0 to 1
would have reset inode 909877 nlinks from 0 to 1
would have reset inode 925493 nlinks from 0 to 1
would have reset inode 949543 nlinks from 0 to 1
would have reset inode 955175 nlinks from 0 to 1
would have reset inode 963623 nlinks from 0 to 1
would have reset inode 967733 nlinks from 0 to 1
would have reset inode 968231 nlinks from 0 to 1
would have reset inode 982069 nlinks from 0 to 1
would have reset inode 1007413 nlinks from 0 to 1
would have reset inode 1011509 nlinks from 0 to 1
would have reset inode 1014069 nlinks from 0 to 1
would have reset inode 1014581 nlinks from 0 to 1
would have reset inode 1022005 nlinks from 0 to 1
would have reset inode 1022517 nlinks from 0 to 1
would have reset inode 1023029 nlinks from 0 to 1
would have reset inode 1025333 nlinks from 0 to 1
would have reset inode 1043765 nlinks from 0 to 1
would have reset inode 1044789 nlinks from 0 to 1
would have reset inode 1049397 nlinks from 0 to 1
would have reset inode 1050933 nlinks from 0 to 1
would have reset inode 1051445 nlinks from 0 to 1
would have reset inode 1054261 nlinks from 0 to 1
would have reset inode 1060917 nlinks from 0 to 1
would have reset inode 1063477 nlinks from 0 to 1
would have reset inode 1076021 nlinks from 0 to 1
would have reset inode 1081141 nlinks from 0 to 1
would have reset inode 1086261 nlinks from 0 to 1
would have reset inode 1097269 nlinks from 0 to 1
would have reset inode 1099829 nlinks from 0 to 1
would have reset inode 1100853 nlinks from 0 to 1
would have reset inode 1101877 nlinks from 0 to 1
would have reset inode 1126709 nlinks from 0 to 1
would have reset inode 1134389 nlinks from 0 to 1
would have reset inode 1141045 nlinks from 0 to 1
would have reset inode 1141557 nlinks from 0 to 1
would have reset inode 1142581 nlinks from 0 to 1
would have reset inode 1148469 nlinks from 0 to 1
would have reset inode 1153333 nlinks from 0 to 1
would have reset inode 1181749 nlinks from 0 to 1
would have reset inode 1192245 nlinks from 0 to 1
would have reset inode 1198133 nlinks from 0 to 1
would have reset inode 1203765 nlinks from 0 to 1
would have reset inode 1221429 nlinks from 0 to 1
would have reset inode 1223989 nlinks from 0 to 1
would have reset inode 1235509 nlinks from 0 to 1
would have reset inode 1239349 nlinks from 0 to 1
would have reset inode 1240885 nlinks from 0 to 1
would have reset inode 1241397 nlinks from 0 to 1
would have reset inode 1241909 nlinks from 0 to 1
would have reset inode 1242421 nlinks from 0 to 1
would have reset inode 1244981 nlinks from 0 to 1
would have reset inode 1246517 nlinks from 0 to 1
would have reset inode 1253429 nlinks from 0 to 1
would have reset inode 1271861 nlinks from 0 to 1
would have reset inode 1274677 nlinks from 0 to 1
would have reset inode 1277749 nlinks from 0 to 1
would have reset inode 1278773 nlinks from 0 to 1
would have reset inode 1286709 nlinks from 0 to 1
would have reset inode 1288245 nlinks from 0 to 1
would have reset inode 1299765 nlinks from 0 to 1
would have reset inode 1302325 nlinks from 0 to 1
would have reset inode 1304885 nlinks from 0 to 1
would have reset inode 1305397 nlinks from 0 to 1
would have reset inode 1307509 nlinks from 0 to 1
would have reset inode 1309493 nlinks from 0 to 1
would have reset inode 1310517 nlinks from 0 to 1
would have reset inode 1311029 nlinks from 0 to 1
would have reset inode 1312053 nlinks from 0 to 1
would have reset inode 1316917 nlinks from 0 to 1
would have reset inode 1317941 nlinks from 0 to 1
would have reset inode 1320821 nlinks from 0 to 1
would have reset inode 1322805 nlinks from 0 to 1
would have reset inode 1332789 nlinks from 0 to 1
would have reset inode 1336373 nlinks from 0 to 1
would have reset inode 1345653 nlinks from 0 to 1
would have reset inode 1354549 nlinks from 0 to 1
would have reset inode 1361973 nlinks from 0 to 1
would have reset inode 1369909 nlinks from 0 to 1
would have reset inode 1372981 nlinks from 0 to 1
would have reset inode 1388853 nlinks from 0 to 1
would have reset inode 1402933 nlinks from 0 to 1
would have reset inode 1403445 nlinks from 0 to 1
would have reset inode 1420085 nlinks from 0 to 1
would have reset inode 1452853 nlinks from 0 to 1
would have reset inode 1456437 nlinks from 0 to 1
would have reset inode 1457973 nlinks from 0 to 1
would have reset inode 1459253 nlinks from 0 to 1
would have reset inode 1467957 nlinks from 0 to 1
would have reset inode 1471541 nlinks from 0 to 1
would have reset inode 1476661 nlinks from 0 to 1
would have reset inode 1479733 nlinks from 0 to 1
would have reset inode 1483061 nlinks from 0 to 1
would have reset inode 1484085 nlinks from 0 to 1
would have reset inode 1486133 nlinks from 0 to 1
would have reset inode 1489461 nlinks from 0 to 1
would have reset inode 1490037 nlinks from 0 to 1
would have reset inode 1492021 nlinks from 0 to 1
would have reset inode 1493557 nlinks from 0 to 1
would have reset inode 1494069 nlinks from 0 to 1
would have reset inode 1496885 nlinks from 0 to 1
would have reset inode 1498421 nlinks from 0 to 1
would have reset inode 1498933 nlinks from 0 to 1
would have reset inode 1499957 nlinks from 0 to 1
would have reset inode 1506101 nlinks from 0 to 1
would have reset inode 1507637 nlinks from 0 to 1
would have reset inode 1510453 nlinks from 0 to 1
would have reset inode 1514293 nlinks from 0 to 1
would have reset inode 1517365 nlinks from 0 to 1
would have reset inode 1520693 nlinks from 0 to 1
would have reset inode 1521973 nlinks from 0 to 1
would have reset inode 1530421 nlinks from 0 to 1
would have reset inode 1530933 nlinks from 0 to 1
would have reset inode 1537333 nlinks from 0 to 1
would have reset inode 1538357 nlinks from 0 to 1
would have reset inode 1548853 nlinks from 0 to 1
would have reset inode 1553973 nlinks from 0 to 1
would have reset inode 1557301 nlinks from 0 to 1
would have reset inode 1564213 nlinks from 0 to 1
would have reset inode 1564725 nlinks from 0 to 1
would have reset inode 1576501 nlinks from 0 to 1
would have reset inode 1580597 nlinks from 0 to 1
would have reset inode 1584693 nlinks from 0 to 1
would have reset inode 1586485 nlinks from 0 to 1
would have reset inode 1589301 nlinks from 0 to 1
would have reset inode 1589813 nlinks from 0 to 1
would have reset inode 1592629 nlinks from 0 to 1
would have reset inode 1595701 nlinks from 0 to 1
would have reset inode 1601077 nlinks from 0 to 1
would have reset inode 1623861 nlinks from 0 to 1
would have reset inode 1626677 nlinks from 0 to 1
would have reset inode 1627701 nlinks from 0 to 1
would have reset inode 1633333 nlinks from 0 to 1
would have reset inode 1639221 nlinks from 0 to 1
would have reset inode 1649205 nlinks from 0 to 1
would have reset inode 1686325 nlinks from 0 to 1
would have reset inode 1690677 nlinks from 0 to 1
would have reset inode 1693749 nlinks from 0 to 1
would have reset inode 1704757 nlinks from 0 to 1
would have reset inode 1707061 nlinks from 0 to 1
would have reset inode 1709109 nlinks from 0 to 1
would have reset inode 1719349 nlinks from 0 to 1
would have reset inode 1737013 nlinks from 0 to 1
would have reset inode 1741365 nlinks from 0 to 1
would have reset inode 1747509 nlinks from 0 to 1
would have reset inode 1770805 nlinks from 0 to 1
would have reset inode 1780789 nlinks from 0 to 1
would have reset inode 1793589 nlinks from 0 to 1
would have reset inode 1795125 nlinks from 0 to 1
would have reset inode 1800757 nlinks from 0 to 1
would have reset inode 1801269 nlinks from 0 to 1
would have reset inode 1802549 nlinks from 0 to 1
would have reset inode 1804085 nlinks from 0 to 1
would have reset inode 1817141 nlinks from 0 to 1
would have reset inode 1821749 nlinks from 0 to 1
would have reset inode 1832757 nlinks from 0 to 1
would have reset inode 1836341 nlinks from 0 to 1
would have reset inode 1856309 nlinks from 0 to 1
would have reset inode 1900597 nlinks from 0 to 1
would have reset inode 1902901 nlinks from 0 to 1
would have reset inode 1912373 nlinks from 0 to 1
would have reset inode 1943093 nlinks from 0 to 1
would have reset inode 1944373 nlinks from 0 to 1
would have reset inode 1954101 nlinks from 0 to 1
would have reset inode 1955893 nlinks from 0 to 1
would have reset inode 1961781 nlinks from 0 to 1
would have reset inode 1974325 nlinks from 0 to 1
would have reset inode 1978677 nlinks from 0 to 1
would have reset inode 1981237 nlinks from 0 to 1
would have reset inode 1992245 nlinks from 0 to 1
would have reset inode 2000949 nlinks from 0 to 1
would have reset inode 2002229 nlinks from 0 to 1
would have reset inode 2004789 nlinks from 0 to 1
would have reset inode 2005301 nlinks from 0 to 1
would have reset inode 2011189 nlinks from 0 to 1
would have reset inode 2012981 nlinks from 0 to 1
would have reset inode 2015285 nlinks from 0 to 1
would have reset inode 2018869 nlinks from 0 to 1
would have reset inode 2028341 nlinks from 0 to 1
would have reset inode 2028853 nlinks from 0 to 1
would have reset inode 2030901 nlinks from 0 to 1
would have reset inode 2032181 nlinks from 0 to 1
would have reset inode 2032693 nlinks from 0 to 1
would have reset inode 2040117 nlinks from 0 to 1
would have reset inode 2053685 nlinks from 0 to 1
would have reset inode 2083893 nlinks from 0 to 1
would have reset inode 2087221 nlinks from 0 to 1
would have reset inode 2095925 nlinks from 0 to 1
would have reset inode 2098741 nlinks from 0 to 1
would have reset inode 2100533 nlinks from 0 to 1
would have reset inode 2101301 nlinks from 0 to 1
would have reset inode 2123573 nlinks from 0 to 1
would have reset inode 2132789 nlinks from 0 to 1
would have reset inode 2133813 nlinks from 0 to 1
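Since xfs_iunlink_remove walks the per-AG unlinked-inode buckets stored in the AGI header, dumping those bucket heads may show whether stale entries are left behind after a crash. A sketch, assuming /dev/sdb is the affected (unmounted or shut-down) filesystem; the xfs_db invocation is read-only:

```shell
# Print the 64 unlinked-bucket head pointers in AG 0's AGI header
# (read-only open; repeat with 'agi 1', 'agi 2', ... for the other AGs):
command -v xfs_db >/dev/null && xfs_db -r /dev/sdb -c 'agi 0' -c 'p unlinked' || true

# An unlinked inode with AG-relative number agino hangs off bucket
# (agino % 64); e.g. inode 20021 above, which is in AG 0 so agino == ino:
echo $(( 20021 % 64 ))
# -> 53
```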
2013/4/10 符永涛 <yongtaofu@gmail.com>
> The storage info is as following:
> RAID-6
> SATA HDD
> Controller: PERC H710P Mini (Embedded)
> Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
> 255 heads, 63 sectors/track, 3647334 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
> sd 0:2:1:0: [sdb] Write Protect is off
> sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
> sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sd 0:2:1:0: [sdb] Attached SCSI disk
>
> *-storage
> description: RAID bus controller
> product: MegaRAID SAS 2208 [Thunderbolt]
> vendor: LSI Logic / Symbios Logic
> physical id: 0
> bus info: pci@0000:02:00.0
> logical name: scsi0
> version: 01
> width: 64 bits
> clock: 33MHz
> capabilities: storage pm pciexpress vpd msi msix bus_master
> cap_list rom
> configuration: driver=megaraid_sas latency=0
> resources: irq:42 ioport:fc00(size=256) memory:dd7fc000-dd7fffff
> memory:dd780000-dd7bffff memory:dc800000-dc81ffff(prefetchable)
> *-disk:0
> description: SCSI Disk
> product: PERC H710P
> vendor: DELL
> physical id: 2.0.0
> bus info: scsi@0:2.0.0
> logical name: /dev/sda
> version: 3.13
> serial: 0049d6ce1d9f2035180096fde490f648
> size: 558GiB (599GB)
> capabilities: partitioned partitioned:dos
> configuration: ansiversion=5 signature=000aa336
> *-disk:1
> description: SCSI Disk
> product: PERC H710P
> vendor: DELL
> physical id: 2.1.0
> bus info: scsi@0:2.1.0
> logical name: /dev/sdb
> logical name: /mnt/xfsd
> version: 3.13
> serial: 003366f71da22035180096fde490f648
> size: 27TiB (30TB)
> configuration: ansiversion=5 mount.fstype=xfs
> mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
> state=mounted
>
> Thank you.
>
>
> 2013/4/10 Emmanuel Florac <eflorac@intellique.com>
>
>> Le Tue, 9 Apr 2013 23:10:03 +0800
>> 符永涛 <yongtaofu@gmail.com> wrote:
>>
>> > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
>> > > Shutting down filesystem
>>
>> This. I/O error detected. That means that at some point the underlying
>> device (disk, RAID array, SAN volume) couldn't be reached. So this
>> could very well be a case of a flaky drive, array, cable, or SCSI
>> driver.
>>
>> What's the storage setup here?
>>
>> --
>> ------------------------------------------------------------------------
>> Emmanuel Florac | Direction technique
>> | Intellique
>> | <eflorac@intellique.com>
>> | +33 1 78 94 84 02
>> ------------------------------------------------------------------------
>>
>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-11 19:11 ` 符永涛
@ 2013-04-11 19:55 ` 符永涛
2013-04-11 23:26 ` Brian Foster
1 sibling, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-11 19:55 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: Ben Myers, xfs@oss.sgi.com
fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
Is the return address always 0xffffffffa02ee20a?
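If the address is identical across crashes, it is most likely the same call site each time. One way to turn a module return address into a source line is to subtract the module's load address (from /proc/modules, as root) and feed the offset to addr2line against the debuginfo object. A sketch — the BASE value below is a made-up example, and the debuginfo path assumes the matching kernel-debuginfo package is installed:

```shell
ADDR=0xffffffffa02ee20a   # return address from the shutdown message
BASE=0xffffffffa02c0000   # hypothetical xfs module load address (see /proc/modules)
printf 'offset into xfs.ko: %#x\n' $(( ADDR - BASE ))
# -> offset into xfs.ko: 0x2e20a
# Then resolve it, e.g.:
#   addr2line -f -e /usr/lib/debug/lib/modules/$(uname -r)/kernel/fs/xfs/xfs.ko.debug 0x2e20a
```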
2013/4/12 符永涛 <yongtaofu@gmail.com>
> It happened again tonight on one of our servers. How can we debug the root
> cause? Thank you.
>
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22.
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> 0xffffffffa02ee20a
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>
> xfs_repair -n
>
>
> Phase 7 - verify link counts...
> would have reset inode 1316917 nlinks from 0 to 1
> would have reset inode 1317941 nlinks from 0 to 1
> would have reset inode 1320821 nlinks from 0 to 1
> would have reset inode 1322805 nlinks from 0 to 1
> would have reset inode 1332789 nlinks from 0 to 1
> would have reset inode 1336373 nlinks from 0 to 1
> would have reset inode 1345653 nlinks from 0 to 1
> would have reset inode 1354549 nlinks from 0 to 1
> would have reset inode 1361973 nlinks from 0 to 1
> would have reset inode 1369909 nlinks from 0 to 1
> would have reset inode 1372981 nlinks from 0 to 1
> would have reset inode 1388853 nlinks from 0 to 1
> would have reset inode 1402933 nlinks from 0 to 1
> would have reset inode 1403445 nlinks from 0 to 1
> would have reset inode 1420085 nlinks from 0 to 1
> would have reset inode 1452853 nlinks from 0 to 1
> would have reset inode 1456437 nlinks from 0 to 1
> would have reset inode 1457973 nlinks from 0 to 1
> would have reset inode 1459253 nlinks from 0 to 1
> would have reset inode 1467957 nlinks from 0 to 1
> would have reset inode 1471541 nlinks from 0 to 1
> would have reset inode 1476661 nlinks from 0 to 1
> would have reset inode 1479733 nlinks from 0 to 1
> would have reset inode 1483061 nlinks from 0 to 1
> would have reset inode 1484085 nlinks from 0 to 1
> would have reset inode 1486133 nlinks from 0 to 1
> would have reset inode 1489461 nlinks from 0 to 1
> would have reset inode 1490037 nlinks from 0 to 1
> would have reset inode 1492021 nlinks from 0 to 1
> would have reset inode 1493557 nlinks from 0 to 1
> would have reset inode 1494069 nlinks from 0 to 1
> would have reset inode 1496885 nlinks from 0 to 1
> would have reset inode 1498421 nlinks from 0 to 1
> would have reset inode 1498933 nlinks from 0 to 1
> would have reset inode 1499957 nlinks from 0 to 1
> would have reset inode 1506101 nlinks from 0 to 1
> would have reset inode 1507637 nlinks from 0 to 1
> would have reset inode 1510453 nlinks from 0 to 1
> would have reset inode 1514293 nlinks from 0 to 1
> would have reset inode 1517365 nlinks from 0 to 1
> would have reset inode 1520693 nlinks from 0 to 1
> would have reset inode 1521973 nlinks from 0 to 1
> would have reset inode 1530421 nlinks from 0 to 1
> would have reset inode 1530933 nlinks from 0 to 1
> would have reset inode 1537333 nlinks from 0 to 1
> would have reset inode 1538357 nlinks from 0 to 1
> would have reset inode 1548853 nlinks from 0 to 1
> would have reset inode 1553973 nlinks from 0 to 1
> would have reset inode 1557301 nlinks from 0 to 1
> would have reset inode 1564213 nlinks from 0 to 1
> would have reset inode 1564725 nlinks from 0 to 1
> would have reset inode 1576501 nlinks from 0 to 1
> would have reset inode 1580597 nlinks from 0 to 1
> would have reset inode 1584693 nlinks from 0 to 1
> would have reset inode 1586485 nlinks from 0 to 1
> would have reset inode 1589301 nlinks from 0 to 1
> would have reset inode 1589813 nlinks from 0 to 1
> would have reset inode 1592629 nlinks from 0 to 1
> would have reset inode 1595701 nlinks from 0 to 1
> would have reset inode 1601077 nlinks from 0 to 1
> would have reset inode 1623861 nlinks from 0 to 1
> would have reset inode 1626677 nlinks from 0 to 1
> would have reset inode 1627701 nlinks from 0 to 1
> would have reset inode 1633333 nlinks from 0 to 1
> would have reset inode 1639221 nlinks from 0 to 1
> would have reset inode 1649205 nlinks from 0 to 1
> would have reset inode 1686325 nlinks from 0 to 1
> would have reset inode 1690677 nlinks from 0 to 1
> would have reset inode 1693749 nlinks from 0 to 1
> would have reset inode 1704757 nlinks from 0 to 1
> would have reset inode 1707061 nlinks from 0 to 1
> would have reset inode 1709109 nlinks from 0 to 1
> would have reset inode 1719349 nlinks from 0 to 1
> would have reset inode 1737013 nlinks from 0 to 1
> would have reset inode 1741365 nlinks from 0 to 1
> would have reset inode 1747509 nlinks from 0 to 1
> would have reset inode 1770805 nlinks from 0 to 1
> would have reset inode 1780789 nlinks from 0 to 1
> would have reset inode 1793589 nlinks from 0 to 1
> would have reset inode 1795125 nlinks from 0 to 1
> would have reset inode 1800757 nlinks from 0 to 1
> would have reset inode 1801269 nlinks from 0 to 1
> would have reset inode 1802549 nlinks from 0 to 1
> would have reset inode 1804085 nlinks from 0 to 1
> would have reset inode 1817141 nlinks from 0 to 1
> would have reset inode 1821749 nlinks from 0 to 1
> would have reset inode 1832757 nlinks from 0 to 1
> would have reset inode 1836341 nlinks from 0 to 1
> would have reset inode 1856309 nlinks from 0 to 1
> would have reset inode 1900597 nlinks from 0 to 1
> would have reset inode 1902901 nlinks from 0 to 1
> would have reset inode 1912373 nlinks from 0 to 1
> would have reset inode 1943093 nlinks from 0 to 1
> would have reset inode 1944373 nlinks from 0 to 1
> would have reset inode 1954101 nlinks from 0 to 1
> would have reset inode 1955893 nlinks from 0 to 1
> would have reset inode 1961781 nlinks from 0 to 1
> would have reset inode 1974325 nlinks from 0 to 1
> would have reset inode 1978677 nlinks from 0 to 1
> would have reset inode 1981237 nlinks from 0 to 1
> would have reset inode 1992245 nlinks from 0 to 1
> would have reset inode 2000949 nlinks from 0 to 1
> would have reset inode 2002229 nlinks from 0 to 1
> would have reset inode 2004789 nlinks from 0 to 1
> would have reset inode 2005301 nlinks from 0 to 1
> would have reset inode 2011189 nlinks from 0 to 1
> would have reset inode 2012981 nlinks from 0 to 1
> would have reset inode 2015285 nlinks from 0 to 1
> would have reset inode 2018869 nlinks from 0 to 1
> would have reset inode 2028341 nlinks from 0 to 1
> would have reset inode 2028853 nlinks from 0 to 1
> would have reset inode 2030901 nlinks from 0 to 1
> would have reset inode 2032181 nlinks from 0 to 1
> would have reset inode 2032693 nlinks from 0 to 1
> would have reset inode 2040117 nlinks from 0 to 1
> would have reset inode 2053685 nlinks from 0 to 1
> would have reset inode 2083893 nlinks from 0 to 1
> would have reset inode 2087221 nlinks from 0 to 1
> would have reset inode 2095925 nlinks from 0 to 1
> would have reset inode 2098741 nlinks from 0 to 1
> would have reset inode 2100533 nlinks from 0 to 1
> would have reset inode 2101301 nlinks from 0 to 1
> would have reset inode 2123573 nlinks from 0 to 1
> would have reset inode 2132789 nlinks from 0 to 1
> would have reset inode 2133813 nlinks from 0 to 1
>
> 2013/4/10 符永涛 <yongtaofu@gmail.com>
>
>> The storage info is as following:
>> RAID-6
>> SATA HDD
>> Controller: PERC H710P Mini (Embedded)
>> Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
>> 255 heads, 63 sectors/track, 3647334 cylinders
>> Units = cylinders of 16065 * 512 = 8225280 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> Disk identifier: 0x00000000
>>
>> sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
>> sd 0:2:1:0: [sdb] Write Protect is off
>> sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
>> sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
>> support DPO or FUA
>> sd 0:2:1:0: [sdb] Attached SCSI disk
>>
>> *-storage
>> description: RAID bus controller
>> product: MegaRAID SAS 2208 [Thunderbolt]
>> vendor: LSI Logic / Symbios Logic
>> physical id: 0
>> bus info: pci@0000:02:00.0
>> logical name: scsi0
>> version: 01
>> width: 64 bits
>> clock: 33MHz
>> capabilities: storage pm pciexpress vpd msi msix bus_master
>> cap_list rom
>> configuration: driver=megaraid_sas latency=0
>> resources: irq:42 ioport:fc00(size=256) memory:dd7fc000-dd7fffff
>> memory:dd780000-dd7bffff memory:dc800000-dc81ffff(prefetchable)
>> *-disk:0
>> description: SCSI Disk
>> product: PERC H710P
>> vendor: DELL
>> physical id: 2.0.0
>> bus info: scsi@0:2.0.0
>> logical name: /dev/sda
>> version: 3.13
>> serial: 0049d6ce1d9f2035180096fde490f648
>> size: 558GiB (599GB)
>> capabilities: partitioned partitioned:dos
>> configuration: ansiversion=5 signature=000aa336
>> *-disk:1
>> description: SCSI Disk
>> product: PERC H710P
>> vendor: DELL
>> physical id: 2.1.0
>> bus info: scsi@0:2.1.0
>> logical name: /dev/sdb
>> logical name: /mnt/xfsd
>> version: 3.13
>> serial: 003366f71da22035180096fde490f648
>> size: 27TiB (30TB)
>> configuration: ansiversion=5 mount.fstype=xfs
>> mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
>> state=mounted
>>
>> Thank you.
>>
>>
>> 2013/4/10 Emmanuel Florac <eflorac@intellique.com>
>>
>>> On Tue, 9 Apr 2013 23:10:03 +0800
>>> 符永涛 <yongtaofu@gmail.com> wrote:
>>>
>>> > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
>>> > > Shutting down filesystem
>>>
>>> This. I/O error detected. That means that at some point the underlying
>>> device (disk, RAID array, SAN volume) couldn't be reached. So this
>>> could very well be a case of a flaky drive, array, cable or SCSI
>>> driver.
>>>
>>> What's the storage setup here?
>>>
>>> --
>>> ------------------------------------------------------------------------
>>> Emmanuel Florac | Direction technique
>>> | Intellique
>>> | <eflorac@intellique.com>
>>> | +33 1 78 94 84 02
>>> ------------------------------------------------------------------------
>>>
>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-11 19:11 ` 符永涛
2013-04-11 19:55 ` 符永涛
@ 2013-04-11 23:26 ` Brian Foster
2013-04-12 0:45 ` 符永涛
` (2 more replies)
1 sibling, 3 replies; 60+ messages in thread
From: Brian Foster @ 2013-04-11 23:26 UTC (permalink / raw)
To: 符永涛; +Cc: Ben Myers, xfs@oss.sgi.com
On 04/11/2013 03:11 PM, 符永涛 wrote:
> It happened again tonight on one of our servers. How can we debug the
> root cause? Thank you.
>
Hi,
I've attached a SystemTap script (run it with "stap -v xfs.stp") that
should hopefully print out a bit more data if the issue happens again.
Do you have a small enough number of nodes (or a predictable enough
failure pattern) that you could run this on the nodes that tend to fail
and collect the output?
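[Editor's note: the attached script itself does not survive in this archive
extract. As a rough illustration only — the probe points, argument names, and
output format below are assumptions, not the contents of the attached script —
a SystemTap probe for this error path might look like:]

```systemtap
# Hypothetical sketch only -- not the script attached to this mail.
# Log every call to xfs_iunlink_remove() and flag non-zero returns,
# so the inode involved in the EINVAL (error 22) shutdown is recorded.
probe module("xfs").function("xfs_iunlink_remove")
{
    printf("%s(%d): xfs_iunlink_remove ino 0x%x\n",
           execname(), pid(), $ip->i_ino)
}

probe module("xfs").function("xfs_iunlink_remove").return
{
    if ($return != 0)
        printf("xfs_iunlink_remove returned %d\n", $return)
}
```

Kernel debuginfo matching the running kernel must be installed for the
$ip->i_ino target variable to resolve.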
Also, could you collect an xfs_metadump of the filesystem in question
and make it available for download and analysis somewhere? I believe the
ideal approach is to mount/umount the filesystem first to replay the log
before collecting a metadump, but somebody could correct me on that (to
be safe, you could collect multiple dumps: pre-mount and post-mount).
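[Editor's note: the collection procedure described above could look like the
following sketch; /dev/sdb, the mount point, and the output paths are
placeholders matching this thread. xfs_metadump obfuscates filenames by
default, and a dump can be turned back into a sparse image with
xfs_mdrestore.]

```shell
# Pre-/post-mount metadump collection (placeholders, requires root).
umount /mnt/xfsd                                  # fs must be unmounted
xfs_metadump -g /dev/sdb /tmp/sdb-premount.md     # dump before log replay
mount /dev/sdb /mnt/xfsd && umount /mnt/xfsd      # mount cycle replays log
xfs_metadump -g /dev/sdb /tmp/sdb-postmount.md    # dump after log replay
```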
Could you also describe your workload a little bit? Thanks.
Brian
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> xfs_inotobp() returned error 22.
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> 0xffffffffa02ee20a
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
> down filesystem
> Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>
> xfs_repair -n
>
>
> Phase 7 - verify link counts...
> would have reset inode 20021 nlinks from 0 to 1
> would have reset inode 20789 nlinks from 0 to 1
> would have reset inode 35125 nlinks from 0 to 1
> would have reset inode 35637 nlinks from 0 to 1
> would have reset inode 36149 nlinks from 0 to 1
> would have reset inode 38197 nlinks from 0 to 1
> would have reset inode 39477 nlinks from 0 to 1
> would have reset inode 54069 nlinks from 0 to 1
> would have reset inode 62261 nlinks from 0 to 1
> would have reset inode 63029 nlinks from 0 to 1
> would have reset inode 72501 nlinks from 0 to 1
> would have reset inode 79925 nlinks from 0 to 1
> would have reset inode 81205 nlinks from 0 to 1
> would have reset inode 84789 nlinks from 0 to 1
> would have reset inode 87861 nlinks from 0 to 1
> would have reset inode 90663 nlinks from 0 to 1
> would have reset inode 91189 nlinks from 0 to 1
> would have reset inode 95541 nlinks from 0 to 1
> would have reset inode 98101 nlinks from 0 to 1
> would have reset inode 101173 nlinks from 0 to 1
> would have reset inode 113205 nlinks from 0 to 1
> would have reset inode 114741 nlinks from 0 to 1
> would have reset inode 126261 nlinks from 0 to 1
> would have reset inode 140597 nlinks from 0 to 1
> would have reset inode 144693 nlinks from 0 to 1
> would have reset inode 147765 nlinks from 0 to 1
> would have reset inode 152885 nlinks from 0 to 1
> would have reset inode 161333 nlinks from 0 to 1
> would have reset inode 161845 nlinks from 0 to 1
> would have reset inode 167477 nlinks from 0 to 1
> would have reset inode 172341 nlinks from 0 to 1
> would have reset inode 191797 nlinks from 0 to 1
> would have reset inode 204853 nlinks from 0 to 1
> would have reset inode 205365 nlinks from 0 to 1
> would have reset inode 215349 nlinks from 0 to 1
> would have reset inode 215861 nlinks from 0 to 1
> would have reset inode 216373 nlinks from 0 to 1
> would have reset inode 217397 nlinks from 0 to 1
> would have reset inode 224309 nlinks from 0 to 1
> would have reset inode 225589 nlinks from 0 to 1
> would have reset inode 234549 nlinks from 0 to 1
> would have reset inode 234805 nlinks from 0 to 1
> would have reset inode 249653 nlinks from 0 to 1
> would have reset inode 250677 nlinks from 0 to 1
> would have reset inode 252469 nlinks from 0 to 1
> would have reset inode 261429 nlinks from 0 to 1
> would have reset inode 265013 nlinks from 0 to 1
> would have reset inode 266805 nlinks from 0 to 1
> would have reset inode 267317 nlinks from 0 to 1
> would have reset inode 268853 nlinks from 0 to 1
> would have reset inode 272437 nlinks from 0 to 1
> would have reset inode 273205 nlinks from 0 to 1
> would have reset inode 274229 nlinks from 0 to 1
> would have reset inode 278325 nlinks from 0 to 1
> would have reset inode 278837 nlinks from 0 to 1
> would have reset inode 281397 nlinks from 0 to 1
> would have reset inode 292661 nlinks from 0 to 1
> would have reset inode 300853 nlinks from 0 to 1
> would have reset inode 302901 nlinks from 0 to 1
> would have reset inode 305205 nlinks from 0 to 1
> would have reset inode 314165 nlinks from 0 to 1
> would have reset inode 315189 nlinks from 0 to 1
> would have reset inode 320309 nlinks from 0 to 1
> would have reset inode 324917 nlinks from 0 to 1
> would have reset inode 328245 nlinks from 0 to 1
> would have reset inode 335925 nlinks from 0 to 1
> would have reset inode 339253 nlinks from 0 to 1
> would have reset inode 339765 nlinks from 0 to 1
> would have reset inode 348213 nlinks from 0 to 1
> would have reset inode 360501 nlinks from 0 to 1
> would have reset inode 362037 nlinks from 0 to 1
> would have reset inode 366389 nlinks from 0 to 1
> would have reset inode 385845 nlinks from 0 to 1
> would have reset inode 390709 nlinks from 0 to 1
> would have reset inode 409141 nlinks from 0 to 1
> would have reset inode 413237 nlinks from 0 to 1
> would have reset inode 414773 nlinks from 0 to 1
> would have reset inode 417845 nlinks from 0 to 1
> would have reset inode 436021 nlinks from 0 to 1
> would have reset inode 439349 nlinks from 0 to 1
> would have reset inode 447029 nlinks from 0 to 1
> would have reset inode 491317 nlinks from 0 to 1
> would have reset inode 494133 nlinks from 0 to 1
> would have reset inode 495413 nlinks from 0 to 1
> would have reset inode 501301 nlinks from 0 to 1
> would have reset inode 506421 nlinks from 0 to 1
> would have reset inode 508469 nlinks from 0 to 1
> would have reset inode 508981 nlinks from 0 to 1
> would have reset inode 511797 nlinks from 0 to 1
> would have reset inode 513077 nlinks from 0 to 1
> would have reset inode 517941 nlinks from 0 to 1
> would have reset inode 521013 nlinks from 0 to 1
> would have reset inode 522805 nlinks from 0 to 1
> would have reset inode 523317 nlinks from 0 to 1
> would have reset inode 525621 nlinks from 0 to 1
> would have reset inode 527925 nlinks from 0 to 1
> would have reset inode 535605 nlinks from 0 to 1
> would have reset inode 541749 nlinks from 0 to 1
> would have reset inode 573493 nlinks from 0 to 1
> would have reset inode 578613 nlinks from 0 to 1
> would have reset inode 583029 nlinks from 0 to 1
> would have reset inode 585525 nlinks from 0 to 1
> would have reset inode 586293 nlinks from 0 to 1
> would have reset inode 586805 nlinks from 0 to 1
> would have reset inode 591413 nlinks from 0 to 1
> would have reset inode 594485 nlinks from 0 to 1
> would have reset inode 596277 nlinks from 0 to 1
> would have reset inode 603189 nlinks from 0 to 1
> would have reset inode 613429 nlinks from 0 to 1
> would have reset inode 617781 nlinks from 0 to 1
> would have reset inode 621877 nlinks from 0 to 1
> would have reset inode 623925 nlinks from 0 to 1
> would have reset inode 625205 nlinks from 0 to 1
> would have reset inode 626741 nlinks from 0 to 1
> would have reset inode 639541 nlinks from 0 to 1
> would have reset inode 640053 nlinks from 0 to 1
> would have reset inode 640565 nlinks from 0 to 1
> would have reset inode 645173 nlinks from 0 to 1
> would have reset inode 652853 nlinks from 0 to 1
> would have reset inode 656181 nlinks from 0 to 1
> would have reset inode 659253 nlinks from 0 to 1
> would have reset inode 663605 nlinks from 0 to 1
> would have reset inode 667445 nlinks from 0 to 1
> would have reset inode 680757 nlinks from 0 to 1
> would have reset inode 691253 nlinks from 0 to 1
> would have reset inode 691765 nlinks from 0 to 1
> would have reset inode 697653 nlinks from 0 to 1
> would have reset inode 700469 nlinks from 0 to 1
> would have reset inode 707893 nlinks from 0 to 1
> would have reset inode 716853 nlinks from 0 to 1
> would have reset inode 722229 nlinks from 0 to 1
> would have reset inode 722741 nlinks from 0 to 1
> would have reset inode 723765 nlinks from 0 to 1
> would have reset inode 731957 nlinks from 0 to 1
> would have reset inode 742965 nlinks from 0 to 1
> would have reset inode 743477 nlinks from 0 to 1
> would have reset inode 745781 nlinks from 0 to 1
> would have reset inode 746293 nlinks from 0 to 1
> would have reset inode 774453 nlinks from 0 to 1
> would have reset inode 778805 nlinks from 0 to 1
> would have reset inode 785013 nlinks from 0 to 1
> would have reset inode 785973 nlinks from 0 to 1
> would have reset inode 791349 nlinks from 0 to 1
> would have reset inode 796981 nlinks from 0 to 1
> would have reset inode 803381 nlinks from 0 to 1
> would have reset inode 806965 nlinks from 0 to 1
> would have reset inode 811798 nlinks from 0 to 1
> would have reset inode 812310 nlinks from 0 to 1
> would have reset inode 813078 nlinks from 0 to 1
> would have reset inode 813607 nlinks from 0 to 1
> would have reset inode 814183 nlinks from 0 to 1
> would have reset inode 822069 nlinks from 0 to 1
> would have reset inode 828469 nlinks from 0 to 1
> would have reset inode 830005 nlinks from 0 to 1
> would have reset inode 832053 nlinks from 0 to 1
> would have reset inode 832565 nlinks from 0 to 1
> would have reset inode 836661 nlinks from 0 to 1
> would have reset inode 841013 nlinks from 0 to 1
> would have reset inode 841525 nlinks from 0 to 1
> would have reset inode 845365 nlinks from 0 to 1
> would have reset inode 846133 nlinks from 0 to 1
> would have reset inode 847157 nlinks from 0 to 1
> would have reset inode 852533 nlinks from 0 to 1
> [... remainder of the xfs_repair inode list snipped; it repeats the run
> quoted earlier above (inodes 857141 through 2133813) ...]
>
>
>
>
>
> 2013/4/10 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>
> The storage info is as following:
> RAID-6
> SATA HDD
> Controller: PERC H710P Mini (Embedded)
> Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
> 255 heads, 63 sectors/track, 3647334 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2
> TiB)
> sd 0:2:1:0: [sdb] Write Protect is off
> sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
> sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> sd 0:2:1:0: [sdb] Attached SCSI disk
>
> *-storage
> description: RAID bus controller
> product: MegaRAID SAS 2208 [Thunderbolt]
> vendor: LSI Logic / Symbios Logic
> physical id: 0
> bus info: pci@0000:02:00.0
> logical name: scsi0
> version: 01
> width: 64 bits
> clock: 33MHz
> capabilities: storage pm pciexpress vpd msi msix bus_master
> cap_list rom
> configuration: driver=megaraid_sas latency=0
> resources: irq:42 ioport:fc00(size=256)
> memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
> memory:dc800000-dc81ffff(prefetchable)
> *-disk:0
> description: SCSI Disk
> product: PERC H710P
> vendor: DELL
> physical id: 2.0.0
> bus info: scsi@0:2.0.0
> logical name: /dev/sda
> version: 3.13
> serial: 0049d6ce1d9f2035180096fde490f648
> size: 558GiB (599GB)
> capabilities: partitioned partitioned:dos
> configuration: ansiversion=5 signature=000aa336
> *-disk:1
> description: SCSI Disk
> product: PERC H710P
> vendor: DELL
> physical id: 2.1.0
> bus info: scsi@0:2.1.0
> logical name: /dev/sdb
> logical name: /mnt/xfsd
> version: 3.13
> serial: 003366f71da22035180096fde490f648
> size: 27TiB (30TB)
> configuration: ansiversion=5 mount.fstype=xfs
> mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
> state=mounted
>
> Thank you.
>
>
> 2013/4/10 Emmanuel Florac <eflorac@intellique.com
> <mailto:eflorac@intellique.com>>
>
> Le Tue, 9 Apr 2013 23:10:03 +0800
> 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>> écrivait:
>
> > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
> > > Shutting down filesystem
>
> This. I/O error detected. That means that at some point the
> underlying
> device (disk, RAID array, SAN volume) couldn't be reached. So this
> could very well be a case of a flakey drive, array, cable or SCSI
> driver.
>
> What's the storage setup here?
>
> --
> ------------------------------------------------------------------------
> Emmanuel Florac | Direction technique
> | Intellique
> | <eflorac@intellique.com
> <mailto:eflorac@intellique.com>>
> | +33 1 78 94 84 02
> ------------------------------------------------------------------------
>
>
>
>
> --
> 符永涛
>
>
>
>
> --
> 符永涛
>
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
>
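[Editor's aside: the disk-size figures in the quoted fdisk and sd driver output above are mutually consistent, which makes a mis-sized block device an unlikely culprit. A quick illustrative arithmetic check (not part of the original thread):]

```python
# Cross-check of the quoted kernel/fdisk output for /dev/sdb:
# 58594426880 logical blocks of 512 bytes each.
blocks = 58594426880
bytes_total = blocks * 512

# Matches "Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes" from fdisk...
assert bytes_total == 30000346562560

# ...and the sd driver's "(30.0 TB/27.2 TiB)" (the driver truncates;
# rounding gives 27.3).
print(round(bytes_total / 10**12, 1))  # decimal terabytes: 30.0
print(round(bytes_total / 2**40, 1))   # binary tebibytes: 27.3
```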
[-- Attachment #2: xfs.stp --]
[-- Type: text/plain, Size: 3402 bytes --]
/*
* unlink path
*/
/* assert that nlink == 0 on addition to the unlinked inode list */
probe module("xfs").function("xfs_iunlink")
{
if ($ip->i_d->di_nlink == 0)
next;
printf("\n--- %s\n", probefunc());
printf("vars: %s\n", $$vars);
printf("ip: i_ino = 0x%x, i_flags = 0x%x\n", $ip->i_ino, $ip->i_flags);
printf("ip->i_d: di_nlink = 0x%x, di_gen = 0x%x\n", $ip->i_d->di_nlink, $ip->i_d->di_gen);
printf("kernel backtrace:\n");
print_backtrace();
printf("user backtrace:\n");
print_ubacktrace();
}
/* can we check the nlink count when an agi buffer is written out?? */
/*
* reclaim path
*/
/* if we failed to remove an unlinked inode, what inode caused us to blow up? */
/*probe module("xfs").statement("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1779")*/
probe module("xfs").function("xfs_iunlink_remove").return
{
/* EINVAL */
if (returnval() != 22)
next;
printf("\n--- %s -- %s -- %s\n", probefunc(), pp(), $$return);
printf("vars: %s\n", $$vars);
printf("ip: i_ino = 0x%x, i_flags = 0x%x\n", $ip->i_ino, $ip->i_flags);
printf("ip->i_d: di_nlink = 0x%x, di_gen = 0x%x\n", $ip->i_d->di_nlink, $ip->i_d->di_gen);
}
/*
# Capture an EINVAL return from xfs_imap() based on an invalid ino.
probe module("xfs").statement("xfs_imap@fs/xfs/xfs_ialloc.c:1283")
{
//if (($agno < $mp->m_sb->sb_agcount) && ($agbno < $mp->m_sb->sb_agblocks))
//next;
printf("\n--- %s -- %s -- EINVAL 1\n", probefunc(), pp());
printf("vars: %s\n", $$vars);
printf("mp: m_agno_log = 0x%x, m_agino_log = 0x%x\n", $mp->m_agno_log, $mp->m_agino_log);
printf("mp->m_sb: sb_agcount = 0x%x, sb_agblocks = 0x%x, sb_inopblog = 0x%x, sb_agblklog = 0x%x\n",
$mp->m_sb->sb_agcount, $mp->m_sb->sb_agblocks, $mp->m_sb->sb_inopblog, $mp->m_sb->sb_agblklog);
printf("kernel backtrace:\n");
print_backtrace();
printf("user backtrace:\n");
print_ubacktrace();
}
*/
probe module("xfs").function("xfs_imap").return
{
/* EINVAL */
if (returnval() != 22)
next;
printf("\n--- %s -- %s -- %s\n", probefunc(), pp(), $$return);
printf("vars: %s\n", $$vars);
printf("mp: m_agno_log = 0x%x, m_agino_log = 0x%x\n", $mp->m_agno_log, $mp->m_agino_log);
printf("mp->m_sb: sb_agcount = 0x%x, sb_agblocks = 0x%x, sb_inopblog = 0x%x, sb_agblklog = 0x%x, sb_dblocks = 0x%x\n",
$mp->m_sb->sb_agcount, $mp->m_sb->sb_agblocks, $mp->m_sb->sb_inopblog, $mp->m_sb->sb_agblklog, $mp->m_sb->sb_dblocks);
/*
* this doesn't appear to print valid data (at least if I'm modifying
* the code explicitly), but alas there is an xfs_alert() if invalid.
*/
printf("imap: im_blkno = 0x%x, im_len = 0x%x, im_boffset = 0x%x\n",
$imap->im_blkno, $imap->im_len, $imap->im_boffset);
printf("kernel backtrace:\n");
print_backtrace();
printf("user backtrace:\n");
print_ubacktrace();
}
/*
# Capture an EINVAL return from xfs_imap() based on an invalid imap.
probe module("xfs").statement("xfs_imap@fs/xfs/xfs_ialloc.c+115")
{
printf("\n--- %s -- %s -- EINVAL 2\n", probefunc(), pp());
printf("vars: %s\n", $$vars);
printf("imap: im_blkno = 0x%x, im_len = 0x%x, im_boffset = 0x%x\n",
$imap->im_blkno, $imap->im_len, $imap->im_boffset);
printf("mp->m_sb: sb_dblocks = 0x%x\n", $mp->m_sb->sb_dblocks);
printf("kernel backtrace:\n");
print_backtrace();
printf("user backtrace:\n");
print_ubacktrace();
}
*/
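[Editor's aside on the numeric error codes the script above keys on: these are standard Linux errno values, not anything XFS-specific. 22 is EINVAL, which is what xfs_inotobp()/xfs_imap() return when an inode number fails validation, and the repeating "error 5" from xfs_log_force after the forced shutdown is EIO. An illustrative check:]

```python
import errno
import os

# 22 == EINVAL: the "returned error 22" in the xfs_iunlink_remove message,
# i.e. the inode number failed validation in xfs_imap()/xfs_inotobp().
assert errno.EINVAL == 22
print(os.strerror(errno.EINVAL))  # "Invalid argument"

# 5 == EIO: the "xfs_log_force: error 5" lines after the shutdown.
assert errno.EIO == 5
print(os.strerror(errno.EIO))     # "Input/output error"
```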
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-11 23:26 ` Brian Foster
@ 2013-04-12 0:45 ` 符永涛
2013-04-12 12:50 ` Brian Foster
2013-04-12 1:07 ` Eric Sandeen
2013-04-12 4:32 ` 符永涛
2 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 0:45 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 26144 bytes --]
The workload is roughly as follows:
24 servers, replica 3, which means the distribute count is 8.
The load is about 3 TB to 8 TB per day.
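[Editor's aside: the distribute count follows from the server and replica counts. A tiny sketch of that arithmetic, assuming a plain distributed-replicated GlusterFS layout with one brick per server:]

```python
# In a distributed-replicated GlusterFS volume, bricks are grouped into
# replica sets, and files are distributed across those sets. With one
# brick per server (an assumption here), the distribute count is:
servers = 24
replica = 3
distribute = servers // replica
print(distribute)  # 8
```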
2013/4/12 Brian Foster <bfoster@redhat.com>
> On 04/11/2013 03:11 PM, 符永涛 wrote:
> > It happens tonight again on one of our servers, how to debug the root
> > cause? Thank you.
> >
>
> Hi,
>
> I've attached a system tap script (stap -v xfs.stp) that should
> hopefully print out a bit more data should the issue happen again. Do
> you have a small enough number of nodes (or predictable enough pattern)
> that you could run this on the nodes that tend to fail and collect the
> output?
>
> Also, could you collect an xfs_metadump of the filesystem in question
> and make it available for download and analysis somewhere? I believe the
> ideal approach is to mount/umount the filesystem first to replay the log
> before collecting a metadump, but somebody could correct me on that (to
> be safe, you could collect multiple dumps: pre-mount and post-mount).
>
> Could you also describe your workload a little bit? Thanks.
>
> Brian
>
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > xfs_inotobp() returned error 22.
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> > error 22
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> > 0xffffffffa02ee20a
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
> > down filesystem
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
> > rectify the problem(s)
> > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >
> > xfs_repair -n
> >
> >
> > Phase 7 - verify link counts...
> > would have reset inode 20021 nlinks from 0 to 1
> > would have reset inode 20789 nlinks from 0 to 1
> > would have reset inode 35125 nlinks from 0 to 1
> > would have reset inode 35637 nlinks from 0 to 1
> > would have reset inode 36149 nlinks from 0 to 1
> > would have reset inode 38197 nlinks from 0 to 1
> > would have reset inode 39477 nlinks from 0 to 1
> > would have reset inode 54069 nlinks from 0 to 1
> > would have reset inode 62261 nlinks from 0 to 1
> > would have reset inode 63029 nlinks from 0 to 1
> > would have reset inode 72501 nlinks from 0 to 1
> > would have reset inode 79925 nlinks from 0 to 1
> > would have reset inode 81205 nlinks from 0 to 1
> > would have reset inode 84789 nlinks from 0 to 1
> > would have reset inode 87861 nlinks from 0 to 1
> > would have reset inode 90663 nlinks from 0 to 1
> > would have reset inode 91189 nlinks from 0 to 1
> > would have reset inode 95541 nlinks from 0 to 1
> > would have reset inode 98101 nlinks from 0 to 1
> > would have reset inode 101173 nlinks from 0 to 1
> > would have reset inode 113205 nlinks from 0 to 1
> > would have reset inode 114741 nlinks from 0 to 1
> > would have reset inode 126261 nlinks from 0 to 1
> > would have reset inode 140597 nlinks from 0 to 1
> > would have reset inode 144693 nlinks from 0 to 1
> > would have reset inode 147765 nlinks from 0 to 1
> > would have reset inode 152885 nlinks from 0 to 1
> > would have reset inode 161333 nlinks from 0 to 1
> > would have reset inode 161845 nlinks from 0 to 1
> > would have reset inode 167477 nlinks from 0 to 1
> > would have reset inode 172341 nlinks from 0 to 1
> > would have reset inode 191797 nlinks from 0 to 1
> > would have reset inode 204853 nlinks from 0 to 1
> > would have reset inode 205365 nlinks from 0 to 1
> > would have reset inode 215349 nlinks from 0 to 1
> > would have reset inode 215861 nlinks from 0 to 1
> > would have reset inode 216373 nlinks from 0 to 1
> > would have reset inode 217397 nlinks from 0 to 1
> > would have reset inode 224309 nlinks from 0 to 1
> > would have reset inode 225589 nlinks from 0 to 1
> > would have reset inode 234549 nlinks from 0 to 1
> > would have reset inode 234805 nlinks from 0 to 1
> > would have reset inode 249653 nlinks from 0 to 1
> > would have reset inode 250677 nlinks from 0 to 1
> > would have reset inode 252469 nlinks from 0 to 1
> > would have reset inode 261429 nlinks from 0 to 1
> > would have reset inode 265013 nlinks from 0 to 1
> > would have reset inode 266805 nlinks from 0 to 1
> > would have reset inode 267317 nlinks from 0 to 1
> > would have reset inode 268853 nlinks from 0 to 1
> > would have reset inode 272437 nlinks from 0 to 1
> > would have reset inode 273205 nlinks from 0 to 1
> > would have reset inode 274229 nlinks from 0 to 1
> > would have reset inode 278325 nlinks from 0 to 1
> > would have reset inode 278837 nlinks from 0 to 1
> > would have reset inode 281397 nlinks from 0 to 1
> > would have reset inode 292661 nlinks from 0 to 1
> > would have reset inode 300853 nlinks from 0 to 1
> > would have reset inode 302901 nlinks from 0 to 1
> > would have reset inode 305205 nlinks from 0 to 1
> > would have reset inode 314165 nlinks from 0 to 1
> > would have reset inode 315189 nlinks from 0 to 1
> > would have reset inode 320309 nlinks from 0 to 1
> > would have reset inode 324917 nlinks from 0 to 1
> > would have reset inode 328245 nlinks from 0 to 1
> > would have reset inode 335925 nlinks from 0 to 1
> > would have reset inode 339253 nlinks from 0 to 1
> > would have reset inode 339765 nlinks from 0 to 1
> > would have reset inode 348213 nlinks from 0 to 1
> > would have reset inode 360501 nlinks from 0 to 1
> > would have reset inode 362037 nlinks from 0 to 1
> > would have reset inode 366389 nlinks from 0 to 1
> > would have reset inode 385845 nlinks from 0 to 1
> > would have reset inode 390709 nlinks from 0 to 1
> > would have reset inode 409141 nlinks from 0 to 1
> > would have reset inode 413237 nlinks from 0 to 1
> > would have reset inode 414773 nlinks from 0 to 1
> > would have reset inode 417845 nlinks from 0 to 1
> > would have reset inode 436021 nlinks from 0 to 1
> > would have reset inode 439349 nlinks from 0 to 1
> > would have reset inode 447029 nlinks from 0 to 1
> > would have reset inode 491317 nlinks from 0 to 1
> > would have reset inode 494133 nlinks from 0 to 1
> > would have reset inode 495413 nlinks from 0 to 1
> > would have reset inode 501301 nlinks from 0 to 1
> > would have reset inode 506421 nlinks from 0 to 1
> > would have reset inode 508469 nlinks from 0 to 1
> > would have reset inode 508981 nlinks from 0 to 1
> > would have reset inode 511797 nlinks from 0 to 1
> > would have reset inode 513077 nlinks from 0 to 1
> > would have reset inode 517941 nlinks from 0 to 1
> > would have reset inode 521013 nlinks from 0 to 1
> > would have reset inode 522805 nlinks from 0 to 1
> > would have reset inode 523317 nlinks from 0 to 1
> > would have reset inode 525621 nlinks from 0 to 1
> > would have reset inode 527925 nlinks from 0 to 1
> > would have reset inode 535605 nlinks from 0 to 1
> > would have reset inode 541749 nlinks from 0 to 1
> > would have reset inode 573493 nlinks from 0 to 1
> > would have reset inode 578613 nlinks from 0 to 1
> > would have reset inode 583029 nlinks from 0 to 1
> > would have reset inode 585525 nlinks from 0 to 1
> > would have reset inode 586293 nlinks from 0 to 1
> > would have reset inode 586805 nlinks from 0 to 1
> > would have reset inode 591413 nlinks from 0 to 1
> > would have reset inode 594485 nlinks from 0 to 1
> > would have reset inode 596277 nlinks from 0 to 1
> > would have reset inode 603189 nlinks from 0 to 1
> > would have reset inode 613429 nlinks from 0 to 1
> > would have reset inode 617781 nlinks from 0 to 1
> > would have reset inode 621877 nlinks from 0 to 1
> > would have reset inode 623925 nlinks from 0 to 1
> > would have reset inode 625205 nlinks from 0 to 1
> > would have reset inode 626741 nlinks from 0 to 1
> > would have reset inode 639541 nlinks from 0 to 1
> > would have reset inode 640053 nlinks from 0 to 1
> > would have reset inode 640565 nlinks from 0 to 1
> > would have reset inode 645173 nlinks from 0 to 1
> > would have reset inode 652853 nlinks from 0 to 1
> > would have reset inode 656181 nlinks from 0 to 1
> > would have reset inode 659253 nlinks from 0 to 1
> > would have reset inode 663605 nlinks from 0 to 1
> > would have reset inode 667445 nlinks from 0 to 1
> > would have reset inode 680757 nlinks from 0 to 1
> > would have reset inode 691253 nlinks from 0 to 1
> > would have reset inode 691765 nlinks from 0 to 1
> > would have reset inode 697653 nlinks from 0 to 1
> > would have reset inode 700469 nlinks from 0 to 1
> > would have reset inode 707893 nlinks from 0 to 1
> > would have reset inode 716853 nlinks from 0 to 1
> > would have reset inode 722229 nlinks from 0 to 1
> > would have reset inode 722741 nlinks from 0 to 1
> > would have reset inode 723765 nlinks from 0 to 1
> > would have reset inode 731957 nlinks from 0 to 1
> > would have reset inode 742965 nlinks from 0 to 1
> > would have reset inode 743477 nlinks from 0 to 1
> > would have reset inode 745781 nlinks from 0 to 1
> > would have reset inode 746293 nlinks from 0 to 1
> > would have reset inode 774453 nlinks from 0 to 1
> > would have reset inode 778805 nlinks from 0 to 1
> > would have reset inode 785013 nlinks from 0 to 1
> > would have reset inode 785973 nlinks from 0 to 1
> > would have reset inode 791349 nlinks from 0 to 1
> > would have reset inode 796981 nlinks from 0 to 1
> > would have reset inode 803381 nlinks from 0 to 1
> > would have reset inode 806965 nlinks from 0 to 1
> > would have reset inode 811798 nlinks from 0 to 1
> > would have reset inode 812310 nlinks from 0 to 1
> > would have reset inode 813078 nlinks from 0 to 1
> > would have reset inode 813607 nlinks from 0 to 1
> > would have reset inode 814183 nlinks from 0 to 1
> > would have reset inode 822069 nlinks from 0 to 1
> > would have reset inode 828469 nlinks from 0 to 1
> > would have reset inode 830005 nlinks from 0 to 1
> > would have reset inode 832053 nlinks from 0 to 1
> > would have reset inode 832565 nlinks from 0 to 1
> > would have reset inode 836661 nlinks from 0 to 1
> > would have reset inode 841013 nlinks from 0 to 1
> > would have reset inode 841525 nlinks from 0 to 1
> > would have reset inode 845365 nlinks from 0 to 1
> > would have reset inode 846133 nlinks from 0 to 1
> > would have reset inode 847157 nlinks from 0 to 1
> > would have reset inode 852533 nlinks from 0 to 1
> > would have reset inode 857141 nlinks from 0 to 1
> > would have reset inode 863271 nlinks from 0 to 1
> > would have reset inode 866855 nlinks from 0 to 1
> > would have reset inode 887861 nlinks from 0 to 1
> > would have reset inode 891701 nlinks from 0 to 1
> > would have reset inode 894773 nlinks from 0 to 1
> > would have reset inode 900149 nlinks from 0 to 1
> > would have reset inode 902197 nlinks from 0 to 1
> > would have reset inode 906293 nlinks from 0 to 1
> > would have reset inode 906805 nlinks from 0 to 1
> > would have reset inode 909877 nlinks from 0 to 1
> > would have reset inode 925493 nlinks from 0 to 1
> > would have reset inode 949543 nlinks from 0 to 1
> > would have reset inode 955175 nlinks from 0 to 1
> > would have reset inode 963623 nlinks from 0 to 1
> > would have reset inode 967733 nlinks from 0 to 1
> > would have reset inode 968231 nlinks from 0 to 1
> > would have reset inode 982069 nlinks from 0 to 1
> > would have reset inode 1007413 nlinks from 0 to 1
> > would have reset inode 1011509 nlinks from 0 to 1
> > would have reset inode 1014069 nlinks from 0 to 1
> > would have reset inode 1014581 nlinks from 0 to 1
> > would have reset inode 1022005 nlinks from 0 to 1
> > would have reset inode 1022517 nlinks from 0 to 1
> > would have reset inode 1023029 nlinks from 0 to 1
> > would have reset inode 1025333 nlinks from 0 to 1
> > would have reset inode 1043765 nlinks from 0 to 1
> > would have reset inode 1044789 nlinks from 0 to 1
> > would have reset inode 1049397 nlinks from 0 to 1
> > would have reset inode 1050933 nlinks from 0 to 1
> > would have reset inode 1051445 nlinks from 0 to 1
> > would have reset inode 1054261 nlinks from 0 to 1
> > would have reset inode 1060917 nlinks from 0 to 1
> > would have reset inode 1063477 nlinks from 0 to 1
> > would have reset inode 1076021 nlinks from 0 to 1
> > would have reset inode 1081141 nlinks from 0 to 1
> > would have reset inode 1086261 nlinks from 0 to 1
> > would have reset inode 1097269 nlinks from 0 to 1
> > would have reset inode 1099829 nlinks from 0 to 1
> > would have reset inode 1100853 nlinks from 0 to 1
> > would have reset inode 1101877 nlinks from 0 to 1
> > would have reset inode 1126709 nlinks from 0 to 1
> > would have reset inode 1134389 nlinks from 0 to 1
> > would have reset inode 1141045 nlinks from 0 to 1
> > would have reset inode 1141557 nlinks from 0 to 1
> > would have reset inode 1142581 nlinks from 0 to 1
> > would have reset inode 1148469 nlinks from 0 to 1
> > would have reset inode 1153333 nlinks from 0 to 1
> > would have reset inode 1181749 nlinks from 0 to 1
> > would have reset inode 1192245 nlinks from 0 to 1
> > would have reset inode 1198133 nlinks from 0 to 1
> > would have reset inode 1203765 nlinks from 0 to 1
> > would have reset inode 1221429 nlinks from 0 to 1
> > would have reset inode 1223989 nlinks from 0 to 1
> > would have reset inode 1235509 nlinks from 0 to 1
> > would have reset inode 1239349 nlinks from 0 to 1
> > would have reset inode 1240885 nlinks from 0 to 1
> > would have reset inode 1241397 nlinks from 0 to 1
> > would have reset inode 1241909 nlinks from 0 to 1
> > would have reset inode 1242421 nlinks from 0 to 1
> > would have reset inode 1244981 nlinks from 0 to 1
> > would have reset inode 1246517 nlinks from 0 to 1
> > would have reset inode 1253429 nlinks from 0 to 1
> > would have reset inode 1271861 nlinks from 0 to 1
> > would have reset inode 1274677 nlinks from 0 to 1
> > would have reset inode 1277749 nlinks from 0 to 1
> > would have reset inode 1278773 nlinks from 0 to 1
> > would have reset inode 1286709 nlinks from 0 to 1
> > would have reset inode 1288245 nlinks from 0 to 1
> > would have reset inode 1299765 nlinks from 0 to 1
> > would have reset inode 1302325 nlinks from 0 to 1
> > would have reset inode 1304885 nlinks from 0 to 1
> > would have reset inode 1305397 nlinks from 0 to 1
> > would have reset inode 1307509 nlinks from 0 to 1
> > would have reset inode 1309493 nlinks from 0 to 1
> > would have reset inode 1310517 nlinks from 0 to 1
> > would have reset inode 1311029 nlinks from 0 to 1
> > would have reset inode 1312053 nlinks from 0 to 1
> > would have reset inode 1316917 nlinks from 0 to 1
> > would have reset inode 1317941 nlinks from 0 to 1
> > would have reset inode 1320821 nlinks from 0 to 1
> > would have reset inode 1322805 nlinks from 0 to 1
> > would have reset inode 1332789 nlinks from 0 to 1
> > would have reset inode 1336373 nlinks from 0 to 1
> > would have reset inode 1345653 nlinks from 0 to 1
> > would have reset inode 1354549 nlinks from 0 to 1
> > would have reset inode 1361973 nlinks from 0 to 1
> > would have reset inode 1369909 nlinks from 0 to 1
> > would have reset inode 1372981 nlinks from 0 to 1
> > would have reset inode 1388853 nlinks from 0 to 1
> > would have reset inode 1402933 nlinks from 0 to 1
> > would have reset inode 1403445 nlinks from 0 to 1
> > would have reset inode 1420085 nlinks from 0 to 1
> > would have reset inode 1452853 nlinks from 0 to 1
> > would have reset inode 1456437 nlinks from 0 to 1
> > would have reset inode 1457973 nlinks from 0 to 1
> > would have reset inode 1459253 nlinks from 0 to 1
> > would have reset inode 1467957 nlinks from 0 to 1
> > would have reset inode 1471541 nlinks from 0 to 1
> > would have reset inode 1476661 nlinks from 0 to 1
> > would have reset inode 1479733 nlinks from 0 to 1
> > would have reset inode 1483061 nlinks from 0 to 1
> > would have reset inode 1484085 nlinks from 0 to 1
> > would have reset inode 1486133 nlinks from 0 to 1
> > would have reset inode 1489461 nlinks from 0 to 1
> > would have reset inode 1490037 nlinks from 0 to 1
> > would have reset inode 1492021 nlinks from 0 to 1
> > would have reset inode 1493557 nlinks from 0 to 1
> > would have reset inode 1494069 nlinks from 0 to 1
> > would have reset inode 1496885 nlinks from 0 to 1
> > would have reset inode 1498421 nlinks from 0 to 1
> > would have reset inode 1498933 nlinks from 0 to 1
> > would have reset inode 1499957 nlinks from 0 to 1
> > would have reset inode 1506101 nlinks from 0 to 1
> > would have reset inode 1507637 nlinks from 0 to 1
> > would have reset inode 1510453 nlinks from 0 to 1
> > would have reset inode 1514293 nlinks from 0 to 1
> > would have reset inode 1517365 nlinks from 0 to 1
> > would have reset inode 1520693 nlinks from 0 to 1
> > would have reset inode 1521973 nlinks from 0 to 1
> > would have reset inode 1530421 nlinks from 0 to 1
> > would have reset inode 1530933 nlinks from 0 to 1
> > would have reset inode 1537333 nlinks from 0 to 1
> > would have reset inode 1538357 nlinks from 0 to 1
> > would have reset inode 1548853 nlinks from 0 to 1
> > would have reset inode 1553973 nlinks from 0 to 1
> > would have reset inode 1557301 nlinks from 0 to 1
> > would have reset inode 1564213 nlinks from 0 to 1
> > would have reset inode 1564725 nlinks from 0 to 1
> > would have reset inode 1576501 nlinks from 0 to 1
> > would have reset inode 1580597 nlinks from 0 to 1
> > would have reset inode 1584693 nlinks from 0 to 1
> > would have reset inode 1586485 nlinks from 0 to 1
> > would have reset inode 1589301 nlinks from 0 to 1
> > would have reset inode 1589813 nlinks from 0 to 1
> > would have reset inode 1592629 nlinks from 0 to 1
> > would have reset inode 1595701 nlinks from 0 to 1
> > would have reset inode 1601077 nlinks from 0 to 1
> > would have reset inode 1623861 nlinks from 0 to 1
> > would have reset inode 1626677 nlinks from 0 to 1
> > would have reset inode 1627701 nlinks from 0 to 1
> > would have reset inode 1633333 nlinks from 0 to 1
> > would have reset inode 1639221 nlinks from 0 to 1
> > would have reset inode 1649205 nlinks from 0 to 1
> > would have reset inode 1686325 nlinks from 0 to 1
> > would have reset inode 1690677 nlinks from 0 to 1
> > would have reset inode 1693749 nlinks from 0 to 1
> > would have reset inode 1704757 nlinks from 0 to 1
> > would have reset inode 1707061 nlinks from 0 to 1
> > would have reset inode 1709109 nlinks from 0 to 1
> > would have reset inode 1719349 nlinks from 0 to 1
> > would have reset inode 1737013 nlinks from 0 to 1
> > would have reset inode 1741365 nlinks from 0 to 1
> > would have reset inode 1747509 nlinks from 0 to 1
> > would have reset inode 1770805 nlinks from 0 to 1
> > would have reset inode 1780789 nlinks from 0 to 1
> > would have reset inode 1793589 nlinks from 0 to 1
> > would have reset inode 1795125 nlinks from 0 to 1
> > would have reset inode 1800757 nlinks from 0 to 1
> > would have reset inode 1801269 nlinks from 0 to 1
> > would have reset inode 1802549 nlinks from 0 to 1
> > would have reset inode 1804085 nlinks from 0 to 1
> > would have reset inode 1817141 nlinks from 0 to 1
> > would have reset inode 1821749 nlinks from 0 to 1
> > would have reset inode 1832757 nlinks from 0 to 1
> > would have reset inode 1836341 nlinks from 0 to 1
> > would have reset inode 1856309 nlinks from 0 to 1
> > would have reset inode 1900597 nlinks from 0 to 1
> > would have reset inode 1902901 nlinks from 0 to 1
> > would have reset inode 1912373 nlinks from 0 to 1
> > would have reset inode 1943093 nlinks from 0 to 1
> > would have reset inode 1944373 nlinks from 0 to 1
> > would have reset inode 1954101 nlinks from 0 to 1
> > would have reset inode 1955893 nlinks from 0 to 1
> > would have reset inode 1961781 nlinks from 0 to 1
> > would have reset inode 1974325 nlinks from 0 to 1
> > would have reset inode 1978677 nlinks from 0 to 1
> > would have reset inode 1981237 nlinks from 0 to 1
> > would have reset inode 1992245 nlinks from 0 to 1
> > would have reset inode 2000949 nlinks from 0 to 1
> > would have reset inode 2002229 nlinks from 0 to 1
> > would have reset inode 2004789 nlinks from 0 to 1
> > would have reset inode 2005301 nlinks from 0 to 1
> > would have reset inode 2011189 nlinks from 0 to 1
> > would have reset inode 2012981 nlinks from 0 to 1
> > would have reset inode 2015285 nlinks from 0 to 1
> > would have reset inode 2018869 nlinks from 0 to 1
> > would have reset inode 2028341 nlinks from 0 to 1
> > would have reset inode 2028853 nlinks from 0 to 1
> > would have reset inode 2030901 nlinks from 0 to 1
> > would have reset inode 2032181 nlinks from 0 to 1
> > would have reset inode 2032693 nlinks from 0 to 1
> > would have reset inode 2040117 nlinks from 0 to 1
> > would have reset inode 2053685 nlinks from 0 to 1
> > would have reset inode 2083893 nlinks from 0 to 1
> > would have reset inode 2087221 nlinks from 0 to 1
> > would have reset inode 2095925 nlinks from 0 to 1
> > would have reset inode 2098741 nlinks from 0 to 1
> > would have reset inode 2100533 nlinks from 0 to 1
> > would have reset inode 2101301 nlinks from 0 to 1
> > would have reset inode 2123573 nlinks from 0 to 1
> > would have reset inode 2132789 nlinks from 0 to 1
> > would have reset inode 2133813 nlinks from 0 to 1
> >
> >
> >
> >
> >
> > 2013/4/10 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> >
> > The storage info is as following:
> > RAID-6
> > SATA HDD
> > Controller: PERC H710P Mini (Embedded)
> > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
> > 255 heads, 63 sectors/track, 3647334 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Sector size (logical/physical): 512 bytes / 512 bytes
> > I/O size (minimum/optimal): 512 bytes / 512 bytes
> > Disk identifier: 0x00000000
> >
> > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2
> > TiB)
> > sd 0:2:1:0: [sdb] Write Protect is off
> > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
> > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
> > support DPO or FUA
> > sd 0:2:1:0: [sdb] Attached SCSI disk
> >
> > *-storage
> > description: RAID bus controller
> > product: MegaRAID SAS 2208 [Thunderbolt]
> > vendor: LSI Logic / Symbios Logic
> > physical id: 0
> > bus info: pci@0000:02:00.0
> > logical name: scsi0
> > version: 01
> > width: 64 bits
> > clock: 33MHz
> > capabilities: storage pm pciexpress vpd msi msix bus_master
> > cap_list rom
> > configuration: driver=megaraid_sas latency=0
> > resources: irq:42 ioport:fc00(size=256)
> > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
> > memory:dc800000-dc81ffff(prefetchable)
> > *-disk:0
> > description: SCSI Disk
> > product: PERC H710P
> > vendor: DELL
> > physical id: 2.0.0
> > bus info: scsi@0:2.0.0
> > logical name: /dev/sda
> > version: 3.13
> > serial: 0049d6ce1d9f2035180096fde490f648
> > size: 558GiB (599GB)
> > capabilities: partitioned partitioned:dos
> > configuration: ansiversion=5 signature=000aa336
> > *-disk:1
> > description: SCSI Disk
> > product: PERC H710P
> > vendor: DELL
> > physical id: 2.1.0
> > bus info: scsi@0:2.1.0
> > logical name: /dev/sdb
> > logical name: /mnt/xfsd
> > version: 3.13
> > serial: 003366f71da22035180096fde490f648
> > size: 27TiB (30TB)
> > configuration: ansiversion=5 mount.fstype=xfs
> > mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
> > state=mounted
> >
> > Thank you.
> >
> >
> > 2013/4/10 Emmanuel Florac <eflorac@intellique.com
> > <mailto:eflorac@intellique.com>>
> >
> > Le Tue, 9 Apr 2013 23:10:03 +0800
> > 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>> écrivait:
> >
> > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
> > > > Shutting down filesystem
> >
> > This. I/O error detected. That means that at some point the
> > underlying
> > device (disk, RAID array, SAN volume) couldn't be reached. So this
> > could very well be a case of a flakey drive, array, cable or SCSI
> > driver.
> >
> > What's the storage setup here?
> >
> > --
> > ------------------------------------------------------------------------
> > Emmanuel Florac | Direction technique
> > | Intellique
> > | <eflorac@intellique.com
> > <mailto:eflorac@intellique.com>>
> > | +33 1 78 94 84 02
> > ------------------------------------------------------------------------
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> > _______________________________________________
> > xfs mailing list
> > xfs@oss.sgi.com
> > http://oss.sgi.com/mailman/listinfo/xfs
> >
>
>
--
符永涛
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-11 23:26 ` Brian Foster
2013-04-12 0:45 ` 符永涛
@ 2013-04-12 1:07 ` Eric Sandeen
2013-04-12 1:36 ` 符永涛
2013-04-12 6:15 ` 符永涛
2013-04-12 4:32 ` 符永涛
2 siblings, 2 replies; 60+ messages in thread
From: Eric Sandeen @ 2013-04-12 1:07 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, 符永涛, xfs@oss.sgi.com
On 4/11/13 6:26 PM, Brian Foster wrote:
> On 04/11/2013 03:11 PM, 符永涛 wrote:
>> It happens tonight again on one of our servers, how to debug the root
>> cause? Thank you.
>>
>
> Hi,
>
> I've attached a system tap script (stap -v xfs.stp) that should
> hopefully print out a bit more data should the issue happen again. Do
> you have a small enough number of nodes (or predictable enough pattern)
> that you could run this on the nodes that tend to fail and collect the
> output?
>
> Also, could you collect an xfs_metadump of the filesystem in question
> and make it available for download and analysis somewhere? I believe the
> ideal approach is to mount/umount the filesystem first to replay the log
> before collecting a metadump, but somebody could correct me on that (to
> be safe, you could collect multiple dumps: pre-mount and post-mount).
Dave suggested yesterday that this would be best: metadump right
after unmounting post-failure, then mount/umount & generate another metadump.
-Eric
> Could you also describe your workload a little bit? Thanks.
>
> Brian
>
>> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> xfs_inotobp() returned error 22.
>> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
>> error 22
>> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
>> called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
>> 0xffffffffa02ee20a
>> Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
>> down filesystem
>> Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
>> rectify the problem(s)
>> Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>
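The two-metadump procedure Eric describes can be sketched as a short shell sequence (the device and mount point are taken from earlier in this thread; the output paths are illustrative assumptions):

```shell
# First dump: taken right after the post-failure unmount, before any log
# replay, so the dirty log is preserved in the image.
umount /mnt/xfsd
xfs_metadump -o /dev/sdb /var/tmp/sdb-prereplay.md

# Second dump: mount/unmount once so the kernel replays the log,
# then dump the recovered state.
mount /dev/sdb /mnt/xfsd
umount /mnt/xfsd
xfs_metadump -o /dev/sdb /var/tmp/sdb-postreplay.md
```

xfs_metadump copies only metadata, not file data, so the images are usually small enough to compress and upload. Note that `-o` disables the default obfuscation of file names, which makes analysis easier but does expose names; drop it if that is a concern.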
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 1:07 ` Eric Sandeen
@ 2013-04-12 1:36 ` 符永涛
2013-04-12 1:38 ` 符永涛
2013-04-12 6:15 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 1:36 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
Sorry, I didn't take the metadump right after unmounting post-failure; I
dumped it after a mount/umount cycle. I'll share the metadump with you soon.
2013/4/12 Eric Sandeen <sandeen@sandeen.net>
> On 4/11/13 6:26 PM, Brian Foster wrote:
> > On 04/11/2013 03:11 PM, 符永涛 wrote:
> >> It happens tonight again on one of our servers, how to debug the root
> >> cause? Thank you.
> >>
> >
> > Hi,
> >
> > I've attached a system tap script (stap -v xfs.stp) that should
> > hopefully print out a bit more data should the issue happen again. Do
> > you have a small enough number of nodes (or predictable enough pattern)
> > that you could run this on the nodes that tend to fail and collect the
> > output?
> >
> > Also, could you collect an xfs_metadump of the filesystem in question
> > and make it available for download and analysis somewhere? I believe the
> > ideal approach is to mount/umount the filesystem first to replay the log
> > before collecting a metadump, but somebody could correct me on that (to
> > be safe, you could collect multiple dumps: pre-mount and post-mount).
>
> Dave suggested yesterday that this would be best: metadump right
> after unmounting post-failure, then mount/umount & generate another
> metadump.
>
> -Eric
>
> > Could you also describe your workload a little bit? Thanks.
> >
> > Brian
> >
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> >> xfs_inotobp() returned error 22.
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> >> error 22
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> >> called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> >> 0xffffffffa02ee20a
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
> >> down filesystem
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
> >> rectify the problem(s)
> >> Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >>
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 1:36 ` 符永涛
@ 2013-04-12 1:38 ` 符永涛
0 siblings, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-12 1:38 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
Next time I'll dump it according to your suggestions, thank you.
2013/4/12 符永涛 <yongtaofu@gmail.com>
> Sorry, I didn't take the metadump right after unmounting post-failure; I
> dumped it after a mount/umount cycle. I'll share the metadump with you soon.
>
>
> 2013/4/12 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/11/13 6:26 PM, Brian Foster wrote:
>> > On 04/11/2013 03:11 PM, 符永涛 wrote:
>> >> It happens tonight again on one of our servers, how to debug the root
>> >> cause? Thank you.
>> >>
>> >
>> > Hi,
>> >
>> > I've attached a system tap script (stap -v xfs.stp) that should
>> > hopefully print out a bit more data should the issue happen again. Do
>> > you have a small enough number of nodes (or predictable enough pattern)
>> > that you could run this on the nodes that tend to fail and collect the
>> > output?
>> >
>> > Also, could you collect an xfs_metadump of the filesystem in question
>> > and make it available for download and analysis somewhere? I believe the
>> > ideal approach is to mount/umount the filesystem first to replay the log
>> > before collecting a metadump, but somebody could correct me on that (to
>> > be safe, you could collect multiple dumps: pre-mount and post-mount).
>>
>> Dave suggested yesterday that this would be best: metadump right
>> after unmounting post-failure, then mount/umount & generate another
>> metadump.
>>
>> -Eric
>>
>> > Could you also describe your workload a little bit? Thanks.
>> >
>> > Brian
>> >
>> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> >> xfs_inotobp() returned error 22.
>> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
>> >> error 22
>> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
>> >> called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
>> >> 0xffffffffa02ee20a
>> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
>> >> down filesystem
>> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
>> >> rectify the problem(s)
>> >> Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >> Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >> Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >> Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >>
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-11 23:26 ` Brian Foster
2013-04-12 0:45 ` 符永涛
2013-04-12 1:07 ` Eric Sandeen
@ 2013-04-12 4:32 ` 符永涛
2013-04-12 5:16 ` Eric Sandeen
2013-04-12 5:23 ` 符永涛
2 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-12 4:32 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
Hi Brian,
Sorry, but when I execute the script it says:
WARNING: cannot find module xfs debuginfo: No DWARF information found
semantic error: no match while resolving probe point
module("xfs").function("xfs_iunlink")
uname -a
2.6.32-279.el6.x86_64
The kernel debuginfo package has been installed.
Where can I find the correct xfs debuginfo?
Thank you for your help.
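For what it's worth, on RHEL 6 xfs.ko is built as part of the kernel package, so its DWARF data comes from the kernel-debuginfo package matching the running kernel exactly. A sketch of how one might check this, assuming RHEL 6 yum-utils tooling (the package name below mirrors the kernel version quoted above):

```shell
# Install debuginfo for the exact running kernel; the version-release-arch
# string must match `uname -r` precisely, or stap will not find the DWARF data.
debuginfo-install -y kernel-2.6.32-279.el6.x86_64

# List the probe point without running the full script; if this prints the
# function, stap can resolve module("xfs") debuginfo.
stap -l 'module("xfs").function("xfs_iunlink")'
```

If the `stap -l` check still fails after installing the package, a common cause is a kernel-debuginfo build that does not match the installed kernel (e.g. rebuilt or vendor kernels), in which case the debuginfo must come from the same build as the running kernel.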
2013/4/12 Brian Foster <bfoster@redhat.com>
> On 04/11/2013 03:11 PM, 符永涛 wrote:
> > It happens tonight again on one of our servers, how to debug the root
> > cause? Thank you.
> >
>
> Hi,
>
> I've attached a system tap script (stap -v xfs.stp) that should
> hopefully print out a bit more data should the issue happen again. Do
> you have a small enough number of nodes (or predictable enough pattern)
> that you could run this on the nodes that tend to fail and collect the
> output?
>
> Also, could you collect an xfs_metadump of the filesystem in question
> and make it available for download and analysis somewhere? I believe the
> ideal approach is to mount/umount the filesystem first to replay the log
> before collecting a metadump, but somebody could correct me on that (to
> be safe, you could collect multiple dumps: pre-mount and post-mount).
>
> Could you also describe your workload a little bit? Thanks.
>
> Brian
>
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > xfs_inotobp() returned error 22.
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> > error 22
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> > 0xffffffffa02ee20a
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
> > down filesystem
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
> > rectify the problem(s)
> > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >
> > xfs_repair -n
> >
> >
> > Phase 7 - verify link counts...
> > would have reset inode 20021 nlinks from 0 to 1
> > would have reset inode 20789 nlinks from 0 to 1
> > would have reset inode 35125 nlinks from 0 to 1
> > would have reset inode 35637 nlinks from 0 to 1
> > would have reset inode 36149 nlinks from 0 to 1
> > would have reset inode 38197 nlinks from 0 to 1
> > would have reset inode 39477 nlinks from 0 to 1
> > would have reset inode 54069 nlinks from 0 to 1
> > would have reset inode 62261 nlinks from 0 to 1
> > would have reset inode 63029 nlinks from 0 to 1
> > would have reset inode 72501 nlinks from 0 to 1
> > would have reset inode 79925 nlinks from 0 to 1
> > would have reset inode 81205 nlinks from 0 to 1
> > would have reset inode 84789 nlinks from 0 to 1
> > would have reset inode 87861 nlinks from 0 to 1
> > would have reset inode 90663 nlinks from 0 to 1
> > would have reset inode 91189 nlinks from 0 to 1
> > would have reset inode 95541 nlinks from 0 to 1
> > would have reset inode 98101 nlinks from 0 to 1
> > would have reset inode 101173 nlinks from 0 to 1
> > would have reset inode 113205 nlinks from 0 to 1
> > would have reset inode 114741 nlinks from 0 to 1
> > would have reset inode 126261 nlinks from 0 to 1
> > would have reset inode 140597 nlinks from 0 to 1
> > would have reset inode 144693 nlinks from 0 to 1
> > would have reset inode 147765 nlinks from 0 to 1
> > would have reset inode 152885 nlinks from 0 to 1
> > would have reset inode 161333 nlinks from 0 to 1
> > would have reset inode 161845 nlinks from 0 to 1
> > would have reset inode 167477 nlinks from 0 to 1
> > would have reset inode 172341 nlinks from 0 to 1
> > would have reset inode 191797 nlinks from 0 to 1
> > would have reset inode 204853 nlinks from 0 to 1
> > would have reset inode 205365 nlinks from 0 to 1
> > would have reset inode 215349 nlinks from 0 to 1
> > would have reset inode 215861 nlinks from 0 to 1
> > would have reset inode 216373 nlinks from 0 to 1
> > would have reset inode 217397 nlinks from 0 to 1
> > would have reset inode 224309 nlinks from 0 to 1
> > would have reset inode 225589 nlinks from 0 to 1
> > would have reset inode 234549 nlinks from 0 to 1
> > would have reset inode 234805 nlinks from 0 to 1
> > would have reset inode 249653 nlinks from 0 to 1
> > would have reset inode 250677 nlinks from 0 to 1
> > would have reset inode 252469 nlinks from 0 to 1
> > would have reset inode 261429 nlinks from 0 to 1
> > would have reset inode 265013 nlinks from 0 to 1
> > would have reset inode 266805 nlinks from 0 to 1
> > would have reset inode 267317 nlinks from 0 to 1
> > would have reset inode 268853 nlinks from 0 to 1
> > would have reset inode 272437 nlinks from 0 to 1
> > would have reset inode 273205 nlinks from 0 to 1
> > would have reset inode 274229 nlinks from 0 to 1
> > would have reset inode 278325 nlinks from 0 to 1
> > would have reset inode 278837 nlinks from 0 to 1
> > would have reset inode 281397 nlinks from 0 to 1
> > would have reset inode 292661 nlinks from 0 to 1
> > would have reset inode 300853 nlinks from 0 to 1
> > would have reset inode 302901 nlinks from 0 to 1
> > would have reset inode 305205 nlinks from 0 to 1
> > would have reset inode 314165 nlinks from 0 to 1
> > would have reset inode 315189 nlinks from 0 to 1
> > would have reset inode 320309 nlinks from 0 to 1
> > would have reset inode 324917 nlinks from 0 to 1
> > would have reset inode 328245 nlinks from 0 to 1
> > would have reset inode 335925 nlinks from 0 to 1
> > would have reset inode 339253 nlinks from 0 to 1
> > would have reset inode 339765 nlinks from 0 to 1
> > would have reset inode 348213 nlinks from 0 to 1
> > would have reset inode 360501 nlinks from 0 to 1
> > would have reset inode 362037 nlinks from 0 to 1
> > would have reset inode 366389 nlinks from 0 to 1
> > would have reset inode 385845 nlinks from 0 to 1
> > would have reset inode 390709 nlinks from 0 to 1
> > would have reset inode 409141 nlinks from 0 to 1
> > would have reset inode 413237 nlinks from 0 to 1
> > would have reset inode 414773 nlinks from 0 to 1
> > would have reset inode 417845 nlinks from 0 to 1
> > would have reset inode 436021 nlinks from 0 to 1
> > would have reset inode 439349 nlinks from 0 to 1
> > would have reset inode 447029 nlinks from 0 to 1
> > would have reset inode 491317 nlinks from 0 to 1
> > would have reset inode 494133 nlinks from 0 to 1
> > would have reset inode 495413 nlinks from 0 to 1
> > would have reset inode 501301 nlinks from 0 to 1
> > would have reset inode 506421 nlinks from 0 to 1
> > would have reset inode 508469 nlinks from 0 to 1
> > would have reset inode 508981 nlinks from 0 to 1
> > would have reset inode 511797 nlinks from 0 to 1
> > would have reset inode 513077 nlinks from 0 to 1
> > would have reset inode 517941 nlinks from 0 to 1
> > would have reset inode 521013 nlinks from 0 to 1
> > would have reset inode 522805 nlinks from 0 to 1
> > would have reset inode 523317 nlinks from 0 to 1
> > would have reset inode 525621 nlinks from 0 to 1
> > would have reset inode 527925 nlinks from 0 to 1
> > would have reset inode 535605 nlinks from 0 to 1
> > would have reset inode 541749 nlinks from 0 to 1
> > would have reset inode 573493 nlinks from 0 to 1
> > would have reset inode 578613 nlinks from 0 to 1
> > would have reset inode 583029 nlinks from 0 to 1
> > would have reset inode 585525 nlinks from 0 to 1
> > would have reset inode 586293 nlinks from 0 to 1
> > would have reset inode 586805 nlinks from 0 to 1
> > would have reset inode 591413 nlinks from 0 to 1
> > would have reset inode 594485 nlinks from 0 to 1
> > would have reset inode 596277 nlinks from 0 to 1
> > would have reset inode 603189 nlinks from 0 to 1
> > would have reset inode 613429 nlinks from 0 to 1
> > would have reset inode 617781 nlinks from 0 to 1
> > would have reset inode 621877 nlinks from 0 to 1
> > would have reset inode 623925 nlinks from 0 to 1
> > would have reset inode 625205 nlinks from 0 to 1
> > would have reset inode 626741 nlinks from 0 to 1
> > would have reset inode 639541 nlinks from 0 to 1
> > would have reset inode 640053 nlinks from 0 to 1
> > would have reset inode 640565 nlinks from 0 to 1
> > would have reset inode 645173 nlinks from 0 to 1
> > would have reset inode 652853 nlinks from 0 to 1
> > would have reset inode 656181 nlinks from 0 to 1
> > would have reset inode 659253 nlinks from 0 to 1
> > would have reset inode 663605 nlinks from 0 to 1
> > would have reset inode 667445 nlinks from 0 to 1
> > would have reset inode 680757 nlinks from 0 to 1
> > would have reset inode 691253 nlinks from 0 to 1
> > would have reset inode 691765 nlinks from 0 to 1
> > would have reset inode 697653 nlinks from 0 to 1
> > would have reset inode 700469 nlinks from 0 to 1
> > would have reset inode 707893 nlinks from 0 to 1
> > would have reset inode 716853 nlinks from 0 to 1
> > would have reset inode 722229 nlinks from 0 to 1
> > would have reset inode 722741 nlinks from 0 to 1
> > would have reset inode 723765 nlinks from 0 to 1
> > would have reset inode 731957 nlinks from 0 to 1
> > would have reset inode 742965 nlinks from 0 to 1
> > would have reset inode 743477 nlinks from 0 to 1
> > would have reset inode 745781 nlinks from 0 to 1
> > would have reset inode 746293 nlinks from 0 to 1
> > would have reset inode 774453 nlinks from 0 to 1
> > would have reset inode 778805 nlinks from 0 to 1
> > would have reset inode 785013 nlinks from 0 to 1
> > would have reset inode 785973 nlinks from 0 to 1
> > would have reset inode 791349 nlinks from 0 to 1
> > would have reset inode 796981 nlinks from 0 to 1
> > would have reset inode 803381 nlinks from 0 to 1
> > would have reset inode 806965 nlinks from 0 to 1
> > would have reset inode 811798 nlinks from 0 to 1
> > would have reset inode 812310 nlinks from 0 to 1
> > would have reset inode 813078 nlinks from 0 to 1
> > would have reset inode 813607 nlinks from 0 to 1
> > would have reset inode 814183 nlinks from 0 to 1
> > would have reset inode 822069 nlinks from 0 to 1
> > would have reset inode 828469 nlinks from 0 to 1
> > would have reset inode 830005 nlinks from 0 to 1
> > would have reset inode 832053 nlinks from 0 to 1
> > would have reset inode 832565 nlinks from 0 to 1
> > would have reset inode 836661 nlinks from 0 to 1
> > would have reset inode 841013 nlinks from 0 to 1
> > would have reset inode 841525 nlinks from 0 to 1
> > would have reset inode 845365 nlinks from 0 to 1
> > would have reset inode 846133 nlinks from 0 to 1
> > would have reset inode 847157 nlinks from 0 to 1
> > would have reset inode 852533 nlinks from 0 to 1
> > would have reset inode 857141 nlinks from 0 to 1
> > would have reset inode 863271 nlinks from 0 to 1
> > would have reset inode 866855 nlinks from 0 to 1
> > would have reset inode 887861 nlinks from 0 to 1
> > would have reset inode 891701 nlinks from 0 to 1
> > would have reset inode 894773 nlinks from 0 to 1
> > would have reset inode 900149 nlinks from 0 to 1
> > would have reset inode 902197 nlinks from 0 to 1
> > would have reset inode 906293 nlinks from 0 to 1
> > would have reset inode 906805 nlinks from 0 to 1
> > would have reset inode 909877 nlinks from 0 to 1
> > would have reset inode 925493 nlinks from 0 to 1
> > would have reset inode 949543 nlinks from 0 to 1
> > would have reset inode 955175 nlinks from 0 to 1
> > would have reset inode 963623 nlinks from 0 to 1
> > would have reset inode 967733 nlinks from 0 to 1
> > would have reset inode 968231 nlinks from 0 to 1
> > would have reset inode 982069 nlinks from 0 to 1
> > would have reset inode 1007413 nlinks from 0 to 1
> > would have reset inode 1011509 nlinks from 0 to 1
> > would have reset inode 1014069 nlinks from 0 to 1
> > would have reset inode 1014581 nlinks from 0 to 1
> > would have reset inode 1022005 nlinks from 0 to 1
> > would have reset inode 1022517 nlinks from 0 to 1
> > would have reset inode 1023029 nlinks from 0 to 1
> > would have reset inode 1025333 nlinks from 0 to 1
> > would have reset inode 1043765 nlinks from 0 to 1
> > would have reset inode 1044789 nlinks from 0 to 1
> > would have reset inode 1049397 nlinks from 0 to 1
> > would have reset inode 1050933 nlinks from 0 to 1
> > would have reset inode 1051445 nlinks from 0 to 1
> > would have reset inode 1054261 nlinks from 0 to 1
> > would have reset inode 1060917 nlinks from 0 to 1
> > would have reset inode 1063477 nlinks from 0 to 1
> > would have reset inode 1076021 nlinks from 0 to 1
> > would have reset inode 1081141 nlinks from 0 to 1
> > would have reset inode 1086261 nlinks from 0 to 1
> > would have reset inode 1097269 nlinks from 0 to 1
> > would have reset inode 1099829 nlinks from 0 to 1
> > would have reset inode 1100853 nlinks from 0 to 1
> > would have reset inode 1101877 nlinks from 0 to 1
> > would have reset inode 1126709 nlinks from 0 to 1
> > would have reset inode 1134389 nlinks from 0 to 1
> > would have reset inode 1141045 nlinks from 0 to 1
> > would have reset inode 1141557 nlinks from 0 to 1
> > would have reset inode 1142581 nlinks from 0 to 1
> > would have reset inode 1148469 nlinks from 0 to 1
> > would have reset inode 1153333 nlinks from 0 to 1
> > would have reset inode 1181749 nlinks from 0 to 1
> > would have reset inode 1192245 nlinks from 0 to 1
> > would have reset inode 1198133 nlinks from 0 to 1
> > would have reset inode 1203765 nlinks from 0 to 1
> > would have reset inode 1221429 nlinks from 0 to 1
> > would have reset inode 1223989 nlinks from 0 to 1
> > would have reset inode 1235509 nlinks from 0 to 1
> > would have reset inode 1239349 nlinks from 0 to 1
> > would have reset inode 1240885 nlinks from 0 to 1
> > would have reset inode 1241397 nlinks from 0 to 1
> > would have reset inode 1241909 nlinks from 0 to 1
> > would have reset inode 1242421 nlinks from 0 to 1
> > would have reset inode 1244981 nlinks from 0 to 1
> > would have reset inode 1246517 nlinks from 0 to 1
> > would have reset inode 1253429 nlinks from 0 to 1
> > would have reset inode 1271861 nlinks from 0 to 1
> > would have reset inode 1274677 nlinks from 0 to 1
> > would have reset inode 1277749 nlinks from 0 to 1
> > would have reset inode 1278773 nlinks from 0 to 1
> > would have reset inode 1286709 nlinks from 0 to 1
> > would have reset inode 1288245 nlinks from 0 to 1
> > would have reset inode 1299765 nlinks from 0 to 1
> > would have reset inode 1302325 nlinks from 0 to 1
> > would have reset inode 1304885 nlinks from 0 to 1
> > would have reset inode 1305397 nlinks from 0 to 1
> > would have reset inode 1307509 nlinks from 0 to 1
> > would have reset inode 1309493 nlinks from 0 to 1
> > would have reset inode 1310517 nlinks from 0 to 1
> > would have reset inode 1311029 nlinks from 0 to 1
> > would have reset inode 1312053 nlinks from 0 to 1
> > would have reset inode 1316917 nlinks from 0 to 1
> > would have reset inode 1317941 nlinks from 0 to 1
> > would have reset inode 1320821 nlinks from 0 to 1
> > would have reset inode 1322805 nlinks from 0 to 1
> > would have reset inode 1332789 nlinks from 0 to 1
> > would have reset inode 1336373 nlinks from 0 to 1
> > would have reset inode 1345653 nlinks from 0 to 1
> > would have reset inode 1354549 nlinks from 0 to 1
> > would have reset inode 1361973 nlinks from 0 to 1
> > would have reset inode 1369909 nlinks from 0 to 1
> > would have reset inode 1372981 nlinks from 0 to 1
> > would have reset inode 1388853 nlinks from 0 to 1
> > would have reset inode 1402933 nlinks from 0 to 1
> > would have reset inode 1403445 nlinks from 0 to 1
> > would have reset inode 1420085 nlinks from 0 to 1
> > would have reset inode 1452853 nlinks from 0 to 1
> > would have reset inode 1456437 nlinks from 0 to 1
> > would have reset inode 1457973 nlinks from 0 to 1
> > would have reset inode 1459253 nlinks from 0 to 1
> > would have reset inode 1467957 nlinks from 0 to 1
> > would have reset inode 1471541 nlinks from 0 to 1
> > would have reset inode 1476661 nlinks from 0 to 1
> > would have reset inode 1479733 nlinks from 0 to 1
> > would have reset inode 1483061 nlinks from 0 to 1
> > would have reset inode 1484085 nlinks from 0 to 1
> > would have reset inode 1486133 nlinks from 0 to 1
> > would have reset inode 1489461 nlinks from 0 to 1
> > would have reset inode 1490037 nlinks from 0 to 1
> > would have reset inode 1492021 nlinks from 0 to 1
> > would have reset inode 1493557 nlinks from 0 to 1
> > would have reset inode 1494069 nlinks from 0 to 1
> > would have reset inode 1496885 nlinks from 0 to 1
> > would have reset inode 1498421 nlinks from 0 to 1
> > would have reset inode 1498933 nlinks from 0 to 1
> > would have reset inode 1499957 nlinks from 0 to 1
> > would have reset inode 1506101 nlinks from 0 to 1
> > would have reset inode 1507637 nlinks from 0 to 1
> > would have reset inode 1510453 nlinks from 0 to 1
> > would have reset inode 1514293 nlinks from 0 to 1
> > would have reset inode 1517365 nlinks from 0 to 1
> > would have reset inode 1520693 nlinks from 0 to 1
> > would have reset inode 1521973 nlinks from 0 to 1
> > would have reset inode 1530421 nlinks from 0 to 1
> > would have reset inode 1530933 nlinks from 0 to 1
> > would have reset inode 1537333 nlinks from 0 to 1
> > would have reset inode 1538357 nlinks from 0 to 1
> > would have reset inode 1548853 nlinks from 0 to 1
> > would have reset inode 1553973 nlinks from 0 to 1
> > would have reset inode 1557301 nlinks from 0 to 1
> > would have reset inode 1564213 nlinks from 0 to 1
> > would have reset inode 1564725 nlinks from 0 to 1
> > would have reset inode 1576501 nlinks from 0 to 1
> > would have reset inode 1580597 nlinks from 0 to 1
> > would have reset inode 1584693 nlinks from 0 to 1
> > would have reset inode 1586485 nlinks from 0 to 1
> > would have reset inode 1589301 nlinks from 0 to 1
> > would have reset inode 1589813 nlinks from 0 to 1
> > would have reset inode 1592629 nlinks from 0 to 1
> > would have reset inode 1595701 nlinks from 0 to 1
> > would have reset inode 1601077 nlinks from 0 to 1
> > would have reset inode 1623861 nlinks from 0 to 1
> > would have reset inode 1626677 nlinks from 0 to 1
> > would have reset inode 1627701 nlinks from 0 to 1
> > would have reset inode 1633333 nlinks from 0 to 1
> > would have reset inode 1639221 nlinks from 0 to 1
> > would have reset inode 1649205 nlinks from 0 to 1
> > would have reset inode 1686325 nlinks from 0 to 1
> > would have reset inode 1690677 nlinks from 0 to 1
> > would have reset inode 1693749 nlinks from 0 to 1
> > would have reset inode 1704757 nlinks from 0 to 1
> > would have reset inode 1707061 nlinks from 0 to 1
> > would have reset inode 1709109 nlinks from 0 to 1
> > would have reset inode 1719349 nlinks from 0 to 1
> > would have reset inode 1737013 nlinks from 0 to 1
> > would have reset inode 1741365 nlinks from 0 to 1
> > would have reset inode 1747509 nlinks from 0 to 1
> > would have reset inode 1770805 nlinks from 0 to 1
> > would have reset inode 1780789 nlinks from 0 to 1
> > would have reset inode 1793589 nlinks from 0 to 1
> > would have reset inode 1795125 nlinks from 0 to 1
> > would have reset inode 1800757 nlinks from 0 to 1
> > would have reset inode 1801269 nlinks from 0 to 1
> > would have reset inode 1802549 nlinks from 0 to 1
> > would have reset inode 1804085 nlinks from 0 to 1
> > would have reset inode 1817141 nlinks from 0 to 1
> > would have reset inode 1821749 nlinks from 0 to 1
> > would have reset inode 1832757 nlinks from 0 to 1
> > would have reset inode 1836341 nlinks from 0 to 1
> > would have reset inode 1856309 nlinks from 0 to 1
> > would have reset inode 1900597 nlinks from 0 to 1
> > would have reset inode 1902901 nlinks from 0 to 1
> > would have reset inode 1912373 nlinks from 0 to 1
> > would have reset inode 1943093 nlinks from 0 to 1
> > would have reset inode 1944373 nlinks from 0 to 1
> > would have reset inode 1954101 nlinks from 0 to 1
> > would have reset inode 1955893 nlinks from 0 to 1
> > would have reset inode 1961781 nlinks from 0 to 1
> > would have reset inode 1974325 nlinks from 0 to 1
> > would have reset inode 1978677 nlinks from 0 to 1
> > would have reset inode 1981237 nlinks from 0 to 1
> > would have reset inode 1992245 nlinks from 0 to 1
> > would have reset inode 2000949 nlinks from 0 to 1
> > would have reset inode 2002229 nlinks from 0 to 1
> > would have reset inode 2004789 nlinks from 0 to 1
> > would have reset inode 2005301 nlinks from 0 to 1
> > would have reset inode 2011189 nlinks from 0 to 1
> > would have reset inode 2012981 nlinks from 0 to 1
> > would have reset inode 2015285 nlinks from 0 to 1
> > would have reset inode 2018869 nlinks from 0 to 1
> > would have reset inode 2028341 nlinks from 0 to 1
> > would have reset inode 2028853 nlinks from 0 to 1
> > would have reset inode 2030901 nlinks from 0 to 1
> > would have reset inode 2032181 nlinks from 0 to 1
> > would have reset inode 2032693 nlinks from 0 to 1
> > would have reset inode 2040117 nlinks from 0 to 1
> > would have reset inode 2053685 nlinks from 0 to 1
> > would have reset inode 2083893 nlinks from 0 to 1
> > would have reset inode 2087221 nlinks from 0 to 1
> > would have reset inode 2095925 nlinks from 0 to 1
> > would have reset inode 2098741 nlinks from 0 to 1
> > would have reset inode 2100533 nlinks from 0 to 1
> > would have reset inode 2101301 nlinks from 0 to 1
> > would have reset inode 2123573 nlinks from 0 to 1
> > would have reset inode 2132789 nlinks from 0 to 1
> > would have reset inode 2133813 nlinks from 0 to 1
> >
> >
> >
> >
> >
> > 2013/4/10 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
> >
> > The storage info is as following:
> > RAID-6
> > SATA HDD
> > Controller: PERC H710P Mini (Embedded)
> > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
> > 255 heads, 63 sectors/track, 3647334 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Sector size (logical/physical): 512 bytes / 512 bytes
> > I/O size (minimum/optimal): 512 bytes / 512 bytes
> > Disk identifier: 0x00000000
> >
> > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
> > sd 0:2:1:0: [sdb] Write Protect is off
> > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
> > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > sd 0:2:1:0: [sdb] Attached SCSI disk
> >
> > *-storage
> > description: RAID bus controller
> > product: MegaRAID SAS 2208 [Thunderbolt]
> > vendor: LSI Logic / Symbios Logic
> > physical id: 0
> > bus info: pci@0000:02:00.0
> > logical name: scsi0
> > version: 01
> > width: 64 bits
> > clock: 33MHz
> > capabilities: storage pm pciexpress vpd msi msix bus_master
> > cap_list rom
> > configuration: driver=megaraid_sas latency=0
> > resources: irq:42 ioport:fc00(size=256)
> > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
> > memory:dc800000-dc81ffff(prefetchable)
> > *-disk:0
> > description: SCSI Disk
> > product: PERC H710P
> > vendor: DELL
> > physical id: 2.0.0
> > bus info: scsi@0:2.0.0
> > logical name: /dev/sda
> > version: 3.13
> > serial: 0049d6ce1d9f2035180096fde490f648
> > size: 558GiB (599GB)
> > capabilities: partitioned partitioned:dos
> > configuration: ansiversion=5 signature=000aa336
> > *-disk:1
> > description: SCSI Disk
> > product: PERC H710P
> > vendor: DELL
> > physical id: 2.1.0
> > bus info: scsi@0:2.1.0
> > logical name: /dev/sdb
> > logical name: /mnt/xfsd
> > version: 3.13
> > serial: 003366f71da22035180096fde490f648
> > size: 27TiB (30TB)
> > configuration: ansiversion=5 mount.fstype=xfs
> > mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
> > state=mounted
> >
> > Thank you.
> >
> >
> > 2013/4/10 Emmanuel Florac <eflorac@intellique.com>
> >
> > Le Tue, 9 Apr 2013 23:10:03 +0800
> > 符永涛 <yongtaofu@gmail.com> wrote:
> >
> > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
> > > > Shutting down filesystem
> >
> > This. I/O error detected. That means that at some point the underlying
> > device (disk, RAID array, SAN volume) couldn't be reached. So this
> > could very well be a case of a flakey drive, array, cable or SCSI
> > driver.
> >
> > What's the storage setup here?
> >
> > --
> > ------------------------------------------------------------------------
> > Emmanuel Florac | Direction technique
> >                 | Intellique
> >                 | <eflorac@intellique.com>
> >                 | +33 1 78 94 84 02
> > ------------------------------------------------------------------------
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> > _______________________________________________
> > xfs mailing list
> > xfs@oss.sgi.com
> > http://oss.sgi.com/mailman/listinfo/xfs
> >
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 4:32 ` 符永涛
@ 2013-04-12 5:16 ` Eric Sandeen
2013-04-12 5:40 ` 符永涛
2013-04-12 5:23 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-12 5:16 UTC (permalink / raw)
To: 符永涛; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
On 4/11/13 11:32 PM, 符永涛 wrote:
> Hi Brian,
> Sorry but when I execute the script it says:
> WARNING: cannot find module xfs debuginfo: No DWARF information found
> semantic error: no match while resolving probe point module("xfs").function("xfs_iunlink")
>
> uname -a
> 2.6.32-279.el6.x86_64
> kernel debuginfo has been installed.
>
> Where can I find the correct xfs debuginfo?
It should be in the kernel-debuginfo RPM (of the same version/release as the kernel RPM you're running).
You should have:
/usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
If not, can you show:
# uname -a
# rpm -q kernel
# rpm -q kernel-debuginfo
-Eric
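[A quick shell check along these lines can confirm whether the xfs DWARF data is present for the running kernel; a minimal sketch, and the path follows the RHEL 6 convention Eric quotes, so other distros will differ:]

```shell
# Check for the xfs debuginfo object matching the running kernel.
# Path layout is the RHEL 6 convention from Eric's mail above.
kver=$(uname -r)
dbg="/usr/lib/debug/lib/modules/${kver}/kernel/fs/xfs/xfs.ko.debug"
echo "expecting: ${dbg}"
if [ -e "${dbg}" ]; then
    echo "xfs debuginfo present"
else
    echo "xfs debuginfo missing: install kernel-debuginfo matching ${kver}"
fi
```

[On RHEL/CentOS the matching package can usually be fetched with `debuginfo-install kernel` from yum-utils, but check your repos; the exact package name is an assumption.]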
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 4:32 ` 符永涛
2013-04-12 5:16 ` Eric Sandeen
@ 2013-04-12 5:23 ` 符永涛
1 sibling, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-12 5:23 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
sudo stap -L 'kernel.trace("*")'|grep xfs_iunlink
sudo stap -L 'kernel.trace("*")'|grep xfs_ifree
sudo stap -L 'kernel.trace("*")'|grep xfs
kernel.trace("xfs_agf") $mp:struct xfs_mount* $agf:struct xfs_agf*
$flags:int $caller_ip:long unsigned int
kernel.trace("xfs_alloc_busy") $mp:struct xfs_mount* $agno:xfs_agnumber_t
$agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_alloc_busy_clear") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_alloc_busy_enomem") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_alloc_busy_force") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_alloc_busy_reuse") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_alloc_busy_trim") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
$tbno:xfs_agblock_t $tlen:xfs_extlen_t
kernel.trace("xfs_alloc_exact_done") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_exact_error") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_exact_notfound") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_file_space") $ip:struct xfs_inode*
kernel.trace("xfs_alloc_near_busy") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_near_error") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_near_first") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_near_greater") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_near_lesser") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_near_noentry") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_near_nominleft") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_size_busy") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_size_done") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_size_error") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_size_neither") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_size_noentry") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_size_nominleft") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_small_done") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_small_error") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_small_freelist") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_small_notenough") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_vextent_allfailed") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_vextent_badargs") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_vextent_loopfailed") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_vextent_noagbp") $args:struct xfs_alloc_arg*
kernel.trace("xfs_alloc_vextent_nofix") $args:struct xfs_alloc_arg*
kernel.trace("xfs_attr_list_add") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_full") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_leaf") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_leaf_end") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_node_descend") $ctx:struct
xfs_attr_list_context* $btree:struct xfs_da_node_entry*
kernel.trace("xfs_attr_list_notfound") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_sf") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_sf_all") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_attr_list_wrong_blk") $ctx:struct xfs_attr_list_context*
kernel.trace("xfs_bdstrat_shut") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_bmap_post_update") $ip:struct xfs_inode*
$idx:xfs_extnum_t $state:int $caller_ip:long unsigned int
kernel.trace("xfs_bmap_pre_update") $ip:struct xfs_inode* $idx:xfs_extnum_t
$state:int $caller_ip:long unsigned int
kernel.trace("xfs_btree_corrupt") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_bawrite") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_bdwrite") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_cond_lock") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_delwri_dequeue") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_delwri_queue") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_delwri_split") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_error_relse") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_find") $bp:struct xfs_buf* $flags:unsigned int
$caller_ip:long unsigned int
kernel.trace("xfs_buf_free") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_get") $bp:struct xfs_buf* $flags:unsigned int
$caller_ip:long unsigned int
kernel.trace("xfs_buf_get_uncached") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_hold") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_init") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_iodone") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_ioerror") $bp:struct xfs_buf* $error:int
$caller_ip:long unsigned int
kernel.trace("xfs_buf_iorequest") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_iowait") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_iowait_done") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_item_committed") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_format") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_format_stale") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_iodone") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_item_iodone_async") $bp:struct xfs_buf*
$caller_ip:long unsigned int
kernel.trace("xfs_buf_item_pin") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_push") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_pushbuf") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_relse") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_item_size") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_size_stale") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_trylock") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_unlock") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_unlock_stale") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_unpin") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_item_unpin_stale") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_buf_lock") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_lock_done") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_buf_read") $bp:struct xfs_buf* $flags:unsigned int
$caller_ip:long unsigned int
kernel.trace("xfs_buf_rele") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_buf_unlock") $bp:struct xfs_buf* $caller_ip:long unsigned
int
kernel.trace("xfs_bunmap") $ip:struct xfs_inode* $bno:xfs_fileoff_t
$len:xfs_filblks_t $flags:int $caller_ip:long unsigned int
kernel.trace("xfs_check_acl") $ip:struct xfs_inode*
kernel.trace("xfs_clear_inode") $ip:struct xfs_inode*
kernel.trace("xfs_create") $dp:struct xfs_inode* $xfs_create:struct
xfs_name*
kernel.trace("xfs_da_btree_corrupt") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_delalloc_enospc") $ip:struct xfs_inode* $offset:xfs_off_t
$count:ssize_t
kernel.trace("xfs_destroy_inode") $ip:struct xfs_inode*
kernel.trace("xfs_dir2_block_addname") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_block_lookup") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_block_removename") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_block_replace") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_block_to_leaf") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_block_to_sf") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_grow_inode") $args:struct xfs_da_args* $idx:int
kernel.trace("xfs_dir2_leaf_addname") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_leaf_lookup") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_leaf_removename") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_leaf_replace") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_leaf_to_block") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_leaf_to_node") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_leafn_add") $args:struct xfs_da_args* $idx:int
kernel.trace("xfs_dir2_leafn_moveents") $args:struct xfs_da_args*
$src_idx:int $dst_idx:int $count:int
kernel.trace("xfs_dir2_leafn_remove") $args:struct xfs_da_args* $idx:int
kernel.trace("xfs_dir2_node_addname") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_node_lookup") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_node_removename") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_node_replace") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_node_to_leaf") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_addname") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_create") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_lookup") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_removename") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_replace") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_to_block") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_toino4") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_sf_toino8") $args:struct xfs_da_args*
kernel.trace("xfs_dir2_shrink_inode") $args:struct xfs_da_args* $idx:int
kernel.trace("xfs_discard_busy") $mp:struct xfs_mount* $agno:xfs_agnumber_t
$agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_discard_exclude") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_discard_extent") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_discard_toosmall") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $agbno:xfs_agblock_t $len:xfs_extlen_t
kernel.trace("xfs_dqadjust") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqalloc") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqattach_found") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqattach_get") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqflush") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqflush_done") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqflush_force") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqget_hit") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqget_miss") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqinit") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqlookup_done") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqlookup_found") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqlookup_freelist") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqlookup_want") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqput") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqput_free") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqput_wait") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqread") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqread_fail") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqreclaim_dirty") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqreclaim_unlink") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqreclaim_want") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqrele") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqreuse") $dqp:struct xfs_dquot*
kernel.trace("xfs_dqtobp_read") $dqp:struct xfs_dquot*
kernel.trace("xfs_dquot_dqalloc") $ip:struct xfs_inode*
kernel.trace("xfs_dquot_dqdetach") $ip:struct xfs_inode*
kernel.trace("xfs_extlist") $ip:struct xfs_inode* $idx:xfs_extnum_t
$state:int $caller_ip:long unsigned int
kernel.trace("xfs_file_buffered_write") $ip:struct xfs_inode* $count:size_t
$offset:loff_t $flags:int
kernel.trace("xfs_file_compat_ioctl") $ip:struct xfs_inode*
kernel.trace("xfs_file_direct_write") $ip:struct xfs_inode* $count:size_t
$offset:loff_t $flags:int
kernel.trace("xfs_file_fsync") $ip:struct xfs_inode*
kernel.trace("xfs_file_ioctl") $ip:struct xfs_inode*
kernel.trace("xfs_file_read") $ip:struct xfs_inode* $count:size_t
$offset:loff_t $flags:int
kernel.trace("xfs_file_splice_read") $ip:struct xfs_inode* $count:size_t
$offset:loff_t $flags:int
kernel.trace("xfs_file_splice_write") $ip:struct xfs_inode* $count:size_t
$offset:loff_t $flags:int
kernel.trace("xfs_free_extent") $mp:struct xfs_mount* $agno:xfs_agnumber_t
$agbno:xfs_agblock_t $len:xfs_extlen_t $isfl:bool $haveleft:int
$haveright:int
kernel.trace("xfs_free_file_space") $ip:struct xfs_inode*
kernel.trace("xfs_get_blocks_alloc") $ip:struct xfs_inode*
$offset:xfs_off_t $count:ssize_t $type:int $irec:struct xfs_bmbt_irec*
kernel.trace("xfs_get_blocks_found") $ip:struct xfs_inode*
$offset:xfs_off_t $count:ssize_t $type:int $irec:struct xfs_bmbt_irec*
kernel.trace("xfs_get_blocks_notfound") $ip:struct xfs_inode*
$offset:xfs_off_t $count:ssize_t
kernel.trace("xfs_getattr") $ip:struct xfs_inode*
kernel.trace("xfs_iext_insert") $ip:struct xfs_inode* $idx:xfs_extnum_t
$r:struct xfs_bmbt_irec* $state:int $caller_ip:long unsigned int
kernel.trace("xfs_iext_remove") $ip:struct xfs_inode* $idx:xfs_extnum_t
$state:int $caller_ip:long unsigned int
kernel.trace("xfs_iget_hit") $ip:struct xfs_inode*
kernel.trace("xfs_iget_miss") $ip:struct xfs_inode*
kernel.trace("xfs_iget_reclaim") $ip:struct xfs_inode*
kernel.trace("xfs_iget_reclaim_fail") $ip:struct xfs_inode*
kernel.trace("xfs_iget_skip") $ip:struct xfs_inode*
kernel.trace("xfs_ihold") $ip:struct xfs_inode* $caller_ip:long unsigned int
kernel.trace("xfs_ilock") $ip:struct xfs_inode* $lock_flags:unsigned int
$caller_ip:long unsigned int
kernel.trace("xfs_ilock_demote") $ip:struct xfs_inode* $lock_flags:unsigned
int $caller_ip:long unsigned int
kernel.trace("xfs_ilock_nowait") $ip:struct xfs_inode* $lock_flags:unsigned
int $caller_ip:long unsigned int
kernel.trace("xfs_inode_item_push") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_inode_pin") $ip:struct xfs_inode* $caller_ip:long
unsigned int
kernel.trace("xfs_inode_unpin") $ip:struct xfs_inode* $caller_ip:long
unsigned int
kernel.trace("xfs_inode_unpin_nowait") $ip:struct xfs_inode*
$caller_ip:long unsigned int
kernel.trace("xfs_invalidatepage") $inode:struct inode* $page:struct page*
$off:long unsigned int
kernel.trace("xfs_ioctl_setattr") $ip:struct xfs_inode*
kernel.trace("xfs_irele") $ip:struct xfs_inode* $caller_ip:long unsigned int
kernel.trace("xfs_itruncate_finish_end") $ip:struct xfs_inode*
$new_size:xfs_fsize_t
kernel.trace("xfs_itruncate_finish_start") $ip:struct xfs_inode*
$new_size:xfs_fsize_t
kernel.trace("xfs_itruncate_start") $ip:struct xfs_inode*
$new_size:xfs_fsize_t $flag:int $toss_start:xfs_off_t $toss_finish:xfs_off_t
kernel.trace("xfs_iunlock") $ip:struct xfs_inode* $lock_flags:unsigned int
$caller_ip:long unsigned int
kernel.trace("xfs_link") $dp:struct xfs_inode* $xfs_link:struct xfs_name*
kernel.trace("xfs_log_done_nonperm") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_done_perm") $log:struct log* $tic:struct xlog_ticket*
kernel.trace("xfs_log_grant_enter") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_grant_error") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_grant_exit") $log:struct log* $tic:struct xlog_ticket*
kernel.trace("xfs_log_grant_sleep1") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_grant_sleep2") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_grant_wake1") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_grant_wake2") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_grant_wake_up") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_recover_buf_cancel") $log:struct log* $buf_f:struct
xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_cancel_add") $log:struct log*
$buf_f:struct xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_cancel_ref_inc") $log:struct log*
$buf_f:struct xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_dquot_buf") $log:struct log*
$buf_f:struct xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_inode_buf") $log:struct log*
$buf_f:struct xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_not_cancel") $log:struct log*
$buf_f:struct xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_recover") $log:struct log* $buf_f:struct
xfs_buf_log_format*
kernel.trace("xfs_log_recover_buf_reg_buf") $log:struct log* $buf_f:struct
xfs_buf_log_format*
kernel.trace("xfs_log_recover_inode_cancel") $log:struct log* $in_f:struct
xfs_inode_log_format*
kernel.trace("xfs_log_recover_inode_recover") $log:struct log* $in_f:struct
xfs_inode_log_format*
kernel.trace("xfs_log_recover_inode_skip") $log:struct log* $in_f:struct
xfs_inode_log_format*
kernel.trace("xfs_log_recover_item_add") $log:struct log* $trans:struct
xlog_recover* $item:struct xlog_recover_item* $pass:int
kernel.trace("xfs_log_recover_item_add_cont") $log:struct log*
$trans:struct xlog_recover* $item:struct xlog_recover_item* $pass:int
kernel.trace("xfs_log_recover_item_recover") $log:struct log* $trans:struct
xlog_recover* $item:struct xlog_recover_item* $pass:int
kernel.trace("xfs_log_recover_item_reorder_head") $log:struct log*
$trans:struct xlog_recover* $item:struct xlog_recover_item* $pass:int
kernel.trace("xfs_log_recover_item_reorder_tail") $log:struct log*
$trans:struct xlog_recover* $item:struct xlog_recover_item* $pass:int
kernel.trace("xfs_log_regrant_reserve_enter") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_reserve_exit") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_reserve_sub") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_enter") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_error") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_exit") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_sleep1") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_sleep2") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_wake1") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_wake2") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_regrant_write_wake_up") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_reserve") $log:struct log* $tic:struct xlog_ticket*
kernel.trace("xfs_log_umount_write") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_ungrant_enter") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_ungrant_exit") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_log_ungrant_sub") $log:struct log* $tic:struct
xlog_ticket*
kernel.trace("xfs_lookup") $dp:struct xfs_inode* $xfs_lookup:struct
xfs_name*
kernel.trace("xfs_map_blocks_alloc") $ip:struct xfs_inode*
$offset:xfs_off_t $count:ssize_t $type:int $irec:struct xfs_bmbt_irec*
kernel.trace("xfs_map_blocks_found") $ip:struct xfs_inode*
$offset:xfs_off_t $count:ssize_t $type:int $irec:struct xfs_bmbt_irec*
kernel.trace("xfs_pagecache_inval") $ip:struct xfs_inode* $start:xfs_off_t
$finish:xfs_off_t
kernel.trace("xfs_perag_clear_reclaim") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $refcount:int $caller_ip:long unsigned int
kernel.trace("xfs_perag_get") $mp:struct xfs_mount* $agno:xfs_agnumber_t
$refcount:int $caller_ip:long unsigned int
kernel.trace("xfs_perag_get_tag") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $refcount:int $caller_ip:long unsigned int
kernel.trace("xfs_perag_put") $mp:struct xfs_mount* $agno:xfs_agnumber_t
$refcount:int $caller_ip:long unsigned int
kernel.trace("xfs_perag_set_reclaim") $mp:struct xfs_mount*
$agno:xfs_agnumber_t $refcount:int $caller_ip:long unsigned int
kernel.trace("xfs_readdir") $ip:struct xfs_inode*
kernel.trace("xfs_readlink") $ip:struct xfs_inode*
kernel.trace("xfs_releasepage") $inode:struct inode* $page:struct page*
$off:long unsigned int
kernel.trace("xfs_remove") $dp:struct xfs_inode* $xfs_remove:struct
xfs_name*
kernel.trace("xfs_rename") $src_dp:struct xfs_inode* $target_dp:struct
xfs_inode* $src_name:struct xfs_name* $target_name:struct xfs_name*
kernel.trace("xfs_reset_dqcounts") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_setattr") $ip:struct xfs_inode*
kernel.trace("xfs_swap_extent_after") $ip:struct xfs_inode* $which:int
kernel.trace("xfs_swap_extent_before") $ip:struct xfs_inode* $which:int
kernel.trace("xfs_symlink") $dp:struct xfs_inode* $xfs_symlink:struct
xfs_name*
kernel.trace("xfs_trans_bhold") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_bhold_release") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_binval") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_bjoin") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_brelse") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_commit_lsn") $trans:struct xfs_trans*
kernel.trace("xfs_trans_get_buf") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_get_buf_recur") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_getsb") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_getsb_recur") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_log_buf") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_read_buf") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_read_buf_io") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_trans_read_buf_recur") $bip:struct xfs_buf_log_item*
kernel.trace("xfs_trans_read_buf_shut") $bp:struct xfs_buf* $caller_ip:long
unsigned int
kernel.trace("xfs_unwritten_convert") $ip:struct xfs_inode*
$offset:xfs_off_t $count:ssize_t
kernel.trace("xfs_vm_bmap") $ip:struct xfs_inode*
kernel.trace("xfs_write_inode") $ip:struct xfs_inode*
kernel.trace("xfs_writepage") $inode:struct inode* $page:struct page*
$off:long unsigned int
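[Since the module("xfs") probe fails without module DWARF, one fallback is probing the static tracepoints listed above via kernel.trace(). A minimal sketch, assuming systemtap and a matching kernel-debuginfo are installed; the output format is illustrative:]

```shell
# Write a minimal SystemTap script hooking the xfs_buf_ioerror
# tracepoint from the list above; $error is the int argument shown
# in its signature. Run it with: stap -v xfs_ioerror.stp
cat > xfs_ioerror.stp <<'EOF'
probe kernel.trace("xfs_buf_ioerror") {
    printf("%s (pid %d): xfs buffer I/O error %d\n",
           execname(), pid(), $error)
}
EOF
echo "wrote xfs_ioerror.stp"
```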
2013/4/12 符永涛 <yongtaofu@gmail.com>
> Hi Brian,
> Sorry but when I execute the script it says:
> WARNING: cannot find module xfs debuginfo: No DWARF information found
> semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
>
> uname -a
> 2.6.32-279.el6.x86_64
> kernel debuginfo has been installed.
>
> Where can I find the correct xfs debuginfo?
>
>
> Thank you for your help.
>
>
> 2013/4/12 Brian Foster <bfoster@redhat.com>
>
>> On 04/11/2013 03:11 PM, 符永涛 wrote:
>> > It happened again tonight on one of our servers. How can we debug the
>> > root cause? Thank you.
>> >
>>
>> Hi,
>>
>> I've attached a system tap script (stap -v xfs.stp) that should
>> hopefully print out a bit more data should the issue happen again. Do
>> you have a small enough number of nodes (or predictable enough pattern)
>> that you could run this on the nodes that tend to fail and collect the
>> output?
>>
>> Also, could you collect an xfs_metadump of the filesystem in question
>> and make it available for download and analysis somewhere? I believe the
>> ideal approach is to mount/umount the filesystem first to replay the log
>> before collecting a metadump, but somebody could correct me on that (to
>> be safe, you could collect multiple dumps: pre-mount and post-mount).
>>
>> Could you also describe your workload a little bit? Thanks.
>>
>> Brian
>>
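[The metadump collection Brian suggests above might look like the following hedged sketch; it is dry-run by default, /dev/sdb and /mnt/xfsd are the device and mount point from this thread, and the output file names are made up:]

```shell
# Collect pre-mount and post-mount xfs_metadump images, replaying the
# log in between, per Brian's suggestion. The filesystem must be
# unmounted when metadump runs. DRY_RUN=1 only echoes each command;
# set DRY_RUN= (empty) to actually execute.
DRY_RUN=${DRY_RUN:-1}
run() { if [ -n "$DRY_RUN" ]; then echo "would run: $*"; else "$@"; fi; }

dev=/dev/sdb    # example device from this thread
mnt=/mnt/xfsd   # example mount point from this thread

run xfs_metadump -g "$dev" /tmp/sdb.pre-mount.metadump
run mount "$dev" "$mnt"     # mount once so the log is replayed ...
run umount "$mnt"           # ... then unmount again
run xfs_metadump -g "$dev" /tmp/sdb.post-mount.metadump
```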
>> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> > xfs_inotobp() returned error 22.
>> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
>> > error 22
>> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
>> > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
>> > 0xffffffffa02ee20a
>> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
>> > down filesystem
>> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
>> > rectify the problem(s)
>> > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> >
>> > xfs_repair -n
>> >
>> >
>> > Phase 7 - verify link counts...
>> > would have reset inode 20021 nlinks from 0 to 1
>> > would have reset inode 20789 nlinks from 0 to 1
>> > would have reset inode 35125 nlinks from 0 to 1
>> > would have reset inode 35637 nlinks from 0 to 1
>> > would have reset inode 36149 nlinks from 0 to 1
>> > would have reset inode 38197 nlinks from 0 to 1
>> > would have reset inode 39477 nlinks from 0 to 1
>> > would have reset inode 54069 nlinks from 0 to 1
>> > would have reset inode 62261 nlinks from 0 to 1
>> > would have reset inode 63029 nlinks from 0 to 1
>> > would have reset inode 72501 nlinks from 0 to 1
>> > would have reset inode 79925 nlinks from 0 to 1
>> > would have reset inode 81205 nlinks from 0 to 1
>> > would have reset inode 84789 nlinks from 0 to 1
>> > would have reset inode 87861 nlinks from 0 to 1
>> > would have reset inode 90663 nlinks from 0 to 1
>> > would have reset inode 91189 nlinks from 0 to 1
>> > would have reset inode 95541 nlinks from 0 to 1
>> > would have reset inode 98101 nlinks from 0 to 1
>> > would have reset inode 101173 nlinks from 0 to 1
>> > would have reset inode 113205 nlinks from 0 to 1
>> > would have reset inode 114741 nlinks from 0 to 1
>> > would have reset inode 126261 nlinks from 0 to 1
>> > would have reset inode 140597 nlinks from 0 to 1
>> > would have reset inode 144693 nlinks from 0 to 1
>> > would have reset inode 147765 nlinks from 0 to 1
>> > would have reset inode 152885 nlinks from 0 to 1
>> > would have reset inode 161333 nlinks from 0 to 1
>> > would have reset inode 161845 nlinks from 0 to 1
>> > would have reset inode 167477 nlinks from 0 to 1
>> > would have reset inode 172341 nlinks from 0 to 1
>> > would have reset inode 191797 nlinks from 0 to 1
>> > would have reset inode 204853 nlinks from 0 to 1
>> > would have reset inode 205365 nlinks from 0 to 1
>> > would have reset inode 215349 nlinks from 0 to 1
>> > would have reset inode 215861 nlinks from 0 to 1
>> > would have reset inode 216373 nlinks from 0 to 1
>> > would have reset inode 217397 nlinks from 0 to 1
>> > would have reset inode 224309 nlinks from 0 to 1
>> > would have reset inode 225589 nlinks from 0 to 1
>> > would have reset inode 234549 nlinks from 0 to 1
>> > would have reset inode 234805 nlinks from 0 to 1
>> > would have reset inode 249653 nlinks from 0 to 1
>> > would have reset inode 250677 nlinks from 0 to 1
>> > would have reset inode 252469 nlinks from 0 to 1
>> > would have reset inode 261429 nlinks from 0 to 1
>> > would have reset inode 265013 nlinks from 0 to 1
>> > would have reset inode 266805 nlinks from 0 to 1
>> > would have reset inode 267317 nlinks from 0 to 1
>> > would have reset inode 268853 nlinks from 0 to 1
>> > would have reset inode 272437 nlinks from 0 to 1
>> > would have reset inode 273205 nlinks from 0 to 1
>> > would have reset inode 274229 nlinks from 0 to 1
>> > would have reset inode 278325 nlinks from 0 to 1
>> > would have reset inode 278837 nlinks from 0 to 1
>> > would have reset inode 281397 nlinks from 0 to 1
>> > would have reset inode 292661 nlinks from 0 to 1
>> > would have reset inode 300853 nlinks from 0 to 1
>> > would have reset inode 302901 nlinks from 0 to 1
>> > would have reset inode 305205 nlinks from 0 to 1
>> > would have reset inode 314165 nlinks from 0 to 1
>> > would have reset inode 315189 nlinks from 0 to 1
>> > would have reset inode 320309 nlinks from 0 to 1
>> > would have reset inode 324917 nlinks from 0 to 1
>> > would have reset inode 328245 nlinks from 0 to 1
>> > would have reset inode 335925 nlinks from 0 to 1
>> > would have reset inode 339253 nlinks from 0 to 1
>> > would have reset inode 339765 nlinks from 0 to 1
>> > would have reset inode 348213 nlinks from 0 to 1
>> > would have reset inode 360501 nlinks from 0 to 1
>> > would have reset inode 362037 nlinks from 0 to 1
>> > would have reset inode 366389 nlinks from 0 to 1
>> > would have reset inode 385845 nlinks from 0 to 1
>> > would have reset inode 390709 nlinks from 0 to 1
>> > would have reset inode 409141 nlinks from 0 to 1
>> > would have reset inode 413237 nlinks from 0 to 1
>> > would have reset inode 414773 nlinks from 0 to 1
>> > would have reset inode 417845 nlinks from 0 to 1
>> > would have reset inode 436021 nlinks from 0 to 1
>> > would have reset inode 439349 nlinks from 0 to 1
>> > would have reset inode 447029 nlinks from 0 to 1
>> > would have reset inode 491317 nlinks from 0 to 1
>> > would have reset inode 494133 nlinks from 0 to 1
>> > would have reset inode 495413 nlinks from 0 to 1
>> > would have reset inode 501301 nlinks from 0 to 1
>> > would have reset inode 506421 nlinks from 0 to 1
>> > would have reset inode 508469 nlinks from 0 to 1
>> > would have reset inode 508981 nlinks from 0 to 1
>> > would have reset inode 511797 nlinks from 0 to 1
>> > would have reset inode 513077 nlinks from 0 to 1
>> > would have reset inode 517941 nlinks from 0 to 1
>> > would have reset inode 521013 nlinks from 0 to 1
>> > would have reset inode 522805 nlinks from 0 to 1
>> > would have reset inode 523317 nlinks from 0 to 1
>> > would have reset inode 525621 nlinks from 0 to 1
>> > would have reset inode 527925 nlinks from 0 to 1
>> > would have reset inode 535605 nlinks from 0 to 1
>> > would have reset inode 541749 nlinks from 0 to 1
>> > would have reset inode 573493 nlinks from 0 to 1
>> > would have reset inode 578613 nlinks from 0 to 1
>> > would have reset inode 583029 nlinks from 0 to 1
>> > would have reset inode 585525 nlinks from 0 to 1
>> > would have reset inode 586293 nlinks from 0 to 1
>> > would have reset inode 586805 nlinks from 0 to 1
>> > would have reset inode 591413 nlinks from 0 to 1
>> > would have reset inode 594485 nlinks from 0 to 1
>> > would have reset inode 596277 nlinks from 0 to 1
>> > would have reset inode 603189 nlinks from 0 to 1
>> > would have reset inode 613429 nlinks from 0 to 1
>> > would have reset inode 617781 nlinks from 0 to 1
>> > would have reset inode 621877 nlinks from 0 to 1
>> > would have reset inode 623925 nlinks from 0 to 1
>> > would have reset inode 625205 nlinks from 0 to 1
>> > would have reset inode 626741 nlinks from 0 to 1
>> > would have reset inode 639541 nlinks from 0 to 1
>> > would have reset inode 640053 nlinks from 0 to 1
>> > would have reset inode 640565 nlinks from 0 to 1
>> > would have reset inode 645173 nlinks from 0 to 1
>> > would have reset inode 652853 nlinks from 0 to 1
>> > would have reset inode 656181 nlinks from 0 to 1
>> > would have reset inode 659253 nlinks from 0 to 1
>> > would have reset inode 663605 nlinks from 0 to 1
>> > would have reset inode 667445 nlinks from 0 to 1
>> > would have reset inode 680757 nlinks from 0 to 1
>> > would have reset inode 691253 nlinks from 0 to 1
>> > would have reset inode 691765 nlinks from 0 to 1
>> > would have reset inode 697653 nlinks from 0 to 1
>> > would have reset inode 700469 nlinks from 0 to 1
>> > would have reset inode 707893 nlinks from 0 to 1
>> > would have reset inode 716853 nlinks from 0 to 1
>> > would have reset inode 722229 nlinks from 0 to 1
>> > would have reset inode 722741 nlinks from 0 to 1
>> > would have reset inode 723765 nlinks from 0 to 1
>> > would have reset inode 731957 nlinks from 0 to 1
>> > would have reset inode 742965 nlinks from 0 to 1
>> > would have reset inode 743477 nlinks from 0 to 1
>> > would have reset inode 745781 nlinks from 0 to 1
>> > would have reset inode 746293 nlinks from 0 to 1
>> > would have reset inode 774453 nlinks from 0 to 1
>> > would have reset inode 778805 nlinks from 0 to 1
>> > would have reset inode 785013 nlinks from 0 to 1
>> > would have reset inode 785973 nlinks from 0 to 1
>> > would have reset inode 791349 nlinks from 0 to 1
>> > would have reset inode 796981 nlinks from 0 to 1
>> > would have reset inode 803381 nlinks from 0 to 1
>> > would have reset inode 806965 nlinks from 0 to 1
>> > would have reset inode 811798 nlinks from 0 to 1
>> > would have reset inode 812310 nlinks from 0 to 1
>> > would have reset inode 813078 nlinks from 0 to 1
>> > would have reset inode 813607 nlinks from 0 to 1
>> > would have reset inode 814183 nlinks from 0 to 1
>> > would have reset inode 822069 nlinks from 0 to 1
>> > would have reset inode 828469 nlinks from 0 to 1
>> > would have reset inode 830005 nlinks from 0 to 1
>> > would have reset inode 832053 nlinks from 0 to 1
>> > would have reset inode 832565 nlinks from 0 to 1
>> > would have reset inode 836661 nlinks from 0 to 1
>> > would have reset inode 841013 nlinks from 0 to 1
>> > would have reset inode 841525 nlinks from 0 to 1
>> > would have reset inode 845365 nlinks from 0 to 1
>> > would have reset inode 846133 nlinks from 0 to 1
>> > would have reset inode 847157 nlinks from 0 to 1
>> > would have reset inode 852533 nlinks from 0 to 1
>> > would have reset inode 857141 nlinks from 0 to 1
>> > would have reset inode 863271 nlinks from 0 to 1
>> > would have reset inode 866855 nlinks from 0 to 1
>> > would have reset inode 887861 nlinks from 0 to 1
>> > would have reset inode 891701 nlinks from 0 to 1
>> > would have reset inode 894773 nlinks from 0 to 1
>> > would have reset inode 900149 nlinks from 0 to 1
>> > would have reset inode 902197 nlinks from 0 to 1
>> > would have reset inode 906293 nlinks from 0 to 1
>> > would have reset inode 906805 nlinks from 0 to 1
>> > would have reset inode 909877 nlinks from 0 to 1
>> > would have reset inode 925493 nlinks from 0 to 1
>> > would have reset inode 949543 nlinks from 0 to 1
>> > would have reset inode 955175 nlinks from 0 to 1
>> > would have reset inode 963623 nlinks from 0 to 1
>> > would have reset inode 967733 nlinks from 0 to 1
>> > would have reset inode 968231 nlinks from 0 to 1
>> > would have reset inode 982069 nlinks from 0 to 1
>> > would have reset inode 1007413 nlinks from 0 to 1
>> > would have reset inode 1011509 nlinks from 0 to 1
>> > would have reset inode 1014069 nlinks from 0 to 1
>> > would have reset inode 1014581 nlinks from 0 to 1
>> > would have reset inode 1022005 nlinks from 0 to 1
>> > would have reset inode 1022517 nlinks from 0 to 1
>> > would have reset inode 1023029 nlinks from 0 to 1
>> > would have reset inode 1025333 nlinks from 0 to 1
>> > would have reset inode 1043765 nlinks from 0 to 1
>> > would have reset inode 1044789 nlinks from 0 to 1
>> > would have reset inode 1049397 nlinks from 0 to 1
>> > would have reset inode 1050933 nlinks from 0 to 1
>> > would have reset inode 1051445 nlinks from 0 to 1
>> > would have reset inode 1054261 nlinks from 0 to 1
>> > would have reset inode 1060917 nlinks from 0 to 1
>> > would have reset inode 1063477 nlinks from 0 to 1
>> > would have reset inode 1076021 nlinks from 0 to 1
>> > would have reset inode 1081141 nlinks from 0 to 1
>> > would have reset inode 1086261 nlinks from 0 to 1
>> > would have reset inode 1097269 nlinks from 0 to 1
>> > would have reset inode 1099829 nlinks from 0 to 1
>> > would have reset inode 1100853 nlinks from 0 to 1
>> > would have reset inode 1101877 nlinks from 0 to 1
>> > would have reset inode 1126709 nlinks from 0 to 1
>> > would have reset inode 1134389 nlinks from 0 to 1
>> > would have reset inode 1141045 nlinks from 0 to 1
>> > would have reset inode 1141557 nlinks from 0 to 1
>> > would have reset inode 1142581 nlinks from 0 to 1
>> > would have reset inode 1148469 nlinks from 0 to 1
>> > would have reset inode 1153333 nlinks from 0 to 1
>> > would have reset inode 1181749 nlinks from 0 to 1
>> > would have reset inode 1192245 nlinks from 0 to 1
>> > would have reset inode 1198133 nlinks from 0 to 1
>> > would have reset inode 1203765 nlinks from 0 to 1
>> > would have reset inode 1221429 nlinks from 0 to 1
>> > would have reset inode 1223989 nlinks from 0 to 1
>> > would have reset inode 1235509 nlinks from 0 to 1
>> > would have reset inode 1239349 nlinks from 0 to 1
>> > would have reset inode 1240885 nlinks from 0 to 1
>> > would have reset inode 1241397 nlinks from 0 to 1
>> > would have reset inode 1241909 nlinks from 0 to 1
>> > would have reset inode 1242421 nlinks from 0 to 1
>> > would have reset inode 1244981 nlinks from 0 to 1
>> > would have reset inode 1246517 nlinks from 0 to 1
>> > would have reset inode 1253429 nlinks from 0 to 1
>> > would have reset inode 1271861 nlinks from 0 to 1
>> > would have reset inode 1274677 nlinks from 0 to 1
>> > would have reset inode 1277749 nlinks from 0 to 1
>> > would have reset inode 1278773 nlinks from 0 to 1
>> > would have reset inode 1286709 nlinks from 0 to 1
>> > would have reset inode 1288245 nlinks from 0 to 1
>> > would have reset inode 1299765 nlinks from 0 to 1
>> > would have reset inode 1302325 nlinks from 0 to 1
>> > would have reset inode 1304885 nlinks from 0 to 1
>> > would have reset inode 1305397 nlinks from 0 to 1
>> > would have reset inode 1307509 nlinks from 0 to 1
>> > would have reset inode 1309493 nlinks from 0 to 1
>> > would have reset inode 1310517 nlinks from 0 to 1
>> > would have reset inode 1311029 nlinks from 0 to 1
>> > would have reset inode 1312053 nlinks from 0 to 1
>> > would have reset inode 1316917 nlinks from 0 to 1
>> > would have reset inode 1317941 nlinks from 0 to 1
>> > would have reset inode 1320821 nlinks from 0 to 1
>> > would have reset inode 1322805 nlinks from 0 to 1
>> > would have reset inode 1332789 nlinks from 0 to 1
>> > would have reset inode 1336373 nlinks from 0 to 1
>> > would have reset inode 1345653 nlinks from 0 to 1
>> > would have reset inode 1354549 nlinks from 0 to 1
>> > would have reset inode 1361973 nlinks from 0 to 1
>> > would have reset inode 1369909 nlinks from 0 to 1
>> > would have reset inode 1372981 nlinks from 0 to 1
>> > would have reset inode 1388853 nlinks from 0 to 1
>> > would have reset inode 1402933 nlinks from 0 to 1
>> > would have reset inode 1403445 nlinks from 0 to 1
>> > would have reset inode 1420085 nlinks from 0 to 1
>> > would have reset inode 1452853 nlinks from 0 to 1
>> > would have reset inode 1456437 nlinks from 0 to 1
>> > would have reset inode 1457973 nlinks from 0 to 1
>> > would have reset inode 1459253 nlinks from 0 to 1
>> > would have reset inode 1467957 nlinks from 0 to 1
>> > would have reset inode 1471541 nlinks from 0 to 1
>> > would have reset inode 1476661 nlinks from 0 to 1
>> > would have reset inode 1479733 nlinks from 0 to 1
>> > would have reset inode 1483061 nlinks from 0 to 1
>> > would have reset inode 1484085 nlinks from 0 to 1
>> > would have reset inode 1486133 nlinks from 0 to 1
>> > would have reset inode 1489461 nlinks from 0 to 1
>> > would have reset inode 1490037 nlinks from 0 to 1
>> > would have reset inode 1492021 nlinks from 0 to 1
>> > would have reset inode 1493557 nlinks from 0 to 1
>> > would have reset inode 1494069 nlinks from 0 to 1
>> > would have reset inode 1496885 nlinks from 0 to 1
>> > would have reset inode 1498421 nlinks from 0 to 1
>> > would have reset inode 1498933 nlinks from 0 to 1
>> > would have reset inode 1499957 nlinks from 0 to 1
>> > would have reset inode 1506101 nlinks from 0 to 1
>> > would have reset inode 1507637 nlinks from 0 to 1
>> > would have reset inode 1510453 nlinks from 0 to 1
>> > would have reset inode 1514293 nlinks from 0 to 1
>> > would have reset inode 1517365 nlinks from 0 to 1
>> > would have reset inode 1520693 nlinks from 0 to 1
>> > would have reset inode 1521973 nlinks from 0 to 1
>> > would have reset inode 1530421 nlinks from 0 to 1
>> > would have reset inode 1530933 nlinks from 0 to 1
>> > would have reset inode 1537333 nlinks from 0 to 1
>> > would have reset inode 1538357 nlinks from 0 to 1
>> > would have reset inode 1548853 nlinks from 0 to 1
>> > would have reset inode 1553973 nlinks from 0 to 1
>> > would have reset inode 1557301 nlinks from 0 to 1
>> > would have reset inode 1564213 nlinks from 0 to 1
>> > would have reset inode 1564725 nlinks from 0 to 1
>> > would have reset inode 1576501 nlinks from 0 to 1
>> > would have reset inode 1580597 nlinks from 0 to 1
>> > would have reset inode 1584693 nlinks from 0 to 1
>> > would have reset inode 1586485 nlinks from 0 to 1
>> > would have reset inode 1589301 nlinks from 0 to 1
>> > would have reset inode 1589813 nlinks from 0 to 1
>> > would have reset inode 1592629 nlinks from 0 to 1
>> > would have reset inode 1595701 nlinks from 0 to 1
>> > would have reset inode 1601077 nlinks from 0 to 1
>> > would have reset inode 1623861 nlinks from 0 to 1
>> > would have reset inode 1626677 nlinks from 0 to 1
>> > would have reset inode 1627701 nlinks from 0 to 1
>> > would have reset inode 1633333 nlinks from 0 to 1
>> > would have reset inode 1639221 nlinks from 0 to 1
>> > would have reset inode 1649205 nlinks from 0 to 1
>> > would have reset inode 1686325 nlinks from 0 to 1
>> > would have reset inode 1690677 nlinks from 0 to 1
>> > would have reset inode 1693749 nlinks from 0 to 1
>> > would have reset inode 1704757 nlinks from 0 to 1
>> > would have reset inode 1707061 nlinks from 0 to 1
>> > would have reset inode 1709109 nlinks from 0 to 1
>> > would have reset inode 1719349 nlinks from 0 to 1
>> > would have reset inode 1737013 nlinks from 0 to 1
>> > would have reset inode 1741365 nlinks from 0 to 1
>> > would have reset inode 1747509 nlinks from 0 to 1
>> > would have reset inode 1770805 nlinks from 0 to 1
>> > would have reset inode 1780789 nlinks from 0 to 1
>> > would have reset inode 1793589 nlinks from 0 to 1
>> > would have reset inode 1795125 nlinks from 0 to 1
>> > would have reset inode 1800757 nlinks from 0 to 1
>> > would have reset inode 1801269 nlinks from 0 to 1
>> > would have reset inode 1802549 nlinks from 0 to 1
>> > would have reset inode 1804085 nlinks from 0 to 1
>> > would have reset inode 1817141 nlinks from 0 to 1
>> > would have reset inode 1821749 nlinks from 0 to 1
>> > would have reset inode 1832757 nlinks from 0 to 1
>> > would have reset inode 1836341 nlinks from 0 to 1
>> > would have reset inode 1856309 nlinks from 0 to 1
>> > would have reset inode 1900597 nlinks from 0 to 1
>> > would have reset inode 1902901 nlinks from 0 to 1
>> > would have reset inode 1912373 nlinks from 0 to 1
>> > would have reset inode 1943093 nlinks from 0 to 1
>> > would have reset inode 1944373 nlinks from 0 to 1
>> > would have reset inode 1954101 nlinks from 0 to 1
>> > would have reset inode 1955893 nlinks from 0 to 1
>> > would have reset inode 1961781 nlinks from 0 to 1
>> > would have reset inode 1974325 nlinks from 0 to 1
>> > would have reset inode 1978677 nlinks from 0 to 1
>> > would have reset inode 1981237 nlinks from 0 to 1
>> > would have reset inode 1992245 nlinks from 0 to 1
>> > would have reset inode 2000949 nlinks from 0 to 1
>> > would have reset inode 2002229 nlinks from 0 to 1
>> > would have reset inode 2004789 nlinks from 0 to 1
>> > would have reset inode 2005301 nlinks from 0 to 1
>> > would have reset inode 2011189 nlinks from 0 to 1
>> > would have reset inode 2012981 nlinks from 0 to 1
>> > would have reset inode 2015285 nlinks from 0 to 1
>> > would have reset inode 2018869 nlinks from 0 to 1
>> > would have reset inode 2028341 nlinks from 0 to 1
>> > would have reset inode 2028853 nlinks from 0 to 1
>> > would have reset inode 2030901 nlinks from 0 to 1
>> > would have reset inode 2032181 nlinks from 0 to 1
>> > would have reset inode 2032693 nlinks from 0 to 1
>> > would have reset inode 2040117 nlinks from 0 to 1
>> > would have reset inode 2053685 nlinks from 0 to 1
>> > would have reset inode 2083893 nlinks from 0 to 1
>> > would have reset inode 2087221 nlinks from 0 to 1
>> > would have reset inode 2095925 nlinks from 0 to 1
>> > would have reset inode 2098741 nlinks from 0 to 1
>> > would have reset inode 2100533 nlinks from 0 to 1
>> > would have reset inode 2101301 nlinks from 0 to 1
>> > would have reset inode 2123573 nlinks from 0 to 1
>> > would have reset inode 2132789 nlinks from 0 to 1
>> > would have reset inode 2133813 nlinks from 0 to 1
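The "would have reset inode … nlinks from 0 to 1" lines above are xfs_repair running in no-modify mode (-n): it only reports what it would fix, without touching the disk. A minimal sketch of the usual replay-then-repair sequence, assuming the /dev/sdb device and /mnt/xfsd mountpoint from this thread, might look like:

```shell
# Sketch only -- this operates on a real block device; run deliberately.
repair_fs() {
    dev=$1; mnt=$2
    # Replay the XFS log first by mounting and unmounting cleanly.
    mount "$dev" "$mnt" && umount "$mnt"
    # Dry run: prints "would have ..." lines, modifies nothing.
    xfs_repair -n "$dev"
    # Uncomment to apply the fixes for real:
    # xfs_repair "$dev"
}
# Example: repair_fs /dev/sdb /mnt/xfsd
```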
>> >
>> >
>> >
>> >
>> >
>> > 2013/4/10 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>> >
>> > The storage info is as following:
>> > RAID-6
>> > SATA HDD
>> > Controller: PERC H710P Mini (Embedded)
>> > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
>> > 255 heads, 63 sectors/track, 3647334 cylinders
>> > Units = cylinders of 16065 * 512 = 8225280 bytes
>> > Sector size (logical/physical): 512 bytes / 512 bytes
>> > I/O size (minimum/optimal): 512 bytes / 512 bytes
>> > Disk identifier: 0x00000000
>> >
>> > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2
>> > TiB)
>> > sd 0:2:1:0: [sdb] Write Protect is off
>> > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
>> > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
>> > support DPO or FUA
>> > sd 0:2:1:0: [sdb] Attached SCSI disk
>> >
>> > *-storage
>> > description: RAID bus controller
>> > product: MegaRAID SAS 2208 [Thunderbolt]
>> > vendor: LSI Logic / Symbios Logic
>> > physical id: 0
>> > bus info: pci@0000:02:00.0
>> > logical name: scsi0
>> > version: 01
>> > width: 64 bits
>> > clock: 33MHz
>> > capabilities: storage pm pciexpress vpd msi msix bus_master
>> > cap_list rom
>> > configuration: driver=megaraid_sas latency=0
>> > resources: irq:42 ioport:fc00(size=256)
>> > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
>> > memory:dc800000-dc81ffff(prefetchable)
>> > *-disk:0
>> > description: SCSI Disk
>> > product: PERC H710P
>> > vendor: DELL
>> > physical id: 2.0.0
>> > bus info: scsi@0:2.0.0
>> > logical name: /dev/sda
>> > version: 3.13
>> > serial: 0049d6ce1d9f2035180096fde490f648
>> > size: 558GiB (599GB)
>> > capabilities: partitioned partitioned:dos
>> > configuration: ansiversion=5 signature=000aa336
>> > *-disk:1
>> > description: SCSI Disk
>> > product: PERC H710P
>> > vendor: DELL
>> > physical id: 2.1.0
>> > bus info: scsi@0:2.1.0
>> > logical name: /dev/sdb
>> > logical name: /mnt/xfsd
>> > version: 3.13
>> > serial: 003366f71da22035180096fde490f648
>> > size: 27TiB (30TB)
>> > configuration: ansiversion=5 mount.fstype=xfs
>> >
>> mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
>> > state=mounted
>> >
>> > Thank you.
>> >
>> >
>> > 2013/4/10 Emmanuel Florac <eflorac@intellique.com
>> > <mailto:eflorac@intellique.com>>
>> >
>> > Le Tue, 9 Apr 2013 23:10:03 +0800
>> > 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>> écrivait:
>> >
>> > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
>> > > > Shutting down filesystem
>> >
>> > This. I/O error detected. That means that at some point the
>> > underlying
>> > device (disk, RAID array, SAN volume) couldn't be reached. So
>> this
>> > could very well be a case of a flakey drive, array, cable or
>> SCSI
>> > driver.
>> >
>> > What's the storage setup here?
>> >
>> > --
>> >
>> ------------------------------------------------------------------------
>> > Emmanuel Florac | Direction technique
>> > | Intellique
>> > | <eflorac@intellique.com
>> > <mailto:eflorac@intellique.com>>
>> > | +33 1 78 94 84 02
>> >
>> ------------------------------------------------------------------------
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>> >
>> >
>> > _______________________________________________
>> > xfs mailing list
>> > xfs@oss.sgi.com
>> > http://oss.sgi.com/mailman/listinfo/xfs
>> >
>>
>>
>
>
> --
> 符永涛
>
--
符永涛

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 5:16 ` Eric Sandeen
@ 2013-04-12 5:40 ` 符永涛
2013-04-12 6:00 ` 符永涛
2013-04-12 7:44 ` 符永涛
0 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-12 5:40 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 1687 bytes --]
ls -l
/usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
-r--r--r-- 1 root root 21393024 Apr 12 12:08
/usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
rpm -qa|grep kernel
kernel-headers-2.6.32-279.el6.x86_64
kernel-devel-2.6.32-279.el6.x86_64
kernel-2.6.32-358.el6.x86_64
kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
abrt-addon-kerneloops-2.0.8-6.el6.x86_64
kernel-firmware-2.6.32-358.el6.noarch
kernel-debug-2.6.32-358.el6.x86_64
kernel-debuginfo-2.6.32-279.el6.x86_64
dracut-kernel-004-283.el6.noarch
libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
kernel-devel-2.6.32-358.el6.x86_64
kernel-2.6.32-279.el6.x86_64
rpm -q kernel-debuginfo
kernel-debuginfo-2.6.32-279.el6.x86_64
rpm -q kernel
kernel-2.6.32-279.el6.x86_64
kernel-2.6.32-358.el6.x86_64
Do I need to re-probe it?
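One thing worth double-checking (an assumption on my part, not confirmed by the output above): SystemTap needs kernel-debuginfo matching the *booted* kernel, and the rpm list shows two kernels installed (-279 and -358) but debuginfo only for -279. A small check along these lines could rule that out:

```shell
# Hedged sketch: report whether debuginfo exists for a given kernel release.
check_debuginfo() {
    krel=$1; debug_root=${2:-/usr/lib/debug}
    if [ -d "$debug_root/lib/modules/$krel" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}
# On the affected machine: check_debuginfo "$(uname -r)"
```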
2013/4/12 Eric Sandeen <sandeen@sandeen.net>
> On 4/11/13 11:32 PM, 符永涛 wrote:
> > Hi Brian,
> > Sorry but when I execute the script it says:
> > WARNING: cannot find module xfs debuginfo: No DWARF information found
> > semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
> >
> > uname -a
> > 2.6.32-279.el6.x86_64
> > kernel debuginfo has been installed.
> >
> > Where can I find the correct xfs debuginfo?
>
> it should be in the kernel-debuginfo rpm (of the same version/release as
> the kernel rpm you're running)
>
> You should have:
>
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> If not, can you show:
>
> # uname -a
> # rpm -q kernel
> # rpm -q kernel-debuginfo
>
> -Eric
>
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 5:40 ` 符永涛
@ 2013-04-12 6:00 ` 符永涛
2013-04-12 12:11 ` Brian Foster
2013-04-12 7:44 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 6:00 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 2138 bytes --]
stap -e 'probe module("xfs").function("xfs_iunlink"){}'
WARNING: cannot find module xfs debuginfo: No DWARF information found
semantic error: no match while resolving probe point
module("xfs").function("xfs_iunlink")
Pass 2: analysis failed. Try again with another '--vp 01' option.
2013/4/12 符永涛 <yongtaofu@gmail.com>
> ls -l
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> -r--r--r-- 1 root root 21393024 Apr 12 12:08
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> rpm -qa|grep kernel
> kernel-headers-2.6.32-279.el6.x86_64
> kernel-devel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
> kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
> kernel-firmware-2.6.32-358.el6.noarch
> kernel-debug-2.6.32-358.el6.x86_64
> kernel-debuginfo-2.6.32-279.el6.x86_64
> dracut-kernel-004-283.el6.noarch
> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
> kernel-devel-2.6.32-358.el6.x86_64
> kernel-2.6.32-279.el6.x86_64
>
> rpm -q kernel-debuginfo
> kernel-debuginfo-2.6.32-279.el6.x86_64
>
> rpm -q kernel
> kernel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
>
> do I need to re probe it?
>
>
> 2013/4/12 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/11/13 11:32 PM, 符永涛 wrote:
>> > Hi Brian,
>> > Sorry but when I execute the script it says:
>> > WARNING: cannot find module xfs debuginfo: No DWARF information found
>> > semantic error: no match while resolving probe point
>> module("xfs").function("xfs_iunlink")
>> >
>> > uname -a
>> > 2.6.32-279.el6.x86_64
>> > kernel debuginfo has been installed.
>> >
>> > Where can I find the correct xfs debuginfo?
>>
>> it should be in the kernel-debuginfo rpm (of the same version/release as
>> the kernel rpm you're running)
>>
>> You should have:
>>
>>
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>
>> If not, can you show:
>>
>> # uname -a
>> # rpm -q kernel
>> # rpm -q kernel-debuginfo
>>
>> -Eric
>>
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 1:07 ` Eric Sandeen
2013-04-12 1:36 ` 符永涛
@ 2013-04-12 6:15 ` 符永涛
1 sibling, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-12 6:15 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 2375 bytes --]
Hi Eric and all,
Thank you for your help. You can get the XFS metadump file from the
following link:
https://docs.google.com/file/d/0B7n2C4T5tfNCdFBCTnNxNERmbWc/edit?usp=sharing
I dumped it after a mount/unmount procedure.
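The collection sequence (a sketch of the procedure described above, assuming /dev/sdb and the /mnt/xfsd mountpoint) would be roughly:

```shell
# Sketch: collect metadumps before and after log replay.
metadump_pair() {
    dev=$1; mnt=$2; prefix=$3
    # Dump taken right after the post-failure unmount, log not yet replayed.
    xfs_metadump "$dev" "$prefix.pre-replay.md"
    # Mount/unmount to replay the log, then dump again.
    mount "$dev" "$mnt" && umount "$mnt"
    xfs_metadump "$dev" "$prefix.post-replay.md"
}
# Example: metadump_pair /dev/sdb /mnt/xfsd /tmp/sdb
```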
2013/4/12 Eric Sandeen <sandeen@sandeen.net>
> On 4/11/13 6:26 PM, Brian Foster wrote:
> > On 04/11/2013 03:11 PM, 符永涛 wrote:
> >> It happens tonight again on one of our servers, how to debug the root
> >> cause? Thank you.
> >>
> >
> > Hi,
> >
> > I've attached a system tap script (stap -v xfs.stp) that should
> > hopefully print out a bit more data should the issue happen again. Do
> > you have a small enough number of nodes (or predictable enough pattern)
> > that you could run this on the nodes that tend to fail and collect the
> > output?
> >
> > Also, could you collect an xfs_metadump of the filesystem in question
> > and make it available for download and analysis somewhere? I believe the
> > ideal approach is to mount/umount the filesystem first to replay the log
> > before collecting a metadump, but somebody could correct me on that (to
> > be safe, you could collect multiple dumps: pre-mount and post-mount).
>
> Dave suggested yesterday that this would be best: metadump right
> after unmounting post-failure, then mount/umount & generate another
> metadump.
>
> -Eric
>
> > Could you also describe your workload a little bit? Thanks.
> >
> > Brian
> >
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> >> xfs_inotobp() returned error 22.
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> >> error 22
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> >> called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> >> 0xffffffffa02ee20a
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
> >> down filesystem
> >> Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and
> >> rectify the problem(s)
> >> Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >>
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 5:40 ` 符永涛
2013-04-12 6:00 ` 符永涛
@ 2013-04-12 7:44 ` 符永涛
2013-04-12 8:32 ` 符永涛
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 7:44 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 2186 bytes --]
Hi Brian,
What else am I missing? Thank you.
stap -e 'probe module("xfs").function("xfs_iunlink"){}'
WARNING: cannot find module xfs debuginfo: No DWARF information found
semantic error: no match while resolving probe point
module("xfs").function("xfs_iunlink")
Pass 2: analysis failed. Try again with another '--vp 01' option.
2013/4/12 符永涛 <yongtaofu@gmail.com>
> ls -l
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> -r--r--r-- 1 root root 21393024 Apr 12 12:08
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> rpm -qa|grep kernel
> kernel-headers-2.6.32-279.el6.x86_64
> kernel-devel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
> kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
> kernel-firmware-2.6.32-358.el6.noarch
> kernel-debug-2.6.32-358.el6.x86_64
> kernel-debuginfo-2.6.32-279.el6.x86_64
> dracut-kernel-004-283.el6.noarch
> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
> kernel-devel-2.6.32-358.el6.x86_64
> kernel-2.6.32-279.el6.x86_64
>
> rpm -q kernel-debuginfo
> kernel-debuginfo-2.6.32-279.el6.x86_64
>
> rpm -q kernel
> kernel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
>
> do I need to re probe it?
>
>
> 2013/4/12 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/11/13 11:32 PM, 符永涛 wrote:
>> > Hi Brian,
>> > Sorry but when I execute the script it says:
>> > WARNING: cannot find module xfs debuginfo: No DWARF information found
>> > semantic error: no match while resolving probe point
>> module("xfs").function("xfs_iunlink")
>> >
>> > uname -a
>> > 2.6.32-279.el6.x86_64
>> > kernel debuginfo has been installed.
>> >
>> > Where can I find the correct xfs debuginfo?
>>
>> it should be in the kernel-debuginfo rpm (of the same version/release as
>> the kernel rpm you're running)
>>
>> You should have:
>>
>>
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>
>> If not, can you show:
>>
>> # uname -a
>> # rpm -q kernel
>> # rpm -q kernel-debuginfo
>>
>> -Eric
>>
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 7:44 ` 符永涛
@ 2013-04-12 8:32 ` 符永涛
2013-04-12 12:41 ` Brian Foster
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 8:32 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 2511 bytes --]
Dear xfs experts,
Can I just add a call to xfs_stack_trace() as the second line of
xfs_do_force_shutdown() to print the stack, and then rebuild the kernel to
see where the error comes from?
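An alternative to patching and rebuilding the kernel (assuming the debuginfo problem above gets resolved) is to let SystemTap print the kernel stack when the shutdown path fires; print_backtrace() is a standard SystemTap function:

```shell
# Sketch: write a SystemTap probe that prints the kernel stack whenever
# xfs_do_force_shutdown() is entered. Run it with: stap -v xfs_shutdown.stp
# (requires root and kernel-debuginfo matching the booted kernel).
cat > xfs_shutdown.stp <<'EOF'
probe module("xfs").function("xfs_do_force_shutdown") {
    printf("xfs_do_force_shutdown flags=0x%x\n", $flags)
    print_backtrace()
}
EOF
```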
2013/4/12 符永涛 <yongtaofu@gmail.com>
> Hi Brian,
> What else I'm missing? Thank you.
> stap -e 'probe module("xfs").function("xfs_iunlink"){}'
>
> WARNING: cannot find module xfs debuginfo: No DWARF information found
> semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
> Pass 2: analysis failed. Try again with another '--vp 01' option.
>
>
> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>
>> ls -l
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>> -r--r--r-- 1 root root 21393024 Apr 12 12:08
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>
>> rpm -qa|grep kernel
>> kernel-headers-2.6.32-279.el6.x86_64
>> kernel-devel-2.6.32-279.el6.x86_64
>> kernel-2.6.32-358.el6.x86_64
>> kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
>> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
>> kernel-firmware-2.6.32-358.el6.noarch
>> kernel-debug-2.6.32-358.el6.x86_64
>> kernel-debuginfo-2.6.32-279.el6.x86_64
>> dracut-kernel-004-283.el6.noarch
>> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
>> kernel-devel-2.6.32-358.el6.x86_64
>> kernel-2.6.32-279.el6.x86_64
>>
>> rpm -q kernel-debuginfo
>> kernel-debuginfo-2.6.32-279.el6.x86_64
>>
>> rpm -q kernel
>> kernel-2.6.32-279.el6.x86_64
>> kernel-2.6.32-358.el6.x86_64
>>
>> Do I need to re-probe it?
>>
>>
>> 2013/4/12 Eric Sandeen <sandeen@sandeen.net>
>>
>>> On 4/11/13 11:32 PM, 符永涛 wrote:
>>> > Hi Brian,
>>> > Sorry but when I execute the script it says:
>>> > WARNING: cannot find module xfs debuginfo: No DWARF information found
>>> > semantic error: no match while resolving probe point
>>> module("xfs").function("xfs_iunlink")
>>> >
>>> > uname -a
>>> > 2.6.32-279.el6.x86_64
>>> > kernel debuginfo has been installed.
>>> >
>>> > Where can I find the correct xfs debuginfo?
>>>
>>> it should be in the kernel-debuginfo rpm (of the same version/release as
>>> the kernel rpm you're running)
>>>
>>> You should have:
>>>
>>>
>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>
>>> If not, can you show:
>>>
>>> # uname -a
>>> # rpm -q kernel
>>> # rpm -q kernel-debuginfo
>>>
>>> -Eric
>>>
>>>
>>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 6:00 ` 符永涛
@ 2013-04-12 12:11 ` Brian Foster
0 siblings, 0 replies; 60+ messages in thread
From: Brian Foster @ 2013-04-12 12:11 UTC (permalink / raw)
To: 符永涛; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
On 04/12/2013 02:00 AM, 符永涛 wrote:
> stap -e 'probe module("xfs").function("xfs_iunlink"){}'
> WARNING: cannot find module xfs debuginfo: No DWARF information found
> semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
> Pass 2: analysis failed. Try again with another '--vp 01' option.
>
This is the error I get if I remove
kernel-debuginfo-2.6.32-279.el6.x86_64.rpm. Otherwise, the example
xfs_iunlink() probe works for me. The xfs.ko.debug module installed to
my system matches the path you've listed below, as well.
I suppose it couldn't hurt to try to remove/reinstall that module. Do
you get any output from 'objdump --dwarf .../xfs.ko.debug'?
I also notice that you have a newer kernel installed on this system. Are
you sure you're still running the -279 build? I'm not an stap expert,
but it would be nice to find out where it's looking for debug info at
runtime...
Brian
P.S., The commands you've listed in another mail:
sudo stap -L 'kernel.trace("*")'|grep xfs_iunlink
sudo stap -L 'kernel.trace("*")'|grep xfs_ifree
... don't print anything on my box either, but as mentioned, the
xfs_iunlink() probe works. I suspect these are not relevant. Perhaps you
are listing tracepoints here? The following command prints the probe
point info on my box:
stap -L 'module("xfs").function("xfs_iunlink")'
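One quick way to test the version-mismatch theory above is to compare the
running kernel release against the installed kernel-debuginfo package. A
minimal shell sketch, with the version strings hard-coded from the rpm
output quoted in this thread (in practice use running=$(uname -r) and
pkg=$(rpm -q kernel-debuginfo)):

```shell
# Compare the running kernel against the installed kernel-debuginfo.
# Values are hard-coded from the rpm listing earlier in this thread.
running="2.6.32-358.el6.x86_64"
pkg="kernel-debuginfo-2.6.32-279.el6.x86_64"

# Strip the package-name prefix to get the kernel release it covers.
ver="${pkg#kernel-debuginfo-}"

if [ "$ver" = "$running" ]; then
    status="MATCH"
else
    status="MISMATCH"
fi
echo "$status: running $running, debuginfo covers $ver"
```

If the box was rebooted into the -358 kernel while only the -279 debuginfo
is installed, stap fails to find DWARF info exactly as reported above.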
>
> 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>
> ls -l
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> -r--r--r-- 1 root root 21393024 Apr 12 12:08
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> rpm -qa|grep kernel
> kernel-headers-2.6.32-279.el6.x86_64
> kernel-devel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
> kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
> kernel-firmware-2.6.32-358.el6.noarch
> kernel-debug-2.6.32-358.el6.x86_64
> kernel-debuginfo-2.6.32-279.el6.x86_64
> dracut-kernel-004-283.el6.noarch
> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
> kernel-devel-2.6.32-358.el6.x86_64
> kernel-2.6.32-279.el6.x86_64
>
> rpm -q kernel-debuginfo
> kernel-debuginfo-2.6.32-279.el6.x86_64
>
> rpm -q kernel
> kernel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
>
> Do I need to re-probe it?
>
>
> 2013/4/12 Eric Sandeen <sandeen@sandeen.net
> <mailto:sandeen@sandeen.net>>
>
> On 4/11/13 11:32 PM, 符永涛 wrote:
> > Hi Brian,
> > Sorry but when I execute the script it says:
> > WARNING: cannot find module xfs debuginfo: No DWARF
> information found
> > semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
> >
> > uname -a
> > 2.6.32-279.el6.x86_64
> > kernel debuginfo has been installed.
> >
> > Where can I find the correct xfs debuginfo?
>
> it should be in the kernel-debuginfo rpm (of the same
> version/release as the kernel rpm you're running)
>
> You should have:
>
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> If not, can you show:
>
> # uname -a
> # rpm -q kernel
> # rpm -q kernel-debuginfo
>
> -Eric
>
>
>
>
>
> --
> 符永涛
>
>
>
>
> --
> 符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 8:32 ` 符永涛
@ 2013-04-12 12:41 ` Brian Foster
2013-04-12 14:48 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: Brian Foster @ 2013-04-12 12:41 UTC (permalink / raw)
To: 符永涛; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
On 04/12/2013 04:32 AM, 符永涛 wrote:
> Dear xfs experts,
> Can I just call xfs_stack_trace() in the second line of
> xfs_do_force_shutdown() to print the stack, and then rebuild the kernel
> to see what the error is?
>
I suppose that's a start. If you're willing/able to create and run a
modified kernel for the purpose of collecting more debug info, perhaps
we can get a bit more creative in collecting more data on the problem
(but a stack trace there is a good start).
BTW- you might want to place the call after the XFS_FORCED_SHUTDOWN(mp)
check almost halfway into the function to avoid duplicate messages.
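As a sketch, the change described here might look like the following. The
context lines are paraphrased from the 2.6.32-era xfs_do_force_shutdown()
in fs/xfs/xfs_fsops.c and may not match the RHEL source exactly;
xfs_stack_trace() is assumed to be the debug helper mentioned above:

```
 	/*
 	 * No need to duplicate efforts if we are already shut down.
 	 */
 	if (XFS_FORCED_SHUTDOWN(mp) && !logerror)
 		return;
+
+	/* debug only: dump the call chain that is forcing the shutdown */
+	xfs_stack_trace();
```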
Brian
>
> 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>
> Hi Brian,
> What else am I missing? Thank you.
> stap -e 'probe module("xfs").function("xfs_iunlink"){}'
>
> WARNING: cannot find module xfs debuginfo: No DWARF information found
> semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
> Pass 2: analysis failed. Try again with another '--vp 01' option.
>
>
> 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>
> ls -l
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> -r--r--r-- 1 root root 21393024 Apr 12 12:08
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> rpm -qa|grep kernel
> kernel-headers-2.6.32-279.el6.x86_64
> kernel-devel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
> kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
> kernel-firmware-2.6.32-358.el6.noarch
> kernel-debug-2.6.32-358.el6.x86_64
> kernel-debuginfo-2.6.32-279.el6.x86_64
> dracut-kernel-004-283.el6.noarch
> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
> kernel-devel-2.6.32-358.el6.x86_64
> kernel-2.6.32-279.el6.x86_64
>
> rpm -q kernel-debuginfo
> kernel-debuginfo-2.6.32-279.el6.x86_64
>
> rpm -q kernel
> kernel-2.6.32-279.el6.x86_64
> kernel-2.6.32-358.el6.x86_64
>
> Do I need to re-probe it?
>
>
> 2013/4/12 Eric Sandeen <sandeen@sandeen.net
> <mailto:sandeen@sandeen.net>>
>
> On 4/11/13 11:32 PM, 符永涛 wrote:
> > Hi Brian,
> > Sorry but when I execute the script it says:
> > WARNING: cannot find module xfs debuginfo: No DWARF
> information found
> > semantic error: no match while resolving probe point
> module("xfs").function("xfs_iunlink")
> >
> > uname -a
> > 2.6.32-279.el6.x86_64
> > kernel debuginfo has been installed.
> >
> > Where can I find the correct xfs debuginfo?
>
> it should be in the kernel-debuginfo rpm (of the same
> version/release as the kernel rpm you're running)
>
> You should have:
>
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>
> If not, can you show:
>
> # uname -a
> # rpm -q kernel
> # rpm -q kernel-debuginfo
>
> -Eric
>
>
>
>
>
> --
> 符永涛
>
>
>
>
> --
> 符永涛
>
>
>
>
> --
> 符永涛
>
>
>
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 0:45 ` 符永涛
@ 2013-04-12 12:50 ` Brian Foster
2013-04-12 13:42 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: Brian Foster @ 2013-04-12 12:50 UTC (permalink / raw)
To: 符永涛; +Cc: Ben Myers, xfs@oss.sgi.com
On 04/11/2013 08:45 PM, 符永涛 wrote:
> The workload is roughly:
> 24 servers, replica 3 (which means the distribute count is 8),
> load about 3-8 TB per day.
>
This describes your cluster, but not the workload (though cluster info
is good too). What kind of workload is running on your clients (e.g.,
rsync jobs)? Are you running through native gluster mount points, NFS
mounts, or a mix? Do you have any gluster internal operations running
(e.g., rebalance)?
Is there any kind of pattern you can discern from the workload and when
the XFS error happens to occur? You have a good number of servers in
play here, is there any kind of pattern in terms of which servers
experience the error? Is it always the same servers or a random set?
Brian
>
> 2013/4/12 Brian Foster <bfoster@redhat.com <mailto:bfoster@redhat.com>>
>
> On 04/11/2013 03:11 PM, 符永涛 wrote:
> > It happened again tonight on one of our servers; how can we debug the
> > root cause? Thank you.
> >
>
> Hi,
>
> I've attached a system tap script (stap -v xfs.stp) that should
> hopefully print out a bit more data should the issue happen again. Do
> you have a small enough number of nodes (or predictable enough pattern)
> that you could run this on the nodes that tend to fail and collect the
> output?
>
> Also, could you collect an xfs_metadump of the filesystem in question
> and make it available for download and analysis somewhere? I believe the
> ideal approach is to mount/umount the filesystem first to replay the log
> before collecting a metadump, but somebody could correct me on that (to
> be safe, you could collect multiple dumps: pre-mount and post-mount).
>
> Could you also describe your workload a little bit? Thanks.
>
> Brian
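The metadump procedure quoted above could be scripted roughly as follows.
This sketch only echoes the commands (a dry run); the device and mount
point are taken from this thread, and the output paths are hypothetical:

```shell
# Dry-run sketch of the suggested metadump collection: dump before log
# replay, mount + umount to replay the log, then dump again.
dev=/dev/sdb        # device from this thread
mnt=/mnt/xfsd       # mount point from this thread

steps="xfs_metadump -g $dev /var/tmp/sdb.pre-mount.metadump
mount $dev $mnt
umount $mnt
xfs_metadump -g $dev /var/tmp/sdb.post-mount.metadump"

# Echo instead of executing; run each line by hand for real collection.
echo "$steps"
```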
>
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > xfs_inotobp() returned error 22.
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> returned
> > error 22
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> > 0xffffffffa02ee20a
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
> > down filesystem
> > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the
> filesystem and
> > rectify the problem(s)
> > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> returned.
> >
> > xfs_repair -n
> >
> >
> > Phase 7 - verify link counts...
> > would have reset inode 20021 nlinks from 0 to 1
> > would have reset inode 20789 nlinks from 0 to 1
> > would have reset inode 35125 nlinks from 0 to 1
> > would have reset inode 35637 nlinks from 0 to 1
> > would have reset inode 36149 nlinks from 0 to 1
> > would have reset inode 38197 nlinks from 0 to 1
> > would have reset inode 39477 nlinks from 0 to 1
> > would have reset inode 54069 nlinks from 0 to 1
> > would have reset inode 62261 nlinks from 0 to 1
> > would have reset inode 63029 nlinks from 0 to 1
> > would have reset inode 72501 nlinks from 0 to 1
> > would have reset inode 79925 nlinks from 0 to 1
> > would have reset inode 81205 nlinks from 0 to 1
> > would have reset inode 84789 nlinks from 0 to 1
> > would have reset inode 87861 nlinks from 0 to 1
> > would have reset inode 90663 nlinks from 0 to 1
> > would have reset inode 91189 nlinks from 0 to 1
> > would have reset inode 95541 nlinks from 0 to 1
> > would have reset inode 98101 nlinks from 0 to 1
> > would have reset inode 101173 nlinks from 0 to 1
> > would have reset inode 113205 nlinks from 0 to 1
> > would have reset inode 114741 nlinks from 0 to 1
> > would have reset inode 126261 nlinks from 0 to 1
> > would have reset inode 140597 nlinks from 0 to 1
> > would have reset inode 144693 nlinks from 0 to 1
> > would have reset inode 147765 nlinks from 0 to 1
> > would have reset inode 152885 nlinks from 0 to 1
> > would have reset inode 161333 nlinks from 0 to 1
> > would have reset inode 161845 nlinks from 0 to 1
> > would have reset inode 167477 nlinks from 0 to 1
> > would have reset inode 172341 nlinks from 0 to 1
> > would have reset inode 191797 nlinks from 0 to 1
> > would have reset inode 204853 nlinks from 0 to 1
> > would have reset inode 205365 nlinks from 0 to 1
> > would have reset inode 215349 nlinks from 0 to 1
> > would have reset inode 215861 nlinks from 0 to 1
> > would have reset inode 216373 nlinks from 0 to 1
> > would have reset inode 217397 nlinks from 0 to 1
> > would have reset inode 224309 nlinks from 0 to 1
> > would have reset inode 225589 nlinks from 0 to 1
> > would have reset inode 234549 nlinks from 0 to 1
> > would have reset inode 234805 nlinks from 0 to 1
> > would have reset inode 249653 nlinks from 0 to 1
> > would have reset inode 250677 nlinks from 0 to 1
> > would have reset inode 252469 nlinks from 0 to 1
> > would have reset inode 261429 nlinks from 0 to 1
> > would have reset inode 265013 nlinks from 0 to 1
> > would have reset inode 266805 nlinks from 0 to 1
> > would have reset inode 267317 nlinks from 0 to 1
> > would have reset inode 268853 nlinks from 0 to 1
> > would have reset inode 272437 nlinks from 0 to 1
> > would have reset inode 273205 nlinks from 0 to 1
> > would have reset inode 274229 nlinks from 0 to 1
> > would have reset inode 278325 nlinks from 0 to 1
> > would have reset inode 278837 nlinks from 0 to 1
> > would have reset inode 281397 nlinks from 0 to 1
> > would have reset inode 292661 nlinks from 0 to 1
> > would have reset inode 300853 nlinks from 0 to 1
> > would have reset inode 302901 nlinks from 0 to 1
> > would have reset inode 305205 nlinks from 0 to 1
> > would have reset inode 314165 nlinks from 0 to 1
> > would have reset inode 315189 nlinks from 0 to 1
> > would have reset inode 320309 nlinks from 0 to 1
> > would have reset inode 324917 nlinks from 0 to 1
> > would have reset inode 328245 nlinks from 0 to 1
> > would have reset inode 335925 nlinks from 0 to 1
> > would have reset inode 339253 nlinks from 0 to 1
> > would have reset inode 339765 nlinks from 0 to 1
> > would have reset inode 348213 nlinks from 0 to 1
> > would have reset inode 360501 nlinks from 0 to 1
> > would have reset inode 362037 nlinks from 0 to 1
> > would have reset inode 366389 nlinks from 0 to 1
> > would have reset inode 385845 nlinks from 0 to 1
> > would have reset inode 390709 nlinks from 0 to 1
> > would have reset inode 409141 nlinks from 0 to 1
> > would have reset inode 413237 nlinks from 0 to 1
> > would have reset inode 414773 nlinks from 0 to 1
> > would have reset inode 417845 nlinks from 0 to 1
> > would have reset inode 436021 nlinks from 0 to 1
> > would have reset inode 439349 nlinks from 0 to 1
> > would have reset inode 447029 nlinks from 0 to 1
> > would have reset inode 491317 nlinks from 0 to 1
> > would have reset inode 494133 nlinks from 0 to 1
> > would have reset inode 495413 nlinks from 0 to 1
> > would have reset inode 501301 nlinks from 0 to 1
> > would have reset inode 506421 nlinks from 0 to 1
> > would have reset inode 508469 nlinks from 0 to 1
> > would have reset inode 508981 nlinks from 0 to 1
> > would have reset inode 511797 nlinks from 0 to 1
> > would have reset inode 513077 nlinks from 0 to 1
> > would have reset inode 517941 nlinks from 0 to 1
> > would have reset inode 521013 nlinks from 0 to 1
> > would have reset inode 522805 nlinks from 0 to 1
> > would have reset inode 523317 nlinks from 0 to 1
> > would have reset inode 525621 nlinks from 0 to 1
> > would have reset inode 527925 nlinks from 0 to 1
> > would have reset inode 535605 nlinks from 0 to 1
> > would have reset inode 541749 nlinks from 0 to 1
> > would have reset inode 573493 nlinks from 0 to 1
> > would have reset inode 578613 nlinks from 0 to 1
> > would have reset inode 583029 nlinks from 0 to 1
> > would have reset inode 585525 nlinks from 0 to 1
> > would have reset inode 586293 nlinks from 0 to 1
> > would have reset inode 586805 nlinks from 0 to 1
> > would have reset inode 591413 nlinks from 0 to 1
> > would have reset inode 594485 nlinks from 0 to 1
> > would have reset inode 596277 nlinks from 0 to 1
> > would have reset inode 603189 nlinks from 0 to 1
> > would have reset inode 613429 nlinks from 0 to 1
> > would have reset inode 617781 nlinks from 0 to 1
> > would have reset inode 621877 nlinks from 0 to 1
> > would have reset inode 623925 nlinks from 0 to 1
> > would have reset inode 625205 nlinks from 0 to 1
> > would have reset inode 626741 nlinks from 0 to 1
> > would have reset inode 639541 nlinks from 0 to 1
> > would have reset inode 640053 nlinks from 0 to 1
> > would have reset inode 640565 nlinks from 0 to 1
> > would have reset inode 645173 nlinks from 0 to 1
> > would have reset inode 652853 nlinks from 0 to 1
> > would have reset inode 656181 nlinks from 0 to 1
> > would have reset inode 659253 nlinks from 0 to 1
> > would have reset inode 663605 nlinks from 0 to 1
> > would have reset inode 667445 nlinks from 0 to 1
> > would have reset inode 680757 nlinks from 0 to 1
> > would have reset inode 691253 nlinks from 0 to 1
> > would have reset inode 691765 nlinks from 0 to 1
> > would have reset inode 697653 nlinks from 0 to 1
> > would have reset inode 700469 nlinks from 0 to 1
> > would have reset inode 707893 nlinks from 0 to 1
> > would have reset inode 716853 nlinks from 0 to 1
> > would have reset inode 722229 nlinks from 0 to 1
> > would have reset inode 722741 nlinks from 0 to 1
> > would have reset inode 723765 nlinks from 0 to 1
> > would have reset inode 731957 nlinks from 0 to 1
> > would have reset inode 742965 nlinks from 0 to 1
> > would have reset inode 743477 nlinks from 0 to 1
> > would have reset inode 745781 nlinks from 0 to 1
> > would have reset inode 746293 nlinks from 0 to 1
> > would have reset inode 774453 nlinks from 0 to 1
> > would have reset inode 778805 nlinks from 0 to 1
> > would have reset inode 785013 nlinks from 0 to 1
> > would have reset inode 785973 nlinks from 0 to 1
> > would have reset inode 791349 nlinks from 0 to 1
> > would have reset inode 796981 nlinks from 0 to 1
> > would have reset inode 803381 nlinks from 0 to 1
> > would have reset inode 806965 nlinks from 0 to 1
> > would have reset inode 811798 nlinks from 0 to 1
> > would have reset inode 812310 nlinks from 0 to 1
> > would have reset inode 813078 nlinks from 0 to 1
> > would have reset inode 813607 nlinks from 0 to 1
> > would have reset inode 814183 nlinks from 0 to 1
> > would have reset inode 822069 nlinks from 0 to 1
> > would have reset inode 828469 nlinks from 0 to 1
> > would have reset inode 830005 nlinks from 0 to 1
> > would have reset inode 832053 nlinks from 0 to 1
> > would have reset inode 832565 nlinks from 0 to 1
> > would have reset inode 836661 nlinks from 0 to 1
> > would have reset inode 841013 nlinks from 0 to 1
> > would have reset inode 841525 nlinks from 0 to 1
> > would have reset inode 845365 nlinks from 0 to 1
> > would have reset inode 846133 nlinks from 0 to 1
> > would have reset inode 847157 nlinks from 0 to 1
> > would have reset inode 852533 nlinks from 0 to 1
> > would have reset inode 857141 nlinks from 0 to 1
> > would have reset inode 863271 nlinks from 0 to 1
> > would have reset inode 866855 nlinks from 0 to 1
> > would have reset inode 887861 nlinks from 0 to 1
> > would have reset inode 891701 nlinks from 0 to 1
> > would have reset inode 894773 nlinks from 0 to 1
> > would have reset inode 900149 nlinks from 0 to 1
> > would have reset inode 902197 nlinks from 0 to 1
> > would have reset inode 906293 nlinks from 0 to 1
> > would have reset inode 906805 nlinks from 0 to 1
> > would have reset inode 909877 nlinks from 0 to 1
> > would have reset inode 925493 nlinks from 0 to 1
> > would have reset inode 949543 nlinks from 0 to 1
> > would have reset inode 955175 nlinks from 0 to 1
> > would have reset inode 963623 nlinks from 0 to 1
> > would have reset inode 967733 nlinks from 0 to 1
> > would have reset inode 968231 nlinks from 0 to 1
> > would have reset inode 982069 nlinks from 0 to 1
> > would have reset inode 1007413 nlinks from 0 to 1
> > would have reset inode 1011509 nlinks from 0 to 1
> > would have reset inode 1014069 nlinks from 0 to 1
> > would have reset inode 1014581 nlinks from 0 to 1
> > would have reset inode 1022005 nlinks from 0 to 1
> > would have reset inode 1022517 nlinks from 0 to 1
> > would have reset inode 1023029 nlinks from 0 to 1
> > would have reset inode 1025333 nlinks from 0 to 1
> > would have reset inode 1043765 nlinks from 0 to 1
> > would have reset inode 1044789 nlinks from 0 to 1
> > would have reset inode 1049397 nlinks from 0 to 1
> > would have reset inode 1050933 nlinks from 0 to 1
> > would have reset inode 1051445 nlinks from 0 to 1
> > would have reset inode 1054261 nlinks from 0 to 1
> > would have reset inode 1060917 nlinks from 0 to 1
> > would have reset inode 1063477 nlinks from 0 to 1
> > would have reset inode 1076021 nlinks from 0 to 1
> > would have reset inode 1081141 nlinks from 0 to 1
> > would have reset inode 1086261 nlinks from 0 to 1
> > would have reset inode 1097269 nlinks from 0 to 1
> > would have reset inode 1099829 nlinks from 0 to 1
> > would have reset inode 1100853 nlinks from 0 to 1
> > would have reset inode 1101877 nlinks from 0 to 1
> > would have reset inode 1126709 nlinks from 0 to 1
> > would have reset inode 1134389 nlinks from 0 to 1
> > would have reset inode 1141045 nlinks from 0 to 1
> > would have reset inode 1141557 nlinks from 0 to 1
> > would have reset inode 1142581 nlinks from 0 to 1
> > would have reset inode 1148469 nlinks from 0 to 1
> > would have reset inode 1153333 nlinks from 0 to 1
> > would have reset inode 1181749 nlinks from 0 to 1
> > would have reset inode 1192245 nlinks from 0 to 1
> > would have reset inode 1198133 nlinks from 0 to 1
> > would have reset inode 1203765 nlinks from 0 to 1
> > would have reset inode 1221429 nlinks from 0 to 1
> > would have reset inode 1223989 nlinks from 0 to 1
> > would have reset inode 1235509 nlinks from 0 to 1
> > would have reset inode 1239349 nlinks from 0 to 1
> > would have reset inode 1240885 nlinks from 0 to 1
> > would have reset inode 1241397 nlinks from 0 to 1
> > would have reset inode 1241909 nlinks from 0 to 1
> > would have reset inode 1242421 nlinks from 0 to 1
> > would have reset inode 1244981 nlinks from 0 to 1
> > would have reset inode 1246517 nlinks from 0 to 1
> > would have reset inode 1253429 nlinks from 0 to 1
> > would have reset inode 1271861 nlinks from 0 to 1
> > would have reset inode 1274677 nlinks from 0 to 1
> > would have reset inode 1277749 nlinks from 0 to 1
> > would have reset inode 1278773 nlinks from 0 to 1
> > would have reset inode 1286709 nlinks from 0 to 1
> > would have reset inode 1288245 nlinks from 0 to 1
> > would have reset inode 1299765 nlinks from 0 to 1
> > would have reset inode 1302325 nlinks from 0 to 1
> > would have reset inode 1304885 nlinks from 0 to 1
> > would have reset inode 1305397 nlinks from 0 to 1
> > would have reset inode 1307509 nlinks from 0 to 1
> > would have reset inode 1309493 nlinks from 0 to 1
> > would have reset inode 1310517 nlinks from 0 to 1
> > would have reset inode 1311029 nlinks from 0 to 1
> > would have reset inode 1312053 nlinks from 0 to 1
> > would have reset inode 1316917 nlinks from 0 to 1
> > would have reset inode 1317941 nlinks from 0 to 1
> > would have reset inode 1320821 nlinks from 0 to 1
> > would have reset inode 1322805 nlinks from 0 to 1
> > would have reset inode 1332789 nlinks from 0 to 1
> > would have reset inode 1336373 nlinks from 0 to 1
> > would have reset inode 1345653 nlinks from 0 to 1
> > would have reset inode 1354549 nlinks from 0 to 1
> > would have reset inode 1361973 nlinks from 0 to 1
> > would have reset inode 1369909 nlinks from 0 to 1
> > would have reset inode 1372981 nlinks from 0 to 1
> > would have reset inode 1388853 nlinks from 0 to 1
> > would have reset inode 1402933 nlinks from 0 to 1
> > would have reset inode 1403445 nlinks from 0 to 1
> > would have reset inode 1420085 nlinks from 0 to 1
> > would have reset inode 1452853 nlinks from 0 to 1
> > would have reset inode 1456437 nlinks from 0 to 1
> > would have reset inode 1457973 nlinks from 0 to 1
> > would have reset inode 1459253 nlinks from 0 to 1
> > would have reset inode 1467957 nlinks from 0 to 1
> > would have reset inode 1471541 nlinks from 0 to 1
> > would have reset inode 1476661 nlinks from 0 to 1
> > would have reset inode 1479733 nlinks from 0 to 1
> > would have reset inode 1483061 nlinks from 0 to 1
> > would have reset inode 1484085 nlinks from 0 to 1
> > would have reset inode 1486133 nlinks from 0 to 1
> > would have reset inode 1489461 nlinks from 0 to 1
> > would have reset inode 1490037 nlinks from 0 to 1
> > would have reset inode 1492021 nlinks from 0 to 1
> > would have reset inode 1493557 nlinks from 0 to 1
> > would have reset inode 1494069 nlinks from 0 to 1
> > would have reset inode 1496885 nlinks from 0 to 1
> > would have reset inode 1498421 nlinks from 0 to 1
> > would have reset inode 1498933 nlinks from 0 to 1
> > would have reset inode 1499957 nlinks from 0 to 1
> > would have reset inode 1506101 nlinks from 0 to 1
> > would have reset inode 1507637 nlinks from 0 to 1
> > would have reset inode 1510453 nlinks from 0 to 1
> > would have reset inode 1514293 nlinks from 0 to 1
> > would have reset inode 1517365 nlinks from 0 to 1
> > would have reset inode 1520693 nlinks from 0 to 1
> > would have reset inode 1521973 nlinks from 0 to 1
> > would have reset inode 1530421 nlinks from 0 to 1
> > would have reset inode 1530933 nlinks from 0 to 1
> > would have reset inode 1537333 nlinks from 0 to 1
> > would have reset inode 1538357 nlinks from 0 to 1
> > would have reset inode 1548853 nlinks from 0 to 1
> > would have reset inode 1553973 nlinks from 0 to 1
> > would have reset inode 1557301 nlinks from 0 to 1
> > would have reset inode 1564213 nlinks from 0 to 1
> > would have reset inode 1564725 nlinks from 0 to 1
> > would have reset inode 1576501 nlinks from 0 to 1
> > would have reset inode 1580597 nlinks from 0 to 1
> > would have reset inode 1584693 nlinks from 0 to 1
> > would have reset inode 1586485 nlinks from 0 to 1
> > would have reset inode 1589301 nlinks from 0 to 1
> > would have reset inode 1589813 nlinks from 0 to 1
> > would have reset inode 1592629 nlinks from 0 to 1
> > would have reset inode 1595701 nlinks from 0 to 1
> > would have reset inode 1601077 nlinks from 0 to 1
> > would have reset inode 1623861 nlinks from 0 to 1
> > would have reset inode 1626677 nlinks from 0 to 1
> > would have reset inode 1627701 nlinks from 0 to 1
> > would have reset inode 1633333 nlinks from 0 to 1
> > would have reset inode 1639221 nlinks from 0 to 1
> > would have reset inode 1649205 nlinks from 0 to 1
> > would have reset inode 1686325 nlinks from 0 to 1
> > would have reset inode 1690677 nlinks from 0 to 1
> > would have reset inode 1693749 nlinks from 0 to 1
> > would have reset inode 1704757 nlinks from 0 to 1
> > would have reset inode 1707061 nlinks from 0 to 1
> > would have reset inode 1709109 nlinks from 0 to 1
> > would have reset inode 1719349 nlinks from 0 to 1
> > would have reset inode 1737013 nlinks from 0 to 1
> > would have reset inode 1741365 nlinks from 0 to 1
> > would have reset inode 1747509 nlinks from 0 to 1
> > would have reset inode 1770805 nlinks from 0 to 1
> > would have reset inode 1780789 nlinks from 0 to 1
> > would have reset inode 1793589 nlinks from 0 to 1
> > would have reset inode 1795125 nlinks from 0 to 1
> > would have reset inode 1800757 nlinks from 0 to 1
> > would have reset inode 1801269 nlinks from 0 to 1
> > would have reset inode 1802549 nlinks from 0 to 1
> > would have reset inode 1804085 nlinks from 0 to 1
> > would have reset inode 1817141 nlinks from 0 to 1
> > would have reset inode 1821749 nlinks from 0 to 1
> > would have reset inode 1832757 nlinks from 0 to 1
> > would have reset inode 1836341 nlinks from 0 to 1
> > would have reset inode 1856309 nlinks from 0 to 1
> > would have reset inode 1900597 nlinks from 0 to 1
> > would have reset inode 1902901 nlinks from 0 to 1
> > would have reset inode 1912373 nlinks from 0 to 1
> > would have reset inode 1943093 nlinks from 0 to 1
> > would have reset inode 1944373 nlinks from 0 to 1
> > would have reset inode 1954101 nlinks from 0 to 1
> > would have reset inode 1955893 nlinks from 0 to 1
> > would have reset inode 1961781 nlinks from 0 to 1
> > would have reset inode 1974325 nlinks from 0 to 1
> > would have reset inode 1978677 nlinks from 0 to 1
> > would have reset inode 1981237 nlinks from 0 to 1
> > would have reset inode 1992245 nlinks from 0 to 1
> > would have reset inode 2000949 nlinks from 0 to 1
> > would have reset inode 2002229 nlinks from 0 to 1
> > would have reset inode 2004789 nlinks from 0 to 1
> > would have reset inode 2005301 nlinks from 0 to 1
> > would have reset inode 2011189 nlinks from 0 to 1
> > would have reset inode 2012981 nlinks from 0 to 1
> > would have reset inode 2015285 nlinks from 0 to 1
> > would have reset inode 2018869 nlinks from 0 to 1
> > would have reset inode 2028341 nlinks from 0 to 1
> > would have reset inode 2028853 nlinks from 0 to 1
> > would have reset inode 2030901 nlinks from 0 to 1
> > would have reset inode 2032181 nlinks from 0 to 1
> > would have reset inode 2032693 nlinks from 0 to 1
> > would have reset inode 2040117 nlinks from 0 to 1
> > would have reset inode 2053685 nlinks from 0 to 1
> > would have reset inode 2083893 nlinks from 0 to 1
> > would have reset inode 2087221 nlinks from 0 to 1
> > would have reset inode 2095925 nlinks from 0 to 1
> > would have reset inode 2098741 nlinks from 0 to 1
> > would have reset inode 2100533 nlinks from 0 to 1
> > would have reset inode 2101301 nlinks from 0 to 1
> > would have reset inode 2123573 nlinks from 0 to 1
> > would have reset inode 2132789 nlinks from 0 to 1
> > would have reset inode 2133813 nlinks from 0 to 1
> >
> >
> >
> >
> >
> > 2013/4/10 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
> <mailto:yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>>
> >
> > The storage info is as following:
> > RAID-6
> > SATA HDD
> > Controller: PERC H710P Mini (Embedded)
> > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
> > 255 heads, 63 sectors/track, 3647334 cylinders
> > Units = cylinders of 16065 * 512 = 8225280 bytes
> > Sector size (logical/physical): 512 bytes / 512 bytes
> > I/O size (minimum/optimal): 512 bytes / 512 bytes
> > Disk identifier: 0x00000000
> >
> > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0
> TB/27.2
> > TiB)
> > sd 0:2:1:0: [sdb] Write Protect is off
> > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
> > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled,
> doesn't
> > support DPO or FUA
> > sd 0:2:1:0: [sdb] Attached SCSI disk
> >
> > *-storage
> > description: RAID bus controller
> > product: MegaRAID SAS 2208 [Thunderbolt]
> > vendor: LSI Logic / Symbios Logic
> > physical id: 0
> > bus info: pci@0000:02:00.0
> > logical name: scsi0
> > version: 01
> > width: 64 bits
> > clock: 33MHz
> > capabilities: storage pm pciexpress vpd msi msix bus_master
> > cap_list rom
> > configuration: driver=megaraid_sas latency=0
> > resources: irq:42 ioport:fc00(size=256)
> > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
> > memory:dc800000-dc81ffff(prefetchable)
> > *-disk:0
> > description: SCSI Disk
> > product: PERC H710P
> > vendor: DELL
> > physical id: 2.0.0
> > bus info: scsi@0:2.0.0
> > logical name: /dev/sda
> > version: 3.13
> > serial: 0049d6ce1d9f2035180096fde490f648
> > size: 558GiB (599GB)
> > capabilities: partitioned partitioned:dos
> > configuration: ansiversion=5 signature=000aa336
> > *-disk:1
> > description: SCSI Disk
> > product: PERC H710P
> > vendor: DELL
> > physical id: 2.1.0
> > bus info: scsi@0:2.1.0
> > logical name: /dev/sdb
> > logical name: /mnt/xfsd
> > version: 3.13
> > serial: 003366f71da22035180096fde490f648
> > size: 27TiB (30TB)
> > configuration: ansiversion=5 mount.fstype=xfs
> >
> mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
> > state=mounted
> >
> > Thank you.
> >
> >
> > 2013/4/10 Emmanuel Florac <eflorac@intellique.com
> <mailto:eflorac@intellique.com>
> > <mailto:eflorac@intellique.com <mailto:eflorac@intellique.com>>>
> >
> > Le Tue, 9 Apr 2013 23:10:03 +0800
> > 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
> <mailto:yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>> écrivait:
> >
> > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error
> Detected.
> > > > Shutting down filesystem
> >
> > This. I/O error detected. That means that at some point the
> > underlying
> > device (disk, RAID array, SAN volume) couldn't be reached.
> So this
> > could very well be a case of a flakey drive, array, cable
> or SCSI
> > driver.
> >
> > What's the storage setup here?
> >
> > --
> >
> ------------------------------------------------------------------------
> > Emmanuel Florac | Direction technique
> > | Intellique
> > | <eflorac@intellique.com
> <mailto:eflorac@intellique.com>
> > <mailto:eflorac@intellique.com
> <mailto:eflorac@intellique.com>>>
> > | +33 1 78 94 84 02
> <tel:%2B33%201%2078%2094%2084%2002>
> >
> ------------------------------------------------------------------------
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
>
>
>
>
> --
> 符永涛
>
>
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 12:50 ` Brian Foster
@ 2013-04-12 13:42 ` 符永涛
2013-04-12 13:48 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 13:42 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 31410 bytes --]
The XFS shutdown error is always the same. (It has happened about 20 times
across roughly 50 servers during the last half year.)
Recently the shutdowns have occurred on a cluster of 24 servers (distribute
8 * replica 3) during rebalance. The average workload of this cluster is
more than 3 TB of growth per day.
The workload includes normal fops, rsync jobs, video encoding/decoding,
logging, etc., through the glusterfs native clients of hundreds of machines.
The shutdowns tend to happen when we run a rebalance on the glusterfs
cluster, which I guess triggers a lot of unlink operations?
Thank you very much. Maybe I can try to collect more logs with a modified
kernel package.
2013/4/12 Brian Foster <bfoster@redhat.com>
> On 04/11/2013 08:45 PM, 符永涛 wrote:
> > the workload is about:
> > 24 servers, replica(3) which means the distribute is 8
> > load is about 3(TB)-8(TB) per day.
> >
>
> This describes your cluster, but not the workload (though cluster info
> is good too). What kind of workload is running on your clients (e.g.,
> rsync jobs)? Are you running through native gluster mount points,
> NFS mounts, or a mix? Do you have any gluster internal operations running
> (e.g., rebalance)?
>
> Is there any kind of pattern you can discern from the workload and when
> the XFS error happens to occur? You have a good number of servers in
> play here, is there any kind of pattern in terms of which servers
> experience the error? Is it always the same servers or a random set?
>
> Brian
>
> >
> > 2013/4/12 Brian Foster <bfoster@redhat.com <mailto:bfoster@redhat.com>>
> >
> > On 04/11/2013 03:11 PM, 符永涛 wrote:
> > > It happens tonight again on one of our servers, how to debug the
> root
> > > cause? Thank you.
> > >
> >
> > Hi,
> >
> > I've attached a system tap script (stap -v xfs.stp) that should
> > hopefully print out a bit more data should the issue happen again. Do
> > you have a small enough number of nodes (or predictable enough
> pattern)
> > that you could run this on the nodes that tend to fail and collect
> the
> > output?
> >
> > Also, could you collect an xfs_metadump of the filesystem in question
> > and make it available for download and analysis somewhere? I believe
> the
> > ideal approach is to mount/umount the filesystem first to replay the
> log
> > before collecting a metadump, but somebody could correct me on that
> (to
> > be safe, you could collect multiple dumps: pre-mount and post-mount).
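[Editorial note: the metadump procedure described above could be sketched as
follows. The device and paths are placeholders matching this thread's setup;
the mount/umount cycle is what replays the log, and xfs_metadump captures
metadata only, with filenames obfuscated by default.]

```
# Optional pre-mount dump (log not yet replayed):
xfs_metadump -g /dev/sdb /tmp/sdb.pre-mount.metadump

# Cycle a mount to replay the XFS log, then capture the post-mount dump:
mount /dev/sdb /mnt/xfsd
umount /mnt/xfsd
xfs_metadump -g /dev/sdb /tmp/sdb.metadump   # -g shows progress

# Compress before making it available for download:
gzip /tmp/sdb.metadump
```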
> >
> > Could you also describe your workload a little bit? Thanks.
> >
> > Brian
> >
> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
> > > xfs_inotobp() returned error 22.
> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
> > returned
> > > error 22
> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
> > > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return
> address =
> > > 0xffffffffa02ee20a
> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected.
> Shutting
> > > down filesystem
> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the
> > filesystem and
> > > rectify the problem(s)
> > > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5
> > returned.
> > >
> > > xfs_repair -n
> > >
> > >
> > > Phase 7 - verify link counts...
> > > would have reset inode 20021 nlinks from 0 to 1
> > > would have reset inode 20789 nlinks from 0 to 1
> > > [... ~360 further "would have reset inode N nlinks from 0 to 1" lines
> > > trimmed; the full xfs_repair -n output is quoted earlier in the thread ...]
> > >
> > >
> > >
> > >
> > >
> > > 2013/4/10 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>
> > <mailto:yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>>
> > >
> > > [... quoted RAID/controller details, Emmanuel Florac's reply, and list
> > > footers trimmed; quoted in full earlier in the thread ...]
> >
>
>
--
符永涛
[-- Attachment #1.2: Type: text/html, Size: 44838 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 13:42 ` 符永涛
@ 2013-04-12 13:48 ` 符永涛
2013-04-12 13:51 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 13:48 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 32216 bytes --]
Also, I'm not sure what kind of information is needed to debug this issue.
Thank you.
2013/4/12 符永涛 <yongtaofu@gmail.com>
> The XFS shutdown error is always the same. (It has happened about 20 times
> across about 50 servers during the last half year.)
> Recently the shutdowns happen on a cluster of 24 servers (distribute 8 x
> replica 3) during rebalance. The average workload of this cluster is more
> than 3 TB of growth per day.
> The workload includes normal fops, rsync jobs, video encoding/decoding,
> logging, etc., through the glusterfs native client from hundreds of
> machines.
> The shutdowns tend to happen when we run a rebalance of the glusterfs
> cluster, which I guess triggers a lot of unlink operations.
>
> Thank you very much. Maybe I can try to collect more logs with a modified
> kernel package.
>
>
>
> 2013/4/12 Brian Foster <bfoster@redhat.com>
>
>> On 04/11/2013 08:45 PM, 符永涛 wrote:
>> > the workload is about:
>> > 24 servers, replica(3) which means the distribute is 8
>> > load is about 3(TB)-8(TB) per day.
>> >
>>
>> This describes your cluster, but not the workload (though cluster info
>> is good too). What kind of workload is running on your clients (i.e.,
>> rsync jobs, etc.)? Are you running through native gluster mount points,
>> NFS mounts or a mix? Do you have any gluster internal operations running
>> (i.e., rebalance, etc.).
>>
>> Is there any kind of pattern you can discern from the workload and when
>> the XFS error happens to occur? You have a good number of servers in
>> play here, is there any kind of pattern in terms of which servers
>> experience the error? Is it always the same servers or a random set?
>>
>> Brian
>>
>> >
>> > 2013/4/12 Brian Foster <bfoster@redhat.com>
>> >
>> > On 04/11/2013 03:11 PM, 符永涛 wrote:
>> > > It happened again tonight on one of our servers; how can we debug
>> > > the root cause? Thank you.
>> > >
>> >
>> > Hi,
>> >
>> > I've attached a SystemTap script (stap -v xfs.stp) that should
>> > hopefully print out a bit more data should the issue happen again.
>> > Do you have a small enough number of nodes (or predictable enough
>> > pattern) that you could run this on the nodes that tend to fail and
>> > collect the output?
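[Editor's note: the xfs.stp attachment itself is not preserved in this
archive. As a rough illustration of the kind of probe such a script could
contain (the probe point, guard, and output format below are assumptions,
not the actual attachment), one could write the script to a file and then
run it with stap:]

```shell
# Hypothetical SystemTap probe; NOT the actual xfs.stp from this thread.
# It fires when xfs_iunlink_remove() returns nonzero on an XFS module
# built with debuginfo, printing the process name and the error code.
cat > xfs-trace.stp <<'EOF'
probe module("xfs").function("xfs_iunlink_remove").return {
    if ($return != 0)
        printf("%s: xfs_iunlink_remove returned %d\n", execname(), $return)
}
EOF
# On an affected node (needs systemtap and kernel debuginfo installed):
echo "stap -v xfs-trace.stp"
```

Catching the error at its source like this shows which process triggered
the failing unlink, which the dmesg output alone does not.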
>> >
>> > Also, could you collect an xfs_metadump of the filesystem in question
>> > and make it available for download and analysis somewhere? I believe
>> > the ideal approach is to mount/umount the filesystem first to replay
>> > the log before collecting a metadump, but somebody could correct me on
>> > that (to be safe, you could collect multiple dumps: pre-mount and
>> > post-mount).
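[Editor's note: the collection steps described above can be sketched as a
dry-run script. Device and mountpoint are taken from the thread's setup;
the commands are only printed, so nothing touches a real disk:]

```shell
# Dry-run outline of pre-mount / post-mount metadump collection.
# Remove the echo layer to actually run it on the affected server.
DEV=/dev/sdb
MNT=/mnt/xfsd
{
  echo "xfs_metadump -g $DEV /tmp/sdb-premount.metadump"   # dump with dirty log
  echo "mount $DEV $MNT && umount $MNT"                    # mount/umount replays the log
  echo "xfs_metadump -g $DEV /tmp/sdb-postmount.metadump"  # dump after replay
} | tee metadump-plan.txt
```

Keeping both dumps preserves the dirty-log state as well as the state
after log replay, which is why collecting pre-mount first matters.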
>> >
>> > Could you also describe your workload a little bit? Thanks.
>> >
>> > Brian
>> >
>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> > > xfs_inotobp() returned error 22.
>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> > > returned error 22
>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
>> > > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
>> > > 0xffffffffa02ee20a
>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
>> > > down filesystem
>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem
>> > > and rectify the problem(s)
>> > > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>> > >
>> > > xfs_repair -n
>> > >
>> > >
>> > > Phase 7 - verify link counts...
>> > > would have reset inode 20021 nlinks from 0 to 1
>> > > would have reset inode 20789 nlinks from 0 to 1
>> > > would have reset inode 35125 nlinks from 0 to 1
>> > > would have reset inode 35637 nlinks from 0 to 1
>> > > would have reset inode 36149 nlinks from 0 to 1
>> > > would have reset inode 38197 nlinks from 0 to 1
>> > > would have reset inode 39477 nlinks from 0 to 1
>> > > would have reset inode 54069 nlinks from 0 to 1
>> > > would have reset inode 62261 nlinks from 0 to 1
>> > > would have reset inode 63029 nlinks from 0 to 1
>> > > would have reset inode 72501 nlinks from 0 to 1
>> > > would have reset inode 79925 nlinks from 0 to 1
>> > > would have reset inode 81205 nlinks from 0 to 1
>> > > would have reset inode 84789 nlinks from 0 to 1
>> > > would have reset inode 87861 nlinks from 0 to 1
>> > > would have reset inode 90663 nlinks from 0 to 1
>> > > would have reset inode 91189 nlinks from 0 to 1
>> > > would have reset inode 95541 nlinks from 0 to 1
>> > > would have reset inode 98101 nlinks from 0 to 1
>> > > would have reset inode 101173 nlinks from 0 to 1
>> > > would have reset inode 113205 nlinks from 0 to 1
>> > > would have reset inode 114741 nlinks from 0 to 1
>> > > would have reset inode 126261 nlinks from 0 to 1
>> > > would have reset inode 140597 nlinks from 0 to 1
>> > > would have reset inode 144693 nlinks from 0 to 1
>> > > would have reset inode 147765 nlinks from 0 to 1
>> > > would have reset inode 152885 nlinks from 0 to 1
>> > > would have reset inode 161333 nlinks from 0 to 1
>> > > would have reset inode 161845 nlinks from 0 to 1
>> > > would have reset inode 167477 nlinks from 0 to 1
>> > > would have reset inode 172341 nlinks from 0 to 1
>> > > would have reset inode 191797 nlinks from 0 to 1
>> > > would have reset inode 204853 nlinks from 0 to 1
>> > > would have reset inode 205365 nlinks from 0 to 1
>> > > would have reset inode 215349 nlinks from 0 to 1
>> > > would have reset inode 215861 nlinks from 0 to 1
>> > > would have reset inode 216373 nlinks from 0 to 1
>> > > would have reset inode 217397 nlinks from 0 to 1
>> > > would have reset inode 224309 nlinks from 0 to 1
>> > > would have reset inode 225589 nlinks from 0 to 1
>> > > would have reset inode 234549 nlinks from 0 to 1
>> > > would have reset inode 234805 nlinks from 0 to 1
>> > > would have reset inode 249653 nlinks from 0 to 1
>> > > would have reset inode 250677 nlinks from 0 to 1
>> > > would have reset inode 252469 nlinks from 0 to 1
>> > > would have reset inode 261429 nlinks from 0 to 1
>> > > would have reset inode 265013 nlinks from 0 to 1
>> > > would have reset inode 266805 nlinks from 0 to 1
>> > > would have reset inode 267317 nlinks from 0 to 1
>> > > would have reset inode 268853 nlinks from 0 to 1
>> > > would have reset inode 272437 nlinks from 0 to 1
>> > > would have reset inode 273205 nlinks from 0 to 1
>> > > would have reset inode 274229 nlinks from 0 to 1
>> > > would have reset inode 278325 nlinks from 0 to 1
>> > > would have reset inode 278837 nlinks from 0 to 1
>> > > would have reset inode 281397 nlinks from 0 to 1
>> > > would have reset inode 292661 nlinks from 0 to 1
>> > > would have reset inode 300853 nlinks from 0 to 1
>> > > would have reset inode 302901 nlinks from 0 to 1
>> > > would have reset inode 305205 nlinks from 0 to 1
>> > > would have reset inode 314165 nlinks from 0 to 1
>> > > would have reset inode 315189 nlinks from 0 to 1
>> > > would have reset inode 320309 nlinks from 0 to 1
>> > > would have reset inode 324917 nlinks from 0 to 1
>> > > would have reset inode 328245 nlinks from 0 to 1
>> > > would have reset inode 335925 nlinks from 0 to 1
>> > > would have reset inode 339253 nlinks from 0 to 1
>> > > would have reset inode 339765 nlinks from 0 to 1
>> > > would have reset inode 348213 nlinks from 0 to 1
>> > > would have reset inode 360501 nlinks from 0 to 1
>> > > would have reset inode 362037 nlinks from 0 to 1
>> > > would have reset inode 366389 nlinks from 0 to 1
>> > > would have reset inode 385845 nlinks from 0 to 1
>> > > would have reset inode 390709 nlinks from 0 to 1
>> > > would have reset inode 409141 nlinks from 0 to 1
>> > > would have reset inode 413237 nlinks from 0 to 1
>> > > would have reset inode 414773 nlinks from 0 to 1
>> > > would have reset inode 417845 nlinks from 0 to 1
>> > > would have reset inode 436021 nlinks from 0 to 1
>> > > would have reset inode 439349 nlinks from 0 to 1
>> > > would have reset inode 447029 nlinks from 0 to 1
>> > > would have reset inode 491317 nlinks from 0 to 1
>> > > would have reset inode 494133 nlinks from 0 to 1
>> > > would have reset inode 495413 nlinks from 0 to 1
>> > > would have reset inode 501301 nlinks from 0 to 1
>> > > would have reset inode 506421 nlinks from 0 to 1
>> > > would have reset inode 508469 nlinks from 0 to 1
>> > > would have reset inode 508981 nlinks from 0 to 1
>> > > would have reset inode 511797 nlinks from 0 to 1
>> > > would have reset inode 513077 nlinks from 0 to 1
>> > > would have reset inode 517941 nlinks from 0 to 1
>> > > would have reset inode 521013 nlinks from 0 to 1
>> > > would have reset inode 522805 nlinks from 0 to 1
>> > > would have reset inode 523317 nlinks from 0 to 1
>> > > would have reset inode 525621 nlinks from 0 to 1
>> > > would have reset inode 527925 nlinks from 0 to 1
>> > > would have reset inode 535605 nlinks from 0 to 1
>> > > would have reset inode 541749 nlinks from 0 to 1
>> > > would have reset inode 573493 nlinks from 0 to 1
>> > > would have reset inode 578613 nlinks from 0 to 1
>> > > would have reset inode 583029 nlinks from 0 to 1
>> > > would have reset inode 585525 nlinks from 0 to 1
>> > > would have reset inode 586293 nlinks from 0 to 1
>> > > would have reset inode 586805 nlinks from 0 to 1
>> > > would have reset inode 591413 nlinks from 0 to 1
>> > > would have reset inode 594485 nlinks from 0 to 1
>> > > would have reset inode 596277 nlinks from 0 to 1
>> > > would have reset inode 603189 nlinks from 0 to 1
>> > > would have reset inode 613429 nlinks from 0 to 1
>> > > would have reset inode 617781 nlinks from 0 to 1
>> > > would have reset inode 621877 nlinks from 0 to 1
>> > > would have reset inode 623925 nlinks from 0 to 1
>> > > would have reset inode 625205 nlinks from 0 to 1
>> > > would have reset inode 626741 nlinks from 0 to 1
>> > > would have reset inode 639541 nlinks from 0 to 1
>> > > would have reset inode 640053 nlinks from 0 to 1
>> > > would have reset inode 640565 nlinks from 0 to 1
>> > > would have reset inode 645173 nlinks from 0 to 1
>> > > would have reset inode 652853 nlinks from 0 to 1
>> > > would have reset inode 656181 nlinks from 0 to 1
>> > > would have reset inode 659253 nlinks from 0 to 1
>> > > would have reset inode 663605 nlinks from 0 to 1
>> > > would have reset inode 667445 nlinks from 0 to 1
>> > > would have reset inode 680757 nlinks from 0 to 1
>> > > would have reset inode 691253 nlinks from 0 to 1
>> > > would have reset inode 691765 nlinks from 0 to 1
>> > > would have reset inode 697653 nlinks from 0 to 1
>> > > would have reset inode 700469 nlinks from 0 to 1
>> > > would have reset inode 707893 nlinks from 0 to 1
>> > > would have reset inode 716853 nlinks from 0 to 1
>> > > would have reset inode 722229 nlinks from 0 to 1
>> > > would have reset inode 722741 nlinks from 0 to 1
>> > > would have reset inode 723765 nlinks from 0 to 1
>> > > would have reset inode 731957 nlinks from 0 to 1
>> > > would have reset inode 742965 nlinks from 0 to 1
>> > > would have reset inode 743477 nlinks from 0 to 1
>> > > would have reset inode 745781 nlinks from 0 to 1
>> > > would have reset inode 746293 nlinks from 0 to 1
>> > > would have reset inode 774453 nlinks from 0 to 1
>> > > would have reset inode 778805 nlinks from 0 to 1
>> > > would have reset inode 785013 nlinks from 0 to 1
>> > > would have reset inode 785973 nlinks from 0 to 1
>> > > would have reset inode 791349 nlinks from 0 to 1
>> > > would have reset inode 796981 nlinks from 0 to 1
>> > > would have reset inode 803381 nlinks from 0 to 1
>> > > would have reset inode 806965 nlinks from 0 to 1
>> > > would have reset inode 811798 nlinks from 0 to 1
>> > > would have reset inode 812310 nlinks from 0 to 1
>> > > would have reset inode 813078 nlinks from 0 to 1
>> > > would have reset inode 813607 nlinks from 0 to 1
>> > > would have reset inode 814183 nlinks from 0 to 1
>> > > would have reset inode 822069 nlinks from 0 to 1
>> > > would have reset inode 828469 nlinks from 0 to 1
>> > > would have reset inode 830005 nlinks from 0 to 1
>> > > would have reset inode 832053 nlinks from 0 to 1
>> > > would have reset inode 832565 nlinks from 0 to 1
>> > > would have reset inode 836661 nlinks from 0 to 1
>> > > would have reset inode 841013 nlinks from 0 to 1
>> > > would have reset inode 841525 nlinks from 0 to 1
>> > > would have reset inode 845365 nlinks from 0 to 1
>> > > would have reset inode 846133 nlinks from 0 to 1
>> > > would have reset inode 847157 nlinks from 0 to 1
>> > > would have reset inode 852533 nlinks from 0 to 1
>> > > would have reset inode 857141 nlinks from 0 to 1
>> > > would have reset inode 863271 nlinks from 0 to 1
>> > > would have reset inode 866855 nlinks from 0 to 1
>> > > would have reset inode 887861 nlinks from 0 to 1
>> > > would have reset inode 891701 nlinks from 0 to 1
>> > > would have reset inode 894773 nlinks from 0 to 1
>> > > would have reset inode 900149 nlinks from 0 to 1
>> > > would have reset inode 902197 nlinks from 0 to 1
>> > > would have reset inode 906293 nlinks from 0 to 1
>> > > would have reset inode 906805 nlinks from 0 to 1
>> > > would have reset inode 909877 nlinks from 0 to 1
>> > > would have reset inode 925493 nlinks from 0 to 1
>> > > would have reset inode 949543 nlinks from 0 to 1
>> > > would have reset inode 955175 nlinks from 0 to 1
>> > > would have reset inode 963623 nlinks from 0 to 1
>> > > would have reset inode 967733 nlinks from 0 to 1
>> > > would have reset inode 968231 nlinks from 0 to 1
>> > > would have reset inode 982069 nlinks from 0 to 1
>> > > would have reset inode 1007413 nlinks from 0 to 1
>> > > would have reset inode 1011509 nlinks from 0 to 1
>> > > would have reset inode 1014069 nlinks from 0 to 1
>> > > would have reset inode 1014581 nlinks from 0 to 1
>> > > would have reset inode 1022005 nlinks from 0 to 1
>> > > would have reset inode 1022517 nlinks from 0 to 1
>> > > would have reset inode 1023029 nlinks from 0 to 1
>> > > would have reset inode 1025333 nlinks from 0 to 1
>> > > would have reset inode 1043765 nlinks from 0 to 1
>> > > would have reset inode 1044789 nlinks from 0 to 1
>> > > would have reset inode 1049397 nlinks from 0 to 1
>> > > would have reset inode 1050933 nlinks from 0 to 1
>> > > would have reset inode 1051445 nlinks from 0 to 1
>> > > would have reset inode 1054261 nlinks from 0 to 1
>> > > would have reset inode 1060917 nlinks from 0 to 1
>> > > would have reset inode 1063477 nlinks from 0 to 1
>> > > would have reset inode 1076021 nlinks from 0 to 1
>> > > would have reset inode 1081141 nlinks from 0 to 1
>> > > would have reset inode 1086261 nlinks from 0 to 1
>> > > would have reset inode 1097269 nlinks from 0 to 1
>> > > would have reset inode 1099829 nlinks from 0 to 1
>> > > would have reset inode 1100853 nlinks from 0 to 1
>> > > would have reset inode 1101877 nlinks from 0 to 1
>> > > would have reset inode 1126709 nlinks from 0 to 1
>> > > would have reset inode 1134389 nlinks from 0 to 1
>> > > would have reset inode 1141045 nlinks from 0 to 1
>> > > would have reset inode 1141557 nlinks from 0 to 1
>> > > would have reset inode 1142581 nlinks from 0 to 1
>> > > would have reset inode 1148469 nlinks from 0 to 1
>> > > would have reset inode 1153333 nlinks from 0 to 1
>> > > would have reset inode 1181749 nlinks from 0 to 1
>> > > would have reset inode 1192245 nlinks from 0 to 1
>> > > would have reset inode 1198133 nlinks from 0 to 1
>> > > would have reset inode 1203765 nlinks from 0 to 1
>> > > would have reset inode 1221429 nlinks from 0 to 1
>> > > would have reset inode 1223989 nlinks from 0 to 1
>> > > would have reset inode 1235509 nlinks from 0 to 1
>> > > would have reset inode 1239349 nlinks from 0 to 1
>> > > would have reset inode 1240885 nlinks from 0 to 1
>> > > would have reset inode 1241397 nlinks from 0 to 1
>> > > would have reset inode 1241909 nlinks from 0 to 1
>> > > would have reset inode 1242421 nlinks from 0 to 1
>> > > would have reset inode 1244981 nlinks from 0 to 1
>> > > would have reset inode 1246517 nlinks from 0 to 1
>> > > would have reset inode 1253429 nlinks from 0 to 1
>> > > would have reset inode 1271861 nlinks from 0 to 1
>> > > would have reset inode 1274677 nlinks from 0 to 1
>> > > would have reset inode 1277749 nlinks from 0 to 1
>> > > would have reset inode 1278773 nlinks from 0 to 1
>> > > would have reset inode 1286709 nlinks from 0 to 1
>> > > would have reset inode 1288245 nlinks from 0 to 1
>> > > would have reset inode 1299765 nlinks from 0 to 1
>> > > would have reset inode 1302325 nlinks from 0 to 1
>> > > would have reset inode 1304885 nlinks from 0 to 1
>> > > would have reset inode 1305397 nlinks from 0 to 1
>> > > would have reset inode 1307509 nlinks from 0 to 1
>> > > would have reset inode 1309493 nlinks from 0 to 1
>> > > would have reset inode 1310517 nlinks from 0 to 1
>> > > would have reset inode 1311029 nlinks from 0 to 1
>> > > would have reset inode 1312053 nlinks from 0 to 1
>> > > would have reset inode 1316917 nlinks from 0 to 1
>> > > would have reset inode 1317941 nlinks from 0 to 1
>> > > would have reset inode 1320821 nlinks from 0 to 1
>> > > would have reset inode 1322805 nlinks from 0 to 1
>> > > would have reset inode 1332789 nlinks from 0 to 1
>> > > would have reset inode 1336373 nlinks from 0 to 1
>> > > would have reset inode 1345653 nlinks from 0 to 1
>> > > would have reset inode 1354549 nlinks from 0 to 1
>> > > would have reset inode 1361973 nlinks from 0 to 1
>> > > would have reset inode 1369909 nlinks from 0 to 1
>> > > would have reset inode 1372981 nlinks from 0 to 1
>> > > would have reset inode 1388853 nlinks from 0 to 1
>> > > would have reset inode 1402933 nlinks from 0 to 1
>> > > would have reset inode 1403445 nlinks from 0 to 1
>> > > would have reset inode 1420085 nlinks from 0 to 1
>> > > would have reset inode 1452853 nlinks from 0 to 1
>> > > would have reset inode 1456437 nlinks from 0 to 1
>> > > would have reset inode 1457973 nlinks from 0 to 1
>> > > would have reset inode 1459253 nlinks from 0 to 1
>> > > would have reset inode 1467957 nlinks from 0 to 1
>> > > would have reset inode 1471541 nlinks from 0 to 1
>> > > would have reset inode 1476661 nlinks from 0 to 1
>> > > would have reset inode 1479733 nlinks from 0 to 1
>> > > would have reset inode 1483061 nlinks from 0 to 1
>> > > would have reset inode 1484085 nlinks from 0 to 1
>> > > would have reset inode 1486133 nlinks from 0 to 1
>> > > would have reset inode 1489461 nlinks from 0 to 1
>> > > would have reset inode 1490037 nlinks from 0 to 1
>> > > would have reset inode 1492021 nlinks from 0 to 1
>> > > would have reset inode 1493557 nlinks from 0 to 1
>> > > would have reset inode 1494069 nlinks from 0 to 1
>> > > would have reset inode 1496885 nlinks from 0 to 1
>> > > would have reset inode 1498421 nlinks from 0 to 1
>> > > would have reset inode 1498933 nlinks from 0 to 1
>> > > would have reset inode 1499957 nlinks from 0 to 1
>> > > would have reset inode 1506101 nlinks from 0 to 1
>> > > would have reset inode 1507637 nlinks from 0 to 1
>> > > would have reset inode 1510453 nlinks from 0 to 1
>> > > would have reset inode 1514293 nlinks from 0 to 1
>> > > would have reset inode 1517365 nlinks from 0 to 1
>> > > would have reset inode 1520693 nlinks from 0 to 1
>> > > would have reset inode 1521973 nlinks from 0 to 1
>> > > would have reset inode 1530421 nlinks from 0 to 1
>> > > would have reset inode 1530933 nlinks from 0 to 1
>> > > would have reset inode 1537333 nlinks from 0 to 1
>> > > would have reset inode 1538357 nlinks from 0 to 1
>> > > would have reset inode 1548853 nlinks from 0 to 1
>> > > would have reset inode 1553973 nlinks from 0 to 1
>> > > would have reset inode 1557301 nlinks from 0 to 1
>> > > would have reset inode 1564213 nlinks from 0 to 1
>> > > would have reset inode 1564725 nlinks from 0 to 1
>> > > would have reset inode 1576501 nlinks from 0 to 1
>> > > would have reset inode 1580597 nlinks from 0 to 1
>> > > would have reset inode 1584693 nlinks from 0 to 1
>> > > would have reset inode 1586485 nlinks from 0 to 1
>> > > would have reset inode 1589301 nlinks from 0 to 1
>> > > would have reset inode 1589813 nlinks from 0 to 1
>> > > would have reset inode 1592629 nlinks from 0 to 1
>> > > would have reset inode 1595701 nlinks from 0 to 1
>> > > would have reset inode 1601077 nlinks from 0 to 1
>> > > would have reset inode 1623861 nlinks from 0 to 1
>> > > would have reset inode 1626677 nlinks from 0 to 1
>> > > would have reset inode 1627701 nlinks from 0 to 1
>> > > would have reset inode 1633333 nlinks from 0 to 1
>> > > would have reset inode 1639221 nlinks from 0 to 1
>> > > would have reset inode 1649205 nlinks from 0 to 1
>> > > would have reset inode 1686325 nlinks from 0 to 1
>> > > would have reset inode 1690677 nlinks from 0 to 1
>> > > would have reset inode 1693749 nlinks from 0 to 1
>> > > would have reset inode 1704757 nlinks from 0 to 1
>> > > would have reset inode 1707061 nlinks from 0 to 1
>> > > would have reset inode 1709109 nlinks from 0 to 1
>> > > would have reset inode 1719349 nlinks from 0 to 1
>> > > would have reset inode 1737013 nlinks from 0 to 1
>> > > would have reset inode 1741365 nlinks from 0 to 1
>> > > would have reset inode 1747509 nlinks from 0 to 1
>> > > would have reset inode 1770805 nlinks from 0 to 1
>> > > would have reset inode 1780789 nlinks from 0 to 1
>> > > would have reset inode 1793589 nlinks from 0 to 1
>> > > would have reset inode 1795125 nlinks from 0 to 1
>> > > would have reset inode 1800757 nlinks from 0 to 1
>> > > would have reset inode 1801269 nlinks from 0 to 1
>> > > would have reset inode 1802549 nlinks from 0 to 1
>> > > would have reset inode 1804085 nlinks from 0 to 1
>> > > would have reset inode 1817141 nlinks from 0 to 1
>> > > would have reset inode 1821749 nlinks from 0 to 1
>> > > would have reset inode 1832757 nlinks from 0 to 1
>> > > would have reset inode 1836341 nlinks from 0 to 1
>> > > would have reset inode 1856309 nlinks from 0 to 1
>> > > would have reset inode 1900597 nlinks from 0 to 1
>> > > would have reset inode 1902901 nlinks from 0 to 1
>> > > would have reset inode 1912373 nlinks from 0 to 1
>> > > would have reset inode 1943093 nlinks from 0 to 1
>> > > would have reset inode 1944373 nlinks from 0 to 1
>> > > would have reset inode 1954101 nlinks from 0 to 1
>> > > would have reset inode 1955893 nlinks from 0 to 1
>> > > would have reset inode 1961781 nlinks from 0 to 1
>> > > would have reset inode 1974325 nlinks from 0 to 1
>> > > would have reset inode 1978677 nlinks from 0 to 1
>> > > would have reset inode 1981237 nlinks from 0 to 1
>> > > would have reset inode 1992245 nlinks from 0 to 1
>> > > would have reset inode 2000949 nlinks from 0 to 1
>> > > would have reset inode 2002229 nlinks from 0 to 1
>> > > would have reset inode 2004789 nlinks from 0 to 1
>> > > would have reset inode 2005301 nlinks from 0 to 1
>> > > would have reset inode 2011189 nlinks from 0 to 1
>> > > would have reset inode 2012981 nlinks from 0 to 1
>> > > would have reset inode 2015285 nlinks from 0 to 1
>> > > would have reset inode 2018869 nlinks from 0 to 1
>> > > would have reset inode 2028341 nlinks from 0 to 1
>> > > would have reset inode 2028853 nlinks from 0 to 1
>> > > would have reset inode 2030901 nlinks from 0 to 1
>> > > would have reset inode 2032181 nlinks from 0 to 1
>> > > would have reset inode 2032693 nlinks from 0 to 1
>> > > would have reset inode 2040117 nlinks from 0 to 1
>> > > would have reset inode 2053685 nlinks from 0 to 1
>> > > would have reset inode 2083893 nlinks from 0 to 1
>> > > would have reset inode 2087221 nlinks from 0 to 1
>> > > would have reset inode 2095925 nlinks from 0 to 1
>> > > would have reset inode 2098741 nlinks from 0 to 1
>> > > would have reset inode 2100533 nlinks from 0 to 1
>> > > would have reset inode 2101301 nlinks from 0 to 1
>> > > would have reset inode 2123573 nlinks from 0 to 1
>> > > would have reset inode 2132789 nlinks from 0 to 1
>> > > would have reset inode 2133813 nlinks from 0 to 1
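[Editor's note: since every Phase 7 line above has the same shape, the
output is easy to summarize with standard tools. This is a generic
one-liner, not something used in the thread; the sample file below stands
in for the real xfs_repair output:]

```shell
# Build a 3-line stand-in for the xfs_repair -n output, then summarize it.
printf '%s\n' \
  'would have reset inode 20021 nlinks from 0 to 1' \
  'would have reset inode 20789 nlinks from 0 to 1' \
  'would have reset inode 35125 nlinks from 0 to 1' > repair.log
grep -c 'would have reset inode' repair.log   # number of affected inodes
awk '{print $5}' repair.log                   # just the inode numbers
```

A count and a list of inode numbers is usually what upstream asks for
when deciding whether the damage follows a pattern.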
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > 2013/4/10 符永涛 <yongtaofu@gmail.com>
>> > >
>> > > The storage info is as following:
>> > > RAID-6
>> > > SATA HDD
>> > > Controller: PERC H710P Mini (Embedded)
>> > > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
>> > > 255 heads, 63 sectors/track, 3647334 cylinders
>> > > Units = cylinders of 16065 * 512 = 8225280 bytes
>> > > Sector size (logical/physical): 512 bytes / 512 bytes
>> > > I/O size (minimum/optimal): 512 bytes / 512 bytes
>> > > Disk identifier: 0x00000000
>> > >
>> > > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
>> > > sd 0:2:1:0: [sdb] Write Protect is off
>> > > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
>> > > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
>> > > support DPO or FUA
>> > > sd 0:2:1:0: [sdb] Attached SCSI disk
>> > >
>> > > *-storage
>> > > description: RAID bus controller
>> > > product: MegaRAID SAS 2208 [Thunderbolt]
>> > > vendor: LSI Logic / Symbios Logic
>> > > physical id: 0
>> > > bus info: pci@0000:02:00.0
>> > > logical name: scsi0
>> > > version: 01
>> > > width: 64 bits
>> > > clock: 33MHz
>> > > capabilities: storage pm pciexpress vpd msi msix bus_master cap_list rom
>> > > configuration: driver=megaraid_sas latency=0
>> > > resources: irq:42 ioport:fc00(size=256)
>> > > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
>> > > memory:dc800000-dc81ffff(prefetchable)
>> > > *-disk:0
>> > > description: SCSI Disk
>> > > product: PERC H710P
>> > > vendor: DELL
>> > > physical id: 2.0.0
>> > > bus info: scsi@0:2.0.0
>> > > logical name: /dev/sda
>> > > version: 3.13
>> > > serial: 0049d6ce1d9f2035180096fde490f648
>> > > size: 558GiB (599GB)
>> > > capabilities: partitioned partitioned:dos
>> > > configuration: ansiversion=5 signature=000aa336
>> > > *-disk:1
>> > > description: SCSI Disk
>> > > product: PERC H710P
>> > > vendor: DELL
>> > > physical id: 2.1.0
>> > > bus info: scsi@0:2.1.0
>> > > logical name: /dev/sdb
>> > > logical name: /mnt/xfsd
>> > > version: 3.13
>> > > serial: 003366f71da22035180096fde490f648
>> > > size: 27TiB (30TB)
>> > > configuration: ansiversion=5 mount.fstype=xfs
>> > > mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
>> > > state=mounted
>> > >
>> > > Thank you.
>> > >
>> > >
>> > > 2013/4/10 Emmanuel Florac <eflorac@intellique.com>
>> > >
>> > > On Tue, 9 Apr 2013 23:10:03 +0800
>> > > 符永涛 <yongtaofu@gmail.com> wrote:
>> > >
>> > > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
>> > > > > Shutting down filesystem
>> > >
>> > > This. I/O error detected. That means that at some point the underlying
>> > > device (disk, RAID array, SAN volume) couldn't be reached. So this
>> > > could very well be a case of a flakey drive, array, cable or SCSI
>> > > driver.
>> > >
>> > > What's the storage setup here?
>> > >
>> > > --
>> > > ------------------------------------------------------------------------
>> > > Emmanuel Florac | Direction technique
>> > >                 | Intellique
>> > >                 | <eflorac@intellique.com>
>> > >                 | +33 1 78 94 84 02
>> > > ------------------------------------------------------------------------
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > 符永涛
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > 符永涛
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>> >
>> >
>> >
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
[-- Attachment #1.2: Type: text/html, Size: 45654 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 13:48 ` 符永涛
@ 2013-04-12 13:51 ` 符永涛
2013-04-12 13:59 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 13:51 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 33148 bytes --]
I have uploaded the xfs_metadump file to Google Drive; is there any
useful clue in it? I don't know how to inspect the metadump file.
https://docs.google.com/file/d/0B7n2C4T5tfNCdFBCTnNxNERmbWc/edit?usp=sharing
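[Editor's note: a metadump is normally restored to an image file and then
inspected offline with xfsprogs. The sketch below only prints the commands
(the file names are placeholders, not the actual uploaded file name):]

```shell
# Dry-run outline for offline inspection of a metadump with xfsprogs.
DUMP=sdb.metadump   # placeholder name for the downloaded dump
IMG=sdb.img         # sparse image the dump is restored into
{
  echo "xfs_mdrestore $DUMP $IMG"           # restore metadata to an image
  echo "xfs_repair -n -f $IMG"              # read-only check of the image
  echo "xfs_db -f -c 'sb 0' -c 'p' $IMG"    # browse the superblock
} | tee inspect-plan.txt
```

Working on the restored image keeps all analysis off the production
device, so xfs_repair can be re-run freely without risk.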
2013/4/12 符永涛 <yongtaofu@gmail.com>
> Also, I'm not sure what kind of information is needed to debug this issue.
> Thank you.
>
>
> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>
>> The XFS shutdown error is always the same. (It has happened about 20 times
>> across about 50 servers during the last half year.)
>> Recently the shutdowns happen on a cluster of 24 servers (distribute 8 x
>> replica 3) during rebalance. The average workload of this cluster is more
>> than 3 TB of growth per day.
>> The workload includes normal fops, rsync jobs, video encoding/decoding,
>> logging, etc., through the glusterfs native client from hundreds of
>> machines.
>> The shutdowns tend to happen when we run a rebalance of the glusterfs
>> cluster, which I guess triggers a lot of unlink operations.
>>
>> Thank you very much. Maybe I can try to collect more logs with a
>> modified kernel package.
>>
>>
>>
>> 2013/4/12 Brian Foster <bfoster@redhat.com>
>>
>>> On 04/11/2013 08:45 PM, 符永涛 wrote:
>>> > the workload is about:
>>> > 24 servers, replica(3) which means the distribute is 8
>>> > load is about 3(TB)-8(TB) per day.
>>> >
>>>
>>> This describes your cluster, but not the workload (though cluster info
>>> is good too). What kind of workload is running on your clients (i.e.,
>>> rsync jobs, etc.)? Are you running through native gluster mount points,
>>> NFS mounts or a mix? Do you have any gluster internal operations running
>>> (i.e., rebalance, etc.).
>>>
>>> Is there any kind of pattern you can discern from the workload and when
>>> the XFS error happens to occur? You have a good number of servers in
>>> play here, is there any kind of pattern in terms of which servers
>>> experience the error? Is it always the same servers or a random set?
>>>
>>> Brian
>>>
>>> >
>>> > 2013/4/12 Brian Foster <bfoster@redhat.com>
>>> >
>>> > On 04/11/2013 03:11 PM, 符永涛 wrote:
>>> > > It happened again tonight on one of our servers; how can we debug
>>> > > the root cause? Thank you.
>>> > >
>>> >
>>> > Hi,
>>> >
>>> > I've attached a systemtap script (stap -v xfs.stp) that should
>>> > hopefully print out a bit more data should the issue happen again. Do
>>> > you have a small enough number of nodes (or predictable enough pattern)
>>> > that you could run this on the nodes that tend to fail and collect the
>>> > output?
>>> >
>>> > Also, could you collect an xfs_metadump of the filesystem in question
>>> > and make it available for download and analysis somewhere? I believe the
>>> > ideal approach is to mount/umount the filesystem first to replay the log
>>> > before collecting a metadump, but somebody could correct me on that (to
>>> > be safe, you could collect multiple dumps: pre-mount and post-mount).
>>> >
>>> > Could you also describe your workload a little bit? Thanks.
>>> >
>>> > Brian
>>> >
>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>>> > > xfs_inotobp() returned error 22.
>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>>> > > returned error 22
>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1)
>>> > > called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
>>> > > 0xffffffffa02ee20a
>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting
>>> > > down filesystem
>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem
>>> > > and rectify the problem(s)
>>> > > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>> > > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>> > > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>> > > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>> > >
>>> > > xfs_repair -n
>>> > >
>>> > >
>>> > > Phase 7 - verify link counts...
>>> > > would have reset inode 20021 nlinks from 0 to 1
>>> > > would have reset inode 20789 nlinks from 0 to 1
>>> > > would have reset inode 35125 nlinks from 0 to 1
>>> > > would have reset inode 35637 nlinks from 0 to 1
>>> > > would have reset inode 36149 nlinks from 0 to 1
>>> > > would have reset inode 38197 nlinks from 0 to 1
>>> > > would have reset inode 39477 nlinks from 0 to 1
>>> > > would have reset inode 54069 nlinks from 0 to 1
>>> > > would have reset inode 62261 nlinks from 0 to 1
>>> > > would have reset inode 63029 nlinks from 0 to 1
>>> > > would have reset inode 72501 nlinks from 0 to 1
>>> > > would have reset inode 79925 nlinks from 0 to 1
>>> > > would have reset inode 81205 nlinks from 0 to 1
>>> > > would have reset inode 84789 nlinks from 0 to 1
>>> > > would have reset inode 87861 nlinks from 0 to 1
>>> > > would have reset inode 90663 nlinks from 0 to 1
>>> > > would have reset inode 91189 nlinks from 0 to 1
>>> > > would have reset inode 95541 nlinks from 0 to 1
>>> > > would have reset inode 98101 nlinks from 0 to 1
>>> > > would have reset inode 101173 nlinks from 0 to 1
>>> > > would have reset inode 113205 nlinks from 0 to 1
>>> > > would have reset inode 114741 nlinks from 0 to 1
>>> > > would have reset inode 126261 nlinks from 0 to 1
>>> > > would have reset inode 140597 nlinks from 0 to 1
>>> > > would have reset inode 144693 nlinks from 0 to 1
>>> > > would have reset inode 147765 nlinks from 0 to 1
>>> > > would have reset inode 152885 nlinks from 0 to 1
>>> > > would have reset inode 161333 nlinks from 0 to 1
>>> > > would have reset inode 161845 nlinks from 0 to 1
>>> > > would have reset inode 167477 nlinks from 0 to 1
>>> > > would have reset inode 172341 nlinks from 0 to 1
>>> > > would have reset inode 191797 nlinks from 0 to 1
>>> > > would have reset inode 204853 nlinks from 0 to 1
>>> > > would have reset inode 205365 nlinks from 0 to 1
>>> > > would have reset inode 215349 nlinks from 0 to 1
>>> > > would have reset inode 215861 nlinks from 0 to 1
>>> > > would have reset inode 216373 nlinks from 0 to 1
>>> > > would have reset inode 217397 nlinks from 0 to 1
>>> > > would have reset inode 224309 nlinks from 0 to 1
>>> > > would have reset inode 225589 nlinks from 0 to 1
>>> > > would have reset inode 234549 nlinks from 0 to 1
>>> > > would have reset inode 234805 nlinks from 0 to 1
>>> > > would have reset inode 249653 nlinks from 0 to 1
>>> > > would have reset inode 250677 nlinks from 0 to 1
>>> > > would have reset inode 252469 nlinks from 0 to 1
>>> > > would have reset inode 261429 nlinks from 0 to 1
>>> > > would have reset inode 265013 nlinks from 0 to 1
>>> > > would have reset inode 266805 nlinks from 0 to 1
>>> > > would have reset inode 267317 nlinks from 0 to 1
>>> > > would have reset inode 268853 nlinks from 0 to 1
>>> > > would have reset inode 272437 nlinks from 0 to 1
>>> > > would have reset inode 273205 nlinks from 0 to 1
>>> > > would have reset inode 274229 nlinks from 0 to 1
>>> > > would have reset inode 278325 nlinks from 0 to 1
>>> > > would have reset inode 278837 nlinks from 0 to 1
>>> > > would have reset inode 281397 nlinks from 0 to 1
>>> > > would have reset inode 292661 nlinks from 0 to 1
>>> > > would have reset inode 300853 nlinks from 0 to 1
>>> > > would have reset inode 302901 nlinks from 0 to 1
>>> > > would have reset inode 305205 nlinks from 0 to 1
>>> > > would have reset inode 314165 nlinks from 0 to 1
>>> > > would have reset inode 315189 nlinks from 0 to 1
>>> > > would have reset inode 320309 nlinks from 0 to 1
>>> > > would have reset inode 324917 nlinks from 0 to 1
>>> > > would have reset inode 328245 nlinks from 0 to 1
>>> > > would have reset inode 335925 nlinks from 0 to 1
>>> > > would have reset inode 339253 nlinks from 0 to 1
>>> > > would have reset inode 339765 nlinks from 0 to 1
>>> > > would have reset inode 348213 nlinks from 0 to 1
>>> > > would have reset inode 360501 nlinks from 0 to 1
>>> > > would have reset inode 362037 nlinks from 0 to 1
>>> > > would have reset inode 366389 nlinks from 0 to 1
>>> > > would have reset inode 385845 nlinks from 0 to 1
>>> > > would have reset inode 390709 nlinks from 0 to 1
>>> > > would have reset inode 409141 nlinks from 0 to 1
>>> > > would have reset inode 413237 nlinks from 0 to 1
>>> > > would have reset inode 414773 nlinks from 0 to 1
>>> > > would have reset inode 417845 nlinks from 0 to 1
>>> > > would have reset inode 436021 nlinks from 0 to 1
>>> > > would have reset inode 439349 nlinks from 0 to 1
>>> > > would have reset inode 447029 nlinks from 0 to 1
>>> > > would have reset inode 491317 nlinks from 0 to 1
>>> > > would have reset inode 494133 nlinks from 0 to 1
>>> > > would have reset inode 495413 nlinks from 0 to 1
>>> > > would have reset inode 501301 nlinks from 0 to 1
>>> > > would have reset inode 506421 nlinks from 0 to 1
>>> > > would have reset inode 508469 nlinks from 0 to 1
>>> > > would have reset inode 508981 nlinks from 0 to 1
>>> > > would have reset inode 511797 nlinks from 0 to 1
>>> > > would have reset inode 513077 nlinks from 0 to 1
>>> > > would have reset inode 517941 nlinks from 0 to 1
>>> > > would have reset inode 521013 nlinks from 0 to 1
>>> > > would have reset inode 522805 nlinks from 0 to 1
>>> > > would have reset inode 523317 nlinks from 0 to 1
>>> > > would have reset inode 525621 nlinks from 0 to 1
>>> > > would have reset inode 527925 nlinks from 0 to 1
>>> > > would have reset inode 535605 nlinks from 0 to 1
>>> > > would have reset inode 541749 nlinks from 0 to 1
>>> > > would have reset inode 573493 nlinks from 0 to 1
>>> > > would have reset inode 578613 nlinks from 0 to 1
>>> > > would have reset inode 583029 nlinks from 0 to 1
>>> > > would have reset inode 585525 nlinks from 0 to 1
>>> > > would have reset inode 586293 nlinks from 0 to 1
>>> > > would have reset inode 586805 nlinks from 0 to 1
>>> > > would have reset inode 591413 nlinks from 0 to 1
>>> > > would have reset inode 594485 nlinks from 0 to 1
>>> > > would have reset inode 596277 nlinks from 0 to 1
>>> > > would have reset inode 603189 nlinks from 0 to 1
>>> > > would have reset inode 613429 nlinks from 0 to 1
>>> > > would have reset inode 617781 nlinks from 0 to 1
>>> > > would have reset inode 621877 nlinks from 0 to 1
>>> > > would have reset inode 623925 nlinks from 0 to 1
>>> > > would have reset inode 625205 nlinks from 0 to 1
>>> > > would have reset inode 626741 nlinks from 0 to 1
>>> > > would have reset inode 639541 nlinks from 0 to 1
>>> > > would have reset inode 640053 nlinks from 0 to 1
>>> > > would have reset inode 640565 nlinks from 0 to 1
>>> > > would have reset inode 645173 nlinks from 0 to 1
>>> > > would have reset inode 652853 nlinks from 0 to 1
>>> > > would have reset inode 656181 nlinks from 0 to 1
>>> > > would have reset inode 659253 nlinks from 0 to 1
>>> > > would have reset inode 663605 nlinks from 0 to 1
>>> > > would have reset inode 667445 nlinks from 0 to 1
>>> > > would have reset inode 680757 nlinks from 0 to 1
>>> > > would have reset inode 691253 nlinks from 0 to 1
>>> > > would have reset inode 691765 nlinks from 0 to 1
>>> > > would have reset inode 697653 nlinks from 0 to 1
>>> > > would have reset inode 700469 nlinks from 0 to 1
>>> > > would have reset inode 707893 nlinks from 0 to 1
>>> > > would have reset inode 716853 nlinks from 0 to 1
>>> > > would have reset inode 722229 nlinks from 0 to 1
>>> > > would have reset inode 722741 nlinks from 0 to 1
>>> > > would have reset inode 723765 nlinks from 0 to 1
>>> > > would have reset inode 731957 nlinks from 0 to 1
>>> > > would have reset inode 742965 nlinks from 0 to 1
>>> > > would have reset inode 743477 nlinks from 0 to 1
>>> > > would have reset inode 745781 nlinks from 0 to 1
>>> > > would have reset inode 746293 nlinks from 0 to 1
>>> > > would have reset inode 774453 nlinks from 0 to 1
>>> > > would have reset inode 778805 nlinks from 0 to 1
>>> > > would have reset inode 785013 nlinks from 0 to 1
>>> > > would have reset inode 785973 nlinks from 0 to 1
>>> > > would have reset inode 791349 nlinks from 0 to 1
>>> > > would have reset inode 796981 nlinks from 0 to 1
>>> > > would have reset inode 803381 nlinks from 0 to 1
>>> > > would have reset inode 806965 nlinks from 0 to 1
>>> > > would have reset inode 811798 nlinks from 0 to 1
>>> > > would have reset inode 812310 nlinks from 0 to 1
>>> > > would have reset inode 813078 nlinks from 0 to 1
>>> > > would have reset inode 813607 nlinks from 0 to 1
>>> > > would have reset inode 814183 nlinks from 0 to 1
>>> > > would have reset inode 822069 nlinks from 0 to 1
>>> > > would have reset inode 828469 nlinks from 0 to 1
>>> > > would have reset inode 830005 nlinks from 0 to 1
>>> > > would have reset inode 832053 nlinks from 0 to 1
>>> > > would have reset inode 832565 nlinks from 0 to 1
>>> > > would have reset inode 836661 nlinks from 0 to 1
>>> > > would have reset inode 841013 nlinks from 0 to 1
>>> > > would have reset inode 841525 nlinks from 0 to 1
>>> > > would have reset inode 845365 nlinks from 0 to 1
>>> > > would have reset inode 846133 nlinks from 0 to 1
>>> > > would have reset inode 847157 nlinks from 0 to 1
>>> > > would have reset inode 852533 nlinks from 0 to 1
>>> > > would have reset inode 857141 nlinks from 0 to 1
>>> > > would have reset inode 863271 nlinks from 0 to 1
>>> > > would have reset inode 866855 nlinks from 0 to 1
>>> > > would have reset inode 887861 nlinks from 0 to 1
>>> > > would have reset inode 891701 nlinks from 0 to 1
>>> > > would have reset inode 894773 nlinks from 0 to 1
>>> > > would have reset inode 900149 nlinks from 0 to 1
>>> > > would have reset inode 902197 nlinks from 0 to 1
>>> > > would have reset inode 906293 nlinks from 0 to 1
>>> > > would have reset inode 906805 nlinks from 0 to 1
>>> > > would have reset inode 909877 nlinks from 0 to 1
>>> > > would have reset inode 925493 nlinks from 0 to 1
>>> > > would have reset inode 949543 nlinks from 0 to 1
>>> > > would have reset inode 955175 nlinks from 0 to 1
>>> > > would have reset inode 963623 nlinks from 0 to 1
>>> > > would have reset inode 967733 nlinks from 0 to 1
>>> > > would have reset inode 968231 nlinks from 0 to 1
>>> > > would have reset inode 982069 nlinks from 0 to 1
>>> > > would have reset inode 1007413 nlinks from 0 to 1
>>> > > would have reset inode 1011509 nlinks from 0 to 1
>>> > > would have reset inode 1014069 nlinks from 0 to 1
>>> > > would have reset inode 1014581 nlinks from 0 to 1
>>> > > would have reset inode 1022005 nlinks from 0 to 1
>>> > > would have reset inode 1022517 nlinks from 0 to 1
>>> > > would have reset inode 1023029 nlinks from 0 to 1
>>> > > would have reset inode 1025333 nlinks from 0 to 1
>>> > > would have reset inode 1043765 nlinks from 0 to 1
>>> > > would have reset inode 1044789 nlinks from 0 to 1
>>> > > would have reset inode 1049397 nlinks from 0 to 1
>>> > > would have reset inode 1050933 nlinks from 0 to 1
>>> > > would have reset inode 1051445 nlinks from 0 to 1
>>> > > would have reset inode 1054261 nlinks from 0 to 1
>>> > > would have reset inode 1060917 nlinks from 0 to 1
>>> > > would have reset inode 1063477 nlinks from 0 to 1
>>> > > would have reset inode 1076021 nlinks from 0 to 1
>>> > > would have reset inode 1081141 nlinks from 0 to 1
>>> > > would have reset inode 1086261 nlinks from 0 to 1
>>> > > would have reset inode 1097269 nlinks from 0 to 1
>>> > > would have reset inode 1099829 nlinks from 0 to 1
>>> > > would have reset inode 1100853 nlinks from 0 to 1
>>> > > would have reset inode 1101877 nlinks from 0 to 1
>>> > > would have reset inode 1126709 nlinks from 0 to 1
>>> > > would have reset inode 1134389 nlinks from 0 to 1
>>> > > would have reset inode 1141045 nlinks from 0 to 1
>>> > > would have reset inode 1141557 nlinks from 0 to 1
>>> > > would have reset inode 1142581 nlinks from 0 to 1
>>> > > would have reset inode 1148469 nlinks from 0 to 1
>>> > > would have reset inode 1153333 nlinks from 0 to 1
>>> > > would have reset inode 1181749 nlinks from 0 to 1
>>> > > would have reset inode 1192245 nlinks from 0 to 1
>>> > > would have reset inode 1198133 nlinks from 0 to 1
>>> > > would have reset inode 1203765 nlinks from 0 to 1
>>> > > would have reset inode 1221429 nlinks from 0 to 1
>>> > > would have reset inode 1223989 nlinks from 0 to 1
>>> > > would have reset inode 1235509 nlinks from 0 to 1
>>> > > would have reset inode 1239349 nlinks from 0 to 1
>>> > > would have reset inode 1240885 nlinks from 0 to 1
>>> > > would have reset inode 1241397 nlinks from 0 to 1
>>> > > would have reset inode 1241909 nlinks from 0 to 1
>>> > > would have reset inode 1242421 nlinks from 0 to 1
>>> > > would have reset inode 1244981 nlinks from 0 to 1
>>> > > would have reset inode 1246517 nlinks from 0 to 1
>>> > > would have reset inode 1253429 nlinks from 0 to 1
>>> > > would have reset inode 1271861 nlinks from 0 to 1
>>> > > would have reset inode 1274677 nlinks from 0 to 1
>>> > > would have reset inode 1277749 nlinks from 0 to 1
>>> > > would have reset inode 1278773 nlinks from 0 to 1
>>> > > would have reset inode 1286709 nlinks from 0 to 1
>>> > > would have reset inode 1288245 nlinks from 0 to 1
>>> > > would have reset inode 1299765 nlinks from 0 to 1
>>> > > would have reset inode 1302325 nlinks from 0 to 1
>>> > > would have reset inode 1304885 nlinks from 0 to 1
>>> > > would have reset inode 1305397 nlinks from 0 to 1
>>> > > would have reset inode 1307509 nlinks from 0 to 1
>>> > > would have reset inode 1309493 nlinks from 0 to 1
>>> > > would have reset inode 1310517 nlinks from 0 to 1
>>> > > would have reset inode 1311029 nlinks from 0 to 1
>>> > > would have reset inode 1312053 nlinks from 0 to 1
>>> > > would have reset inode 1316917 nlinks from 0 to 1
>>> > > would have reset inode 1317941 nlinks from 0 to 1
>>> > > would have reset inode 1320821 nlinks from 0 to 1
>>> > > would have reset inode 1322805 nlinks from 0 to 1
>>> > > would have reset inode 1332789 nlinks from 0 to 1
>>> > > would have reset inode 1336373 nlinks from 0 to 1
>>> > > would have reset inode 1345653 nlinks from 0 to 1
>>> > > would have reset inode 1354549 nlinks from 0 to 1
>>> > > would have reset inode 1361973 nlinks from 0 to 1
>>> > > would have reset inode 1369909 nlinks from 0 to 1
>>> > > would have reset inode 1372981 nlinks from 0 to 1
>>> > > would have reset inode 1388853 nlinks from 0 to 1
>>> > > would have reset inode 1402933 nlinks from 0 to 1
>>> > > would have reset inode 1403445 nlinks from 0 to 1
>>> > > would have reset inode 1420085 nlinks from 0 to 1
>>> > > would have reset inode 1452853 nlinks from 0 to 1
>>> > > would have reset inode 1456437 nlinks from 0 to 1
>>> > > would have reset inode 1457973 nlinks from 0 to 1
>>> > > would have reset inode 1459253 nlinks from 0 to 1
>>> > > would have reset inode 1467957 nlinks from 0 to 1
>>> > > would have reset inode 1471541 nlinks from 0 to 1
>>> > > would have reset inode 1476661 nlinks from 0 to 1
>>> > > would have reset inode 1479733 nlinks from 0 to 1
>>> > > would have reset inode 1483061 nlinks from 0 to 1
>>> > > would have reset inode 1484085 nlinks from 0 to 1
>>> > > would have reset inode 1486133 nlinks from 0 to 1
>>> > > would have reset inode 1489461 nlinks from 0 to 1
>>> > > would have reset inode 1490037 nlinks from 0 to 1
>>> > > would have reset inode 1492021 nlinks from 0 to 1
>>> > > would have reset inode 1493557 nlinks from 0 to 1
>>> > > would have reset inode 1494069 nlinks from 0 to 1
>>> > > would have reset inode 1496885 nlinks from 0 to 1
>>> > > would have reset inode 1498421 nlinks from 0 to 1
>>> > > would have reset inode 1498933 nlinks from 0 to 1
>>> > > would have reset inode 1499957 nlinks from 0 to 1
>>> > > would have reset inode 1506101 nlinks from 0 to 1
>>> > > would have reset inode 1507637 nlinks from 0 to 1
>>> > > would have reset inode 1510453 nlinks from 0 to 1
>>> > > would have reset inode 1514293 nlinks from 0 to 1
>>> > > would have reset inode 1517365 nlinks from 0 to 1
>>> > > would have reset inode 1520693 nlinks from 0 to 1
>>> > > would have reset inode 1521973 nlinks from 0 to 1
>>> > > would have reset inode 1530421 nlinks from 0 to 1
>>> > > would have reset inode 1530933 nlinks from 0 to 1
>>> > > would have reset inode 1537333 nlinks from 0 to 1
>>> > > would have reset inode 1538357 nlinks from 0 to 1
>>> > > would have reset inode 1548853 nlinks from 0 to 1
>>> > > would have reset inode 1553973 nlinks from 0 to 1
>>> > > would have reset inode 1557301 nlinks from 0 to 1
>>> > > would have reset inode 1564213 nlinks from 0 to 1
>>> > > would have reset inode 1564725 nlinks from 0 to 1
>>> > > would have reset inode 1576501 nlinks from 0 to 1
>>> > > would have reset inode 1580597 nlinks from 0 to 1
>>> > > would have reset inode 1584693 nlinks from 0 to 1
>>> > > would have reset inode 1586485 nlinks from 0 to 1
>>> > > would have reset inode 1589301 nlinks from 0 to 1
>>> > > would have reset inode 1589813 nlinks from 0 to 1
>>> > > would have reset inode 1592629 nlinks from 0 to 1
>>> > > would have reset inode 1595701 nlinks from 0 to 1
>>> > > would have reset inode 1601077 nlinks from 0 to 1
>>> > > would have reset inode 1623861 nlinks from 0 to 1
>>> > > would have reset inode 1626677 nlinks from 0 to 1
>>> > > would have reset inode 1627701 nlinks from 0 to 1
>>> > > would have reset inode 1633333 nlinks from 0 to 1
>>> > > would have reset inode 1639221 nlinks from 0 to 1
>>> > > would have reset inode 1649205 nlinks from 0 to 1
>>> > > would have reset inode 1686325 nlinks from 0 to 1
>>> > > would have reset inode 1690677 nlinks from 0 to 1
>>> > > would have reset inode 1693749 nlinks from 0 to 1
>>> > > would have reset inode 1704757 nlinks from 0 to 1
>>> > > would have reset inode 1707061 nlinks from 0 to 1
>>> > > would have reset inode 1709109 nlinks from 0 to 1
>>> > > would have reset inode 1719349 nlinks from 0 to 1
>>> > > would have reset inode 1737013 nlinks from 0 to 1
>>> > > would have reset inode 1741365 nlinks from 0 to 1
>>> > > would have reset inode 1747509 nlinks from 0 to 1
>>> > > would have reset inode 1770805 nlinks from 0 to 1
>>> > > would have reset inode 1780789 nlinks from 0 to 1
>>> > > would have reset inode 1793589 nlinks from 0 to 1
>>> > > would have reset inode 1795125 nlinks from 0 to 1
>>> > > would have reset inode 1800757 nlinks from 0 to 1
>>> > > would have reset inode 1801269 nlinks from 0 to 1
>>> > > would have reset inode 1802549 nlinks from 0 to 1
>>> > > would have reset inode 1804085 nlinks from 0 to 1
>>> > > would have reset inode 1817141 nlinks from 0 to 1
>>> > > would have reset inode 1821749 nlinks from 0 to 1
>>> > > would have reset inode 1832757 nlinks from 0 to 1
>>> > > would have reset inode 1836341 nlinks from 0 to 1
>>> > > would have reset inode 1856309 nlinks from 0 to 1
>>> > > would have reset inode 1900597 nlinks from 0 to 1
>>> > > would have reset inode 1902901 nlinks from 0 to 1
>>> > > would have reset inode 1912373 nlinks from 0 to 1
>>> > > would have reset inode 1943093 nlinks from 0 to 1
>>> > > would have reset inode 1944373 nlinks from 0 to 1
>>> > > would have reset inode 1954101 nlinks from 0 to 1
>>> > > would have reset inode 1955893 nlinks from 0 to 1
>>> > > would have reset inode 1961781 nlinks from 0 to 1
>>> > > would have reset inode 1974325 nlinks from 0 to 1
>>> > > would have reset inode 1978677 nlinks from 0 to 1
>>> > > would have reset inode 1981237 nlinks from 0 to 1
>>> > > would have reset inode 1992245 nlinks from 0 to 1
>>> > > would have reset inode 2000949 nlinks from 0 to 1
>>> > > would have reset inode 2002229 nlinks from 0 to 1
>>> > > would have reset inode 2004789 nlinks from 0 to 1
>>> > > would have reset inode 2005301 nlinks from 0 to 1
>>> > > would have reset inode 2011189 nlinks from 0 to 1
>>> > > would have reset inode 2012981 nlinks from 0 to 1
>>> > > would have reset inode 2015285 nlinks from 0 to 1
>>> > > would have reset inode 2018869 nlinks from 0 to 1
>>> > > would have reset inode 2028341 nlinks from 0 to 1
>>> > > would have reset inode 2028853 nlinks from 0 to 1
>>> > > would have reset inode 2030901 nlinks from 0 to 1
>>> > > would have reset inode 2032181 nlinks from 0 to 1
>>> > > would have reset inode 2032693 nlinks from 0 to 1
>>> > > would have reset inode 2040117 nlinks from 0 to 1
>>> > > would have reset inode 2053685 nlinks from 0 to 1
>>> > > would have reset inode 2083893 nlinks from 0 to 1
>>> > > would have reset inode 2087221 nlinks from 0 to 1
>>> > > would have reset inode 2095925 nlinks from 0 to 1
>>> > > would have reset inode 2098741 nlinks from 0 to 1
>>> > > would have reset inode 2100533 nlinks from 0 to 1
>>> > > would have reset inode 2101301 nlinks from 0 to 1
>>> > > would have reset inode 2123573 nlinks from 0 to 1
>>> > > would have reset inode 2132789 nlinks from 0 to 1
>>> > > would have reset inode 2133813 nlinks from 0 to 1
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > 2013/4/10 符永涛 <yongtaofu@gmail.com>
>>> > >
>>> > > The storage info is as follows:
>>> > > RAID-6
>>> > > SATA HDD
>>> > > Controller: PERC H710P Mini (Embedded)
>>> > > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
>>> > > 255 heads, 63 sectors/track, 3647334 cylinders
>>> > > Units = cylinders of 16065 * 512 = 8225280 bytes
>>> > > Sector size (logical/physical): 512 bytes / 512 bytes
>>> > > I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> > > Disk identifier: 0x00000000
>>> > >
>>> > > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
>>> > > sd 0:2:1:0: [sdb] Write Protect is off
>>> > > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
>>> > > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
>>> > > support DPO or FUA
>>> > > sd 0:2:1:0: [sdb] Attached SCSI disk
>>> > >
>>> > > *-storage
>>> > > description: RAID bus controller
>>> > > product: MegaRAID SAS 2208 [Thunderbolt]
>>> > > vendor: LSI Logic / Symbios Logic
>>> > > physical id: 0
>>> > > bus info: pci@0000:02:00.0
>>> > > logical name: scsi0
>>> > > version: 01
>>> > > width: 64 bits
>>> > > clock: 33MHz
>>> > > capabilities: storage pm pciexpress vpd msi msix bus_master
>>> > > cap_list rom
>>> > > configuration: driver=megaraid_sas latency=0
>>> > > resources: irq:42 ioport:fc00(size=256)
>>> > > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
>>> > > memory:dc800000-dc81ffff(prefetchable)
>>> > > *-disk:0
>>> > > description: SCSI Disk
>>> > > product: PERC H710P
>>> > > vendor: DELL
>>> > > physical id: 2.0.0
>>> > > bus info: scsi@0:2.0.0
>>> > > logical name: /dev/sda
>>> > > version: 3.13
>>> > > serial: 0049d6ce1d9f2035180096fde490f648
>>> > > size: 558GiB (599GB)
>>> > > capabilities: partitioned partitioned:dos
>>> > > configuration: ansiversion=5 signature=000aa336
>>> > > *-disk:1
>>> > > description: SCSI Disk
>>> > > product: PERC H710P
>>> > > vendor: DELL
>>> > > physical id: 2.1.0
>>> > > bus info: scsi@0:2.1.0
>>> > > logical name: /dev/sdb
>>> > > logical name: /mnt/xfsd
>>> > > version: 3.13
>>> > > serial: 003366f71da22035180096fde490f648
>>> > > size: 27TiB (30TB)
>>> > > configuration: ansiversion=5 mount.fstype=xfs
>>> > > mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
>>> > > state=mounted
>>> > >
>>> > > Thank you.
>>> > >
>>> > >
>>> > > 2013/4/10 Emmanuel Florac <eflorac@intellique.com>
>>> > >
>>> > > On Tue, 9 Apr 2013 23:10:03 +0800
>>> > > 符永涛 <yongtaofu@gmail.com> wrote:
>>> > >
>>> > > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
>>> > > > > Shutting down filesystem
>>> > >
>>> > > This. I/O error detected. That means that at some point the
>>> > > underlying device (disk, RAID array, SAN volume) couldn't be
>>> > > reached. So this could very well be a case of a flaky drive, array,
>>> > > cable or SCSI driver.
>>> > >
>>> > > What's the storage setup here?
>>> > >
>>> > > --
>>> > > ------------------------------------------------------------------------
>>> > > Emmanuel Florac | Direction technique
>>> > > | Intellique
>>> > > | <eflorac@intellique.com>
>>> > > | +33 1 78 94 84 02
>>> > > ------------------------------------------------------------------------
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > --
>>> > > 符永涛
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > --
>>> > > 符永涛
>>> > >
>>> > >
>>> > > _______________________________________________
>>> > > xfs mailing list
>>> > > xfs@oss.sgi.com
>>> > > http://oss.sgi.com/mailman/listinfo/xfs
>>> > >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > 符永涛
>>> >
>>> >
>>> > _______________________________________________
>>> > xfs mailing list
>>> > xfs@oss.sgi.com
>>> > http://oss.sgi.com/mailman/listinfo/xfs
>>> >
>>>
>>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>
--
符永涛
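[Editor's note: the long xfs_repair "would have reset inode ... nlinks from 0 to 1" listing earlier in this thread can be summarized instead of read line by line. A minimal sketch of such parsing, hypothetical and assuming only the "would have reset inode" line format shown above, with a three-line sample inlined for illustration:]

```python
import re

# Small inlined sample of the xfs_repair -n (no-modify) phase 7 output above.
repair_output = """\
would have reset inode 20021 nlinks from 0 to 1
would have reset inode 20789 nlinks from 0 to 1
would have reset inode 35125 nlinks from 0 to 1
"""

# Extract the inode numbers from each "would have reset" line.
pattern = re.compile(r"would have reset inode (\d+) nlinks from (\d+) to (\d+)")
inodes = [int(m.group(1)) for m in pattern.finditer(repair_output)]

# Report how many inodes had nlink 0 and the inode-number range affected.
print(f"{len(inodes)} inodes, range {min(inodes)}-{max(inodes)}")
```

In practice you would feed the full repair log in rather than the inlined sample; a large count of nlink-0 inodes is consistent with entries stuck on the on-disk unlinked list.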
[-- Attachment #1.2: Type: text/html, Size: 46432 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 13:51 ` 符永涛
@ 2013-04-12 13:59 ` 符永涛
0 siblings, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-12 13:59 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 34138 bytes --]
The error happens randomly among the servers, and we still can't find any
clear pattern. Since there are few logs, we can't tell whether it's a
hardware issue or a software issue. When the xfs shutdown happens, no
hardware alarms are raised either.
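[Editor's note: the "error 22" and "error 5" codes in the kernel log above are positive errno values; a quick lookup in Python's errno table shows what they stand for:]

```python
import errno

# XFS logs the positive errno value: 22 is EINVAL (invalid argument),
# the code xfs_inotobp() reported in the log above.
print(errno.errorcode[22])  # EINVAL

# After the forced shutdown, xfs_log_force keeps returning 5, i.e. EIO
# (input/output error), since the filesystem has been shut down.
print(errno.errorcode[5])   # EIO
```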
2013/4/12 符永涛 <yongtaofu@gmail.com>
> I have uploaded the xfs_metadump file to Google Drive; is there any
> useful clue in it? I don't know how to inspect the metadump file myself.
>
> https://docs.google.com/file/d/0B7n2C4T5tfNCdFBCTnNxNERmbWc/edit?usp=sharing
>
>
> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>
>> I'm also not sure what kind of information is needed to debug this issue.
>> Thank you.
>>
>>
>> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>>
>>> The xfs shutdown error is always the same. (It has happened about 20
>>> times across roughly 50 servers during the last half year.)
>>> Recently the shutdowns have occurred on a cluster of 24 servers
>>> (distribute 8 * replica 3) during rebalance. The average workload of this
>>> cluster is more than 3TB of growth per day.
>>> The workload includes normal fops, rsync jobs, video encoding/decoding,
>>> logging, etc., through the glusterfs native client on hundreds of machines.
>>> The shutdowns tend to happen when we run a rebalance of the glusterfs
>>> cluster, which I guess triggers a lot of unlink operations?
>>>
>>> Thank you very much. Maybe I can try to collect more logs with a
>>> modified kernel package.
>>>
>>>
>>>
>>> 2013/4/12 Brian Foster <bfoster@redhat.com>
>>>
>>>> On 04/11/2013 08:45 PM, 符永涛 wrote:
>>>> > the workload is about:
>>>> > 24 servers, replica(3) which means the distribute is 8
>>>> > load is about 3(TB)-8(TB) per day.
>>>> >
>>>>
>>>> This describes your cluster, but not the workload (though cluster info
>>>> is good too). What kind of workload is running on your clients (i.e.,
>>>> rsync jobs, etc.)? Are you running through native gluster mount points,
>>>> NFS mounts or a mix? Do you have any gluster internal operations running
>>>> (i.e., rebalance, etc.)?
>>>>
>>>> Is there any kind of pattern you can discern from the workload and when
>>>> the XFS error happens to occur? You have a good number of servers in
>>>> play here, is there any kind of pattern in terms of which servers
>>>> experience the error? Is it always the same servers or a random set?
>>>>
>>>> Brian
>>>>
>>>> >
>>>> > 2013/4/12 Brian Foster <bfoster@redhat.com>
>>>> >
>>>> > On 04/11/2013 03:11 PM, 符永涛 wrote:
>>>> > > It happened again tonight on one of our servers; how can we debug
>>>> > > the root cause? Thank you.
>>>> > >
>>>> >
>>>> > Hi,
>>>> >
>>>> > I've attached a systemtap script (stap -v xfs.stp) that should
>>>> > hopefully print out a bit more data should the issue happen again. Do
>>>> > you have a small enough number of nodes (or predictable enough pattern)
>>>> > that you could run this on the nodes that tend to fail and collect the
>>>> > output?
>>>> >
>>>> > Also, could you collect an xfs_metadump of the filesystem in question
>>>> > and make it available for download and analysis somewhere? I believe the
>>>> > ideal approach is to mount/umount the filesystem first to replay the log
>>>> > before collecting a metadump, but somebody could correct me on that (to
>>>> > be safe, you could collect multiple dumps: pre-mount and post-mount).
>>>> >
>>>> > Could you also describe your workload a little bit? Thanks.
>>>> >
>>>> > Brian
>>>> >
>>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
>>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
>>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
>>>> > > Apr 12 02:32:10 cqdx kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
>>>> > > Apr 12 02:32:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>>> > > Apr 12 02:32:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>>> > > Apr 12 02:33:19 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
>>>> > > Apr 12 02:33:49 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
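[Editorial note: since the same shutdown signature recurs across servers, a small watch script can flag affected nodes early. This is a sketch; the default log path is an assumption, and the pattern is copied from the kernel messages above.]

```shell
# Check a syslog file for the XFS shutdown signature seen in this thread.
# /var/log/messages is an assumed default; pass another path as $1.
logfile="${1:-/var/log/messages}"
pattern='xfs_inotobp() returned error 22'
if grep -qF "$pattern" "$logfile" 2>/dev/null; then
    echo "XFS shutdown signature found in $logfile"
else
    echo "no XFS shutdown signature in $logfile"
fi
```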
>>>> > >
>>>> > > xfs_repair -n
>>>> > >
>>>> > >
>>>> > > Phase 7 - verify link counts...
>>>> > > would have reset inode 20021 nlinks from 0 to 1
>>>> > > would have reset inode 20789 nlinks from 0 to 1
>>>> > > would have reset inode 35125 nlinks from 0 to 1
>>>> > > would have reset inode 35637 nlinks from 0 to 1
>>>> > > would have reset inode 36149 nlinks from 0 to 1
>>>> > > would have reset inode 38197 nlinks from 0 to 1
>>>> > > would have reset inode 39477 nlinks from 0 to 1
>>>> > > would have reset inode 54069 nlinks from 0 to 1
>>>> > > would have reset inode 62261 nlinks from 0 to 1
>>>> > > would have reset inode 63029 nlinks from 0 to 1
>>>> > > would have reset inode 72501 nlinks from 0 to 1
>>>> > > would have reset inode 79925 nlinks from 0 to 1
>>>> > > would have reset inode 81205 nlinks from 0 to 1
>>>> > > would have reset inode 84789 nlinks from 0 to 1
>>>> > > would have reset inode 87861 nlinks from 0 to 1
>>>> > > would have reset inode 90663 nlinks from 0 to 1
>>>> > > would have reset inode 91189 nlinks from 0 to 1
>>>> > > would have reset inode 95541 nlinks from 0 to 1
>>>> > > would have reset inode 98101 nlinks from 0 to 1
>>>> > > would have reset inode 101173 nlinks from 0 to 1
>>>> > > would have reset inode 113205 nlinks from 0 to 1
>>>> > > would have reset inode 114741 nlinks from 0 to 1
>>>> > > would have reset inode 126261 nlinks from 0 to 1
>>>> > > would have reset inode 140597 nlinks from 0 to 1
>>>> > > would have reset inode 144693 nlinks from 0 to 1
>>>> > > would have reset inode 147765 nlinks from 0 to 1
>>>> > > would have reset inode 152885 nlinks from 0 to 1
>>>> > > would have reset inode 161333 nlinks from 0 to 1
>>>> > > would have reset inode 161845 nlinks from 0 to 1
>>>> > > would have reset inode 167477 nlinks from 0 to 1
>>>> > > would have reset inode 172341 nlinks from 0 to 1
>>>> > > would have reset inode 191797 nlinks from 0 to 1
>>>> > > would have reset inode 204853 nlinks from 0 to 1
>>>> > > would have reset inode 205365 nlinks from 0 to 1
>>>> > > would have reset inode 215349 nlinks from 0 to 1
>>>> > > would have reset inode 215861 nlinks from 0 to 1
>>>> > > would have reset inode 216373 nlinks from 0 to 1
>>>> > > would have reset inode 217397 nlinks from 0 to 1
>>>> > > would have reset inode 224309 nlinks from 0 to 1
>>>> > > would have reset inode 225589 nlinks from 0 to 1
>>>> > > would have reset inode 234549 nlinks from 0 to 1
>>>> > > would have reset inode 234805 nlinks from 0 to 1
>>>> > > would have reset inode 249653 nlinks from 0 to 1
>>>> > > would have reset inode 250677 nlinks from 0 to 1
>>>> > > would have reset inode 252469 nlinks from 0 to 1
>>>> > > would have reset inode 261429 nlinks from 0 to 1
>>>> > > would have reset inode 265013 nlinks from 0 to 1
>>>> > > would have reset inode 266805 nlinks from 0 to 1
>>>> > > would have reset inode 267317 nlinks from 0 to 1
>>>> > > would have reset inode 268853 nlinks from 0 to 1
>>>> > > would have reset inode 272437 nlinks from 0 to 1
>>>> > > would have reset inode 273205 nlinks from 0 to 1
>>>> > > would have reset inode 274229 nlinks from 0 to 1
>>>> > > would have reset inode 278325 nlinks from 0 to 1
>>>> > > would have reset inode 278837 nlinks from 0 to 1
>>>> > > would have reset inode 281397 nlinks from 0 to 1
>>>> > > would have reset inode 292661 nlinks from 0 to 1
>>>> > > would have reset inode 300853 nlinks from 0 to 1
>>>> > > would have reset inode 302901 nlinks from 0 to 1
>>>> > > would have reset inode 305205 nlinks from 0 to 1
>>>> > > would have reset inode 314165 nlinks from 0 to 1
>>>> > > would have reset inode 315189 nlinks from 0 to 1
>>>> > > would have reset inode 320309 nlinks from 0 to 1
>>>> > > would have reset inode 324917 nlinks from 0 to 1
>>>> > > would have reset inode 328245 nlinks from 0 to 1
>>>> > > would have reset inode 335925 nlinks from 0 to 1
>>>> > > would have reset inode 339253 nlinks from 0 to 1
>>>> > > would have reset inode 339765 nlinks from 0 to 1
>>>> > > would have reset inode 348213 nlinks from 0 to 1
>>>> > > would have reset inode 360501 nlinks from 0 to 1
>>>> > > would have reset inode 362037 nlinks from 0 to 1
>>>> > > would have reset inode 366389 nlinks from 0 to 1
>>>> > > would have reset inode 385845 nlinks from 0 to 1
>>>> > > would have reset inode 390709 nlinks from 0 to 1
>>>> > > would have reset inode 409141 nlinks from 0 to 1
>>>> > > would have reset inode 413237 nlinks from 0 to 1
>>>> > > would have reset inode 414773 nlinks from 0 to 1
>>>> > > would have reset inode 417845 nlinks from 0 to 1
>>>> > > would have reset inode 436021 nlinks from 0 to 1
>>>> > > would have reset inode 439349 nlinks from 0 to 1
>>>> > > would have reset inode 447029 nlinks from 0 to 1
>>>> > > would have reset inode 491317 nlinks from 0 to 1
>>>> > > would have reset inode 494133 nlinks from 0 to 1
>>>> > > would have reset inode 495413 nlinks from 0 to 1
>>>> > > would have reset inode 501301 nlinks from 0 to 1
>>>> > > would have reset inode 506421 nlinks from 0 to 1
>>>> > > would have reset inode 508469 nlinks from 0 to 1
>>>> > > would have reset inode 508981 nlinks from 0 to 1
>>>> > > would have reset inode 511797 nlinks from 0 to 1
>>>> > > would have reset inode 513077 nlinks from 0 to 1
>>>> > > would have reset inode 517941 nlinks from 0 to 1
>>>> > > would have reset inode 521013 nlinks from 0 to 1
>>>> > > would have reset inode 522805 nlinks from 0 to 1
>>>> > > would have reset inode 523317 nlinks from 0 to 1
>>>> > > would have reset inode 525621 nlinks from 0 to 1
>>>> > > would have reset inode 527925 nlinks from 0 to 1
>>>> > > would have reset inode 535605 nlinks from 0 to 1
>>>> > > would have reset inode 541749 nlinks from 0 to 1
>>>> > > would have reset inode 573493 nlinks from 0 to 1
>>>> > > would have reset inode 578613 nlinks from 0 to 1
>>>> > > would have reset inode 583029 nlinks from 0 to 1
>>>> > > would have reset inode 585525 nlinks from 0 to 1
>>>> > > would have reset inode 586293 nlinks from 0 to 1
>>>> > > would have reset inode 586805 nlinks from 0 to 1
>>>> > > would have reset inode 591413 nlinks from 0 to 1
>>>> > > would have reset inode 594485 nlinks from 0 to 1
>>>> > > would have reset inode 596277 nlinks from 0 to 1
>>>> > > would have reset inode 603189 nlinks from 0 to 1
>>>> > > would have reset inode 613429 nlinks from 0 to 1
>>>> > > would have reset inode 617781 nlinks from 0 to 1
>>>> > > would have reset inode 621877 nlinks from 0 to 1
>>>> > > would have reset inode 623925 nlinks from 0 to 1
>>>> > > would have reset inode 625205 nlinks from 0 to 1
>>>> > > would have reset inode 626741 nlinks from 0 to 1
>>>> > > would have reset inode 639541 nlinks from 0 to 1
>>>> > > would have reset inode 640053 nlinks from 0 to 1
>>>> > > would have reset inode 640565 nlinks from 0 to 1
>>>> > > would have reset inode 645173 nlinks from 0 to 1
>>>> > > would have reset inode 652853 nlinks from 0 to 1
>>>> > > would have reset inode 656181 nlinks from 0 to 1
>>>> > > would have reset inode 659253 nlinks from 0 to 1
>>>> > > would have reset inode 663605 nlinks from 0 to 1
>>>> > > would have reset inode 667445 nlinks from 0 to 1
>>>> > > would have reset inode 680757 nlinks from 0 to 1
>>>> > > would have reset inode 691253 nlinks from 0 to 1
>>>> > > would have reset inode 691765 nlinks from 0 to 1
>>>> > > would have reset inode 697653 nlinks from 0 to 1
>>>> > > would have reset inode 700469 nlinks from 0 to 1
>>>> > > would have reset inode 707893 nlinks from 0 to 1
>>>> > > would have reset inode 716853 nlinks from 0 to 1
>>>> > > would have reset inode 722229 nlinks from 0 to 1
>>>> > > would have reset inode 722741 nlinks from 0 to 1
>>>> > > would have reset inode 723765 nlinks from 0 to 1
>>>> > > would have reset inode 731957 nlinks from 0 to 1
>>>> > > would have reset inode 742965 nlinks from 0 to 1
>>>> > > would have reset inode 743477 nlinks from 0 to 1
>>>> > > would have reset inode 745781 nlinks from 0 to 1
>>>> > > would have reset inode 746293 nlinks from 0 to 1
>>>> > > would have reset inode 774453 nlinks from 0 to 1
>>>> > > would have reset inode 778805 nlinks from 0 to 1
>>>> > > would have reset inode 785013 nlinks from 0 to 1
>>>> > > would have reset inode 785973 nlinks from 0 to 1
>>>> > > would have reset inode 791349 nlinks from 0 to 1
>>>> > > would have reset inode 796981 nlinks from 0 to 1
>>>> > > would have reset inode 803381 nlinks from 0 to 1
>>>> > > would have reset inode 806965 nlinks from 0 to 1
>>>> > > would have reset inode 811798 nlinks from 0 to 1
>>>> > > would have reset inode 812310 nlinks from 0 to 1
>>>> > > would have reset inode 813078 nlinks from 0 to 1
>>>> > > would have reset inode 813607 nlinks from 0 to 1
>>>> > > would have reset inode 814183 nlinks from 0 to 1
>>>> > > would have reset inode 822069 nlinks from 0 to 1
>>>> > > would have reset inode 828469 nlinks from 0 to 1
>>>> > > would have reset inode 830005 nlinks from 0 to 1
>>>> > > would have reset inode 832053 nlinks from 0 to 1
>>>> > > would have reset inode 832565 nlinks from 0 to 1
>>>> > > would have reset inode 836661 nlinks from 0 to 1
>>>> > > would have reset inode 841013 nlinks from 0 to 1
>>>> > > would have reset inode 841525 nlinks from 0 to 1
>>>> > > would have reset inode 845365 nlinks from 0 to 1
>>>> > > would have reset inode 846133 nlinks from 0 to 1
>>>> > > would have reset inode 847157 nlinks from 0 to 1
>>>> > > would have reset inode 852533 nlinks from 0 to 1
>>>> > > would have reset inode 857141 nlinks from 0 to 1
>>>> > > would have reset inode 863271 nlinks from 0 to 1
>>>> > > would have reset inode 866855 nlinks from 0 to 1
>>>> > > would have reset inode 887861 nlinks from 0 to 1
>>>> > > would have reset inode 891701 nlinks from 0 to 1
>>>> > > would have reset inode 894773 nlinks from 0 to 1
>>>> > > would have reset inode 900149 nlinks from 0 to 1
>>>> > > would have reset inode 902197 nlinks from 0 to 1
>>>> > > would have reset inode 906293 nlinks from 0 to 1
>>>> > > would have reset inode 906805 nlinks from 0 to 1
>>>> > > would have reset inode 909877 nlinks from 0 to 1
>>>> > > would have reset inode 925493 nlinks from 0 to 1
>>>> > > would have reset inode 949543 nlinks from 0 to 1
>>>> > > would have reset inode 955175 nlinks from 0 to 1
>>>> > > would have reset inode 963623 nlinks from 0 to 1
>>>> > > would have reset inode 967733 nlinks from 0 to 1
>>>> > > would have reset inode 968231 nlinks from 0 to 1
>>>> > > would have reset inode 982069 nlinks from 0 to 1
>>>> > > would have reset inode 1007413 nlinks from 0 to 1
>>>> > > would have reset inode 1011509 nlinks from 0 to 1
>>>> > > would have reset inode 1014069 nlinks from 0 to 1
>>>> > > would have reset inode 1014581 nlinks from 0 to 1
>>>> > > would have reset inode 1022005 nlinks from 0 to 1
>>>> > > would have reset inode 1022517 nlinks from 0 to 1
>>>> > > would have reset inode 1023029 nlinks from 0 to 1
>>>> > > would have reset inode 1025333 nlinks from 0 to 1
>>>> > > would have reset inode 1043765 nlinks from 0 to 1
>>>> > > would have reset inode 1044789 nlinks from 0 to 1
>>>> > > would have reset inode 1049397 nlinks from 0 to 1
>>>> > > would have reset inode 1050933 nlinks from 0 to 1
>>>> > > would have reset inode 1051445 nlinks from 0 to 1
>>>> > > would have reset inode 1054261 nlinks from 0 to 1
>>>> > > would have reset inode 1060917 nlinks from 0 to 1
>>>> > > would have reset inode 1063477 nlinks from 0 to 1
>>>> > > would have reset inode 1076021 nlinks from 0 to 1
>>>> > > would have reset inode 1081141 nlinks from 0 to 1
>>>> > > would have reset inode 1086261 nlinks from 0 to 1
>>>> > > would have reset inode 1097269 nlinks from 0 to 1
>>>> > > would have reset inode 1099829 nlinks from 0 to 1
>>>> > > would have reset inode 1100853 nlinks from 0 to 1
>>>> > > would have reset inode 1101877 nlinks from 0 to 1
>>>> > > would have reset inode 1126709 nlinks from 0 to 1
>>>> > > would have reset inode 1134389 nlinks from 0 to 1
>>>> > > would have reset inode 1141045 nlinks from 0 to 1
>>>> > > would have reset inode 1141557 nlinks from 0 to 1
>>>> > > would have reset inode 1142581 nlinks from 0 to 1
>>>> > > would have reset inode 1148469 nlinks from 0 to 1
>>>> > > would have reset inode 1153333 nlinks from 0 to 1
>>>> > > would have reset inode 1181749 nlinks from 0 to 1
>>>> > > would have reset inode 1192245 nlinks from 0 to 1
>>>> > > would have reset inode 1198133 nlinks from 0 to 1
>>>> > > would have reset inode 1203765 nlinks from 0 to 1
>>>> > > would have reset inode 1221429 nlinks from 0 to 1
>>>> > > would have reset inode 1223989 nlinks from 0 to 1
>>>> > > would have reset inode 1235509 nlinks from 0 to 1
>>>> > > would have reset inode 1239349 nlinks from 0 to 1
>>>> > > would have reset inode 1240885 nlinks from 0 to 1
>>>> > > would have reset inode 1241397 nlinks from 0 to 1
>>>> > > would have reset inode 1241909 nlinks from 0 to 1
>>>> > > would have reset inode 1242421 nlinks from 0 to 1
>>>> > > would have reset inode 1244981 nlinks from 0 to 1
>>>> > > would have reset inode 1246517 nlinks from 0 to 1
>>>> > > would have reset inode 1253429 nlinks from 0 to 1
>>>> > > would have reset inode 1271861 nlinks from 0 to 1
>>>> > > would have reset inode 1274677 nlinks from 0 to 1
>>>> > > would have reset inode 1277749 nlinks from 0 to 1
>>>> > > would have reset inode 1278773 nlinks from 0 to 1
>>>> > > would have reset inode 1286709 nlinks from 0 to 1
>>>> > > would have reset inode 1288245 nlinks from 0 to 1
>>>> > > would have reset inode 1299765 nlinks from 0 to 1
>>>> > > would have reset inode 1302325 nlinks from 0 to 1
>>>> > > would have reset inode 1304885 nlinks from 0 to 1
>>>> > > would have reset inode 1305397 nlinks from 0 to 1
>>>> > > would have reset inode 1307509 nlinks from 0 to 1
>>>> > > would have reset inode 1309493 nlinks from 0 to 1
>>>> > > would have reset inode 1310517 nlinks from 0 to 1
>>>> > > would have reset inode 1311029 nlinks from 0 to 1
>>>> > > would have reset inode 1312053 nlinks from 0 to 1
>>>> > > would have reset inode 1316917 nlinks from 0 to 1
>>>> > > would have reset inode 1317941 nlinks from 0 to 1
>>>> > > would have reset inode 1320821 nlinks from 0 to 1
>>>> > > would have reset inode 1322805 nlinks from 0 to 1
>>>> > > would have reset inode 1332789 nlinks from 0 to 1
>>>> > > would have reset inode 1336373 nlinks from 0 to 1
>>>> > > would have reset inode 1345653 nlinks from 0 to 1
>>>> > > would have reset inode 1354549 nlinks from 0 to 1
>>>> > > would have reset inode 1361973 nlinks from 0 to 1
>>>> > > would have reset inode 1369909 nlinks from 0 to 1
>>>> > > would have reset inode 1372981 nlinks from 0 to 1
>>>> > > would have reset inode 1388853 nlinks from 0 to 1
>>>> > > would have reset inode 1402933 nlinks from 0 to 1
>>>> > > would have reset inode 1403445 nlinks from 0 to 1
>>>> > > would have reset inode 1420085 nlinks from 0 to 1
>>>> > > would have reset inode 1452853 nlinks from 0 to 1
>>>> > > would have reset inode 1456437 nlinks from 0 to 1
>>>> > > would have reset inode 1457973 nlinks from 0 to 1
>>>> > > would have reset inode 1459253 nlinks from 0 to 1
>>>> > > would have reset inode 1467957 nlinks from 0 to 1
>>>> > > would have reset inode 1471541 nlinks from 0 to 1
>>>> > > would have reset inode 1476661 nlinks from 0 to 1
>>>> > > would have reset inode 1479733 nlinks from 0 to 1
>>>> > > would have reset inode 1483061 nlinks from 0 to 1
>>>> > > would have reset inode 1484085 nlinks from 0 to 1
>>>> > > would have reset inode 1486133 nlinks from 0 to 1
>>>> > > would have reset inode 1489461 nlinks from 0 to 1
>>>> > > would have reset inode 1490037 nlinks from 0 to 1
>>>> > > would have reset inode 1492021 nlinks from 0 to 1
>>>> > > would have reset inode 1493557 nlinks from 0 to 1
>>>> > > would have reset inode 1494069 nlinks from 0 to 1
>>>> > > would have reset inode 1496885 nlinks from 0 to 1
>>>> > > would have reset inode 1498421 nlinks from 0 to 1
>>>> > > would have reset inode 1498933 nlinks from 0 to 1
>>>> > > would have reset inode 1499957 nlinks from 0 to 1
>>>> > > would have reset inode 1506101 nlinks from 0 to 1
>>>> > > would have reset inode 1507637 nlinks from 0 to 1
>>>> > > would have reset inode 1510453 nlinks from 0 to 1
>>>> > > would have reset inode 1514293 nlinks from 0 to 1
>>>> > > would have reset inode 1517365 nlinks from 0 to 1
>>>> > > would have reset inode 1520693 nlinks from 0 to 1
>>>> > > would have reset inode 1521973 nlinks from 0 to 1
>>>> > > would have reset inode 1530421 nlinks from 0 to 1
>>>> > > would have reset inode 1530933 nlinks from 0 to 1
>>>> > > would have reset inode 1537333 nlinks from 0 to 1
>>>> > > would have reset inode 1538357 nlinks from 0 to 1
>>>> > > would have reset inode 1548853 nlinks from 0 to 1
>>>> > > would have reset inode 1553973 nlinks from 0 to 1
>>>> > > would have reset inode 1557301 nlinks from 0 to 1
>>>> > > would have reset inode 1564213 nlinks from 0 to 1
>>>> > > would have reset inode 1564725 nlinks from 0 to 1
>>>> > > would have reset inode 1576501 nlinks from 0 to 1
>>>> > > would have reset inode 1580597 nlinks from 0 to 1
>>>> > > would have reset inode 1584693 nlinks from 0 to 1
>>>> > > would have reset inode 1586485 nlinks from 0 to 1
>>>> > > would have reset inode 1589301 nlinks from 0 to 1
>>>> > > would have reset inode 1589813 nlinks from 0 to 1
>>>> > > would have reset inode 1592629 nlinks from 0 to 1
>>>> > > would have reset inode 1595701 nlinks from 0 to 1
>>>> > > would have reset inode 1601077 nlinks from 0 to 1
>>>> > > would have reset inode 1623861 nlinks from 0 to 1
>>>> > > would have reset inode 1626677 nlinks from 0 to 1
>>>> > > would have reset inode 1627701 nlinks from 0 to 1
>>>> > > would have reset inode 1633333 nlinks from 0 to 1
>>>> > > would have reset inode 1639221 nlinks from 0 to 1
>>>> > > would have reset inode 1649205 nlinks from 0 to 1
>>>> > > would have reset inode 1686325 nlinks from 0 to 1
>>>> > > would have reset inode 1690677 nlinks from 0 to 1
>>>> > > would have reset inode 1693749 nlinks from 0 to 1
>>>> > > would have reset inode 1704757 nlinks from 0 to 1
>>>> > > would have reset inode 1707061 nlinks from 0 to 1
>>>> > > would have reset inode 1709109 nlinks from 0 to 1
>>>> > > would have reset inode 1719349 nlinks from 0 to 1
>>>> > > would have reset inode 1737013 nlinks from 0 to 1
>>>> > > would have reset inode 1741365 nlinks from 0 to 1
>>>> > > would have reset inode 1747509 nlinks from 0 to 1
>>>> > > would have reset inode 1770805 nlinks from 0 to 1
>>>> > > would have reset inode 1780789 nlinks from 0 to 1
>>>> > > would have reset inode 1793589 nlinks from 0 to 1
>>>> > > would have reset inode 1795125 nlinks from 0 to 1
>>>> > > would have reset inode 1800757 nlinks from 0 to 1
>>>> > > would have reset inode 1801269 nlinks from 0 to 1
>>>> > > would have reset inode 1802549 nlinks from 0 to 1
>>>> > > would have reset inode 1804085 nlinks from 0 to 1
>>>> > > would have reset inode 1817141 nlinks from 0 to 1
>>>> > > would have reset inode 1821749 nlinks from 0 to 1
>>>> > > would have reset inode 1832757 nlinks from 0 to 1
>>>> > > would have reset inode 1836341 nlinks from 0 to 1
>>>> > > would have reset inode 1856309 nlinks from 0 to 1
>>>> > > would have reset inode 1900597 nlinks from 0 to 1
>>>> > > would have reset inode 1902901 nlinks from 0 to 1
>>>> > > would have reset inode 1912373 nlinks from 0 to 1
>>>> > > would have reset inode 1943093 nlinks from 0 to 1
>>>> > > would have reset inode 1944373 nlinks from 0 to 1
>>>> > > would have reset inode 1954101 nlinks from 0 to 1
>>>> > > would have reset inode 1955893 nlinks from 0 to 1
>>>> > > would have reset inode 1961781 nlinks from 0 to 1
>>>> > > would have reset inode 1974325 nlinks from 0 to 1
>>>> > > would have reset inode 1978677 nlinks from 0 to 1
>>>> > > would have reset inode 1981237 nlinks from 0 to 1
>>>> > > would have reset inode 1992245 nlinks from 0 to 1
>>>> > > would have reset inode 2000949 nlinks from 0 to 1
>>>> > > would have reset inode 2002229 nlinks from 0 to 1
>>>> > > would have reset inode 2004789 nlinks from 0 to 1
>>>> > > would have reset inode 2005301 nlinks from 0 to 1
>>>> > > would have reset inode 2011189 nlinks from 0 to 1
>>>> > > would have reset inode 2012981 nlinks from 0 to 1
>>>> > > would have reset inode 2015285 nlinks from 0 to 1
>>>> > > would have reset inode 2018869 nlinks from 0 to 1
>>>> > > would have reset inode 2028341 nlinks from 0 to 1
>>>> > > would have reset inode 2028853 nlinks from 0 to 1
>>>> > > would have reset inode 2030901 nlinks from 0 to 1
>>>> > > would have reset inode 2032181 nlinks from 0 to 1
>>>> > > would have reset inode 2032693 nlinks from 0 to 1
>>>> > > would have reset inode 2040117 nlinks from 0 to 1
>>>> > > would have reset inode 2053685 nlinks from 0 to 1
>>>> > > would have reset inode 2083893 nlinks from 0 to 1
>>>> > > would have reset inode 2087221 nlinks from 0 to 1
>>>> > > would have reset inode 2095925 nlinks from 0 to 1
>>>> > > would have reset inode 2098741 nlinks from 0 to 1
>>>> > > would have reset inode 2100533 nlinks from 0 to 1
>>>> > > would have reset inode 2101301 nlinks from 0 to 1
>>>> > > would have reset inode 2123573 nlinks from 0 to 1
>>>> > > would have reset inode 2132789 nlinks from 0 to 1
>>>> > > would have reset inode 2133813 nlinks from 0 to 1
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > 2013/4/10 符永涛 <yongtaofu@gmail.com>
>>>> > >
>>>> > > The storage info is as following:
>>>> > > RAID-6
>>>> > > SATA HDD
>>>> > > Controller: PERC H710P Mini (Embedded)
>>>> > > Disk /dev/sdb: 30000.3 GB, 30000346562560 bytes
>>>> > > 255 heads, 63 sectors/track, 3647334 cylinders
>>>> > > Units = cylinders of 16065 * 512 = 8225280 bytes
>>>> > > Sector size (logical/physical): 512 bytes / 512 bytes
>>>> > > I/O size (minimum/optimal): 512 bytes / 512 bytes
>>>> > > Disk identifier: 0x00000000
>>>> > >
>>>> > > sd 0:2:1:0: [sdb] 58594426880 512-byte logical blocks: (30.0 TB/27.2 TiB)
>>>> > > sd 0:2:1:0: [sdb] Write Protect is off
>>>> > > sd 0:2:1:0: [sdb] Mode Sense: 1f 00 00 08
>>>> > > sd 0:2:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>>> > > sd 0:2:1:0: [sdb] Attached SCSI disk
>>>> > >
>>>> > > *-storage
>>>> > > description: RAID bus controller
>>>> > > product: MegaRAID SAS 2208 [Thunderbolt]
>>>> > > vendor: LSI Logic / Symbios Logic
>>>> > > physical id: 0
>>>> > > bus info: pci@0000:02:00.0
>>>> > > logical name: scsi0
>>>> > > version: 01
>>>> > > width: 64 bits
>>>> > > clock: 33MHz
>>>> > > capabilities: storage pm pciexpress vpd msi msix bus_master cap_list rom
>>>> > > configuration: driver=megaraid_sas latency=0
>>>> > > resources: irq:42 ioport:fc00(size=256)
>>>> > > memory:dd7fc000-dd7fffff memory:dd780000-dd7bffff
>>>> > > memory:dc800000-dc81ffff(prefetchable)
>>>> > > *-disk:0
>>>> > > description: SCSI Disk
>>>> > > product: PERC H710P
>>>> > > vendor: DELL
>>>> > > physical id: 2.0.0
>>>> > > bus info: scsi@0:2.0.0
>>>> > > logical name: /dev/sda
>>>> > > version: 3.13
>>>> > > serial: 0049d6ce1d9f2035180096fde490f648
>>>> > > size: 558GiB (599GB)
>>>> > > capabilities: partitioned partitioned:dos
>>>> > > configuration: ansiversion=5 signature=000aa336
>>>> > > *-disk:1
>>>> > > description: SCSI Disk
>>>> > > product: PERC H710P
>>>> > > vendor: DELL
>>>> > > physical id: 2.1.0
>>>> > > bus info: scsi@0:2.1.0
>>>> > > logical name: /dev/sdb
>>>> > > logical name: /mnt/xfsd
>>>> > > version: 3.13
>>>> > > serial: 003366f71da22035180096fde490f648
>>>> > > size: 27TiB (30TB)
>>>> > > configuration: ansiversion=5 mount.fstype=xfs
>>>> > > mount.options=rw,relatime,attr2,delaylog,logbsize=64k,sunit=128,swidth=1280,noquota
>>>> > > state=mounted
>>>> > >
>>>> > > Thank you.
>>>> > >
>>>> > >
>>>> > > 2013/4/10 Emmanuel Florac <eflorac@intellique.com>
>>>> > >
>>>> > > On Tue, 9 Apr 2013 23:10:03 +0800,
>>>> > > 符永涛 <yongtaofu@gmail.com> wrote:
>>>> > >
>>>> > > > > Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected.
>>>> > > > > Shutting down filesystem
>>>> > >
>>>> > > This. I/O error detected. That means that at some point the
>>>> > > underlying device (disk, RAID array, SAN volume) couldn't be
>>>> > > reached. So this could very well be a case of a flaky drive,
>>>> > > array, cable or SCSI driver.
>>>> > >
>>>> > > What's the storage setup here?
>>>> > >
>>>> > > --
>>>> > >
>>>> >
>>>> ------------------------------------------------------------------------
>>>> > > Emmanuel Florac | Direction technique
>>>> > >                 | Intellique
>>>> > >                 | <eflorac@intellique.com>
>>>> > >                 | +33 1 78 94 84 02
>>>> > >
>>>> >
>>>> ------------------------------------------------------------------------
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > > 符永涛
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > > 符永涛
>>>> > >
>>>> > >
>>>> > > _______________________________________________
>>>> > > xfs mailing list
>>>> > > xfs@oss.sgi.com
>>>> > > http://oss.sgi.com/mailman/listinfo/xfs
>>>> > >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > 符永涛
>>>> >
>>>> >
>>>> >
>>>>
>>>>
>>>
>>>
>>> --
>>> 符永涛
>>>
>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 12:41 ` Brian Foster
@ 2013-04-12 14:48 ` 符永涛
2013-04-15 2:08 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-12 14:48 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
Hi Brian,
Your script works for me now, after I installed all the RPMs built from
the kernel SRPM. I'll try it. Thank you.
2013/4/12 Brian Foster <bfoster@redhat.com>
> On 04/12/2013 04:32 AM, 符永涛 wrote:
> > Dear xfs experts,
> > Can I just call xfs_stack_trace() in the second line of
> > xfs_do_force_shutdown() to print the stack, and rebuild the kernel to
> > check what the error is?
> >
>
> I suppose that's a start. If you're willing/able to create and run a
> modified kernel for the purpose of collecting more debug info, perhaps
> we can get a bit more creative in collecting more data on the problem
> (but a stack trace there is a good start).
>
> BTW- you might want to place the call after the XFS_FORCED_SHUTDOWN(mp)
> check almost halfway into the function to avoid duplicate messages.
>
> Brian
>
> >
> > 2013/4/12 符永涛 <yongtaofu@gmail.com>
> >
> > Hi Brian,
> > What else am I missing? Thank you.
> > stap -e 'probe module("xfs").function("xfs_iunlink"){}'
> >
> > WARNING: cannot find module xfs debuginfo: No DWARF information found
> > semantic error: no match while resolving probe point
> > module("xfs").function("xfs_iunlink")
> > Pass 2: analysis failed. Try again with another '--vp 01' option.
> >
> >
> > 2013/4/12 符永涛 <yongtaofu@gmail.com>
> >
> > ls -l
> >
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> > -r--r--r-- 1 root root 21393024 Apr 12 12:08
> >
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> >
> > rpm -qa|grep kernel
> > kernel-headers-2.6.32-279.el6.x86_64
> > kernel-devel-2.6.32-279.el6.x86_64
> > kernel-2.6.32-358.el6.x86_64
> > kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
> > abrt-addon-kerneloops-2.0.8-6.el6.x86_64
> > kernel-firmware-2.6.32-358.el6.noarch
> > kernel-debug-2.6.32-358.el6.x86_64
> > kernel-debuginfo-2.6.32-279.el6.x86_64
> > dracut-kernel-004-283.el6.noarch
> > libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
> > kernel-devel-2.6.32-358.el6.x86_64
> > kernel-2.6.32-279.el6.x86_64
> >
> > rpm -q kernel-debuginfo
> > kernel-debuginfo-2.6.32-279.el6.x86_64
> >
> > rpm -q kernel
> > kernel-2.6.32-279.el6.x86_64
> > kernel-2.6.32-358.el6.x86_64
> >
> > do I need to re probe it?
> >
> >
> > 2013/4/12 Eric Sandeen <sandeen@sandeen.net>
> >
> > On 4/11/13 11:32 PM, 符永涛 wrote:
> > > Hi Brian,
> > > Sorry but when I execute the script it says:
> > > WARNING: cannot find module xfs debuginfo: No DWARF information found
> > > semantic error: no match while resolving probe point
> > > module("xfs").function("xfs_iunlink")
> > >
> > > uname -a
> > > 2.6.32-279.el6.x86_64
> > > kernel debuginfo has been installed.
> > >
> > > Where can I find the correct xfs debuginfo?
> >
> > it should be in the kernel-debuginfo rpm (of the same
> > version/release as the kernel rpm you're running)
> >
> > You should have:
> >
> >
> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
> >
> > If not, can you show:
> >
> > # uname -a
> > # rpm -q kernel
> > # rpm -q kernel-debuginfo
> >
> > -Eric
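[Editorial note: Eric's point is that the kernel-debuginfo package must match the running kernel exactly, which is easy to get wrong when several kernels are installed, as in this thread. A sketch of that check follows; the version strings are copied from the thread, while on a live system they would come from `uname -r` and `rpm -q kernel-debuginfo`.]

```shell
# Verify the kernel-debuginfo version-release matches the running kernel.
# Values below are copied from the thread for illustration; replace with:
#   running=$(uname -r)
#   pkg=$(rpm -q kernel-debuginfo)
running="2.6.32-279.el6.x86_64"
pkg="kernel-debuginfo-2.6.32-279.el6.x86_64"
ver="${pkg#kernel-debuginfo-}"   # strip the package-name prefix
if [ "$ver" = "$running" ]; then
    echo "debuginfo matches running kernel ($running)"
else
    echo "MISMATCH: kernel $running vs debuginfo $ver"
fi
```

Note that, as the rest of the thread shows, a matching kernel-debuginfo can still be insufficient if the loaded xfs module was built separately from the packaged kernel; the reporter resolved it by installing all RPMs built from the kernel SRPM.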
> >
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
> >
> > --
> > 符永涛
> >
> >
> >
>
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-12 14:48 ` 符永涛
@ 2013-04-15 2:08 ` 符永涛
2013-04-15 5:04 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-15 2:08 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
Dear xfs experts,
Now I'm deploying Brian's SystemTap script in our cluster, but between
last night and now XFS has shut down with the same error on 5 of our 24
servers. I ran xfs_repair and found that all the lost inodes are
glusterfs DHT link files. This explains why the XFS shutdowns tend to
happen during glusterfs rebalance: during the rebalance procedure a lot
of DHT link files may be unlinked. For example, the following inodes
were found in lost+found on one of the servers:
[root@* lost+found]# pwd
/mnt/xfsd/lost+found
[root@* lost+found]# ls -l
total 740
---------T 1 root root 0 Apr 8 21:06 100119
---------T 1 root root 0 Apr 8 21:11 101123
---------T 1 root root 0 Apr 8 21:19 102659
---------T 1 root root 0 Apr 12 14:46 1040919
---------T 1 root root 0 Apr 12 14:58 1041943
---------T 1 root root 0 Apr 8 21:32 105219
---------T 1 root root 0 Apr 8 21:37 105731
---------T 1 root root 0 Apr 12 17:48 1068055
---------T 1 root root 0 Apr 12 18:38 1073943
---------T 1 root root 0 Apr 8 21:54 108035
---------T 1 root root 0 Apr 12 21:49 1091095
---------T 1 root root 0 Apr 13 00:17 1111063
---------T 1 root root 0 Apr 13 03:51 1121815
---------T 1 root root 0 Apr 8 22:25 112387
---------T 1 root root 0 Apr 13 06:39 1136151
...
[root@* lost+found]# getfattr -m . -d -e hex *
# file: 96007
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0xa0370d8a9f104dafbebbd0e6dd7ce1f7
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x0000000049dff000
# file: 97027
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0xc1c1fe2ec7034442a623385f43b04c25
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006ac78000
# file: 97559
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0xcf7c17013c914511bda4d1c743fae118
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000519fb000
# file: 98055
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0xe86abc6e2c4b44c28d415fbbe34f2102
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000004c098000
# file: 98567
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0x12543a2efbdf4b9fa61c6d89ca396f80
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006bc98000
# file: 98583
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0x760d16d3b7974cfb9c0a665a0982c470
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006cde9000
# file: 99607
trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
trusted.gfid=0x0849a732ea204bc3b8bae830b46881da
trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000513f1000
...
What do you think about it? Thank you very much.
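An aside that may help others reading the thread: the
trusted.glusterfs.dht.linkto values in the dump above are hex-encoded,
NUL-terminated volume names, so they can be decoded directly. The hex string
below is copied from the first getfattr entry above; python3 is just one
convenient decoder:

```shell
# Decode a trusted.glusterfs.dht.linkto xattr from hex to the volume name.
# The value is the first one from the getfattr output above; the trailing
# 00 byte is the C string terminator.
hex=6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
python3 -c "import sys; print(bytes.fromhex(sys.argv[1]).rstrip(b'\x00').decode())" "$hex"
# -> mams-cq-mt-video-replicate-6
```

So the lost files on this brick pointed at the mams-cq-mt-video replicate
subvolumes, consistent with them being dht link files left over from rebalance.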
2013/4/12 符永涛 <yongtaofu@gmail.com>
> Hi Brian,
>
> Your scripts works for me now after I installed all the rpm built out from
> kernel srpm. I'll try it. Thank you.
>
>
> 2013/4/12 Brian Foster <bfoster@redhat.com>
>
>> On 04/12/2013 04:32 AM, 符永涛 wrote:
>> > Dear xfs experts,
>> > Can I just call xfs_stack_trace(); in the second line of
>> > xfs_do_force_shutdown() to print stack and rebuild kernel to check
>> > what's the error?
>> >
>>
>> I suppose that's a start. If you're willing/able to create and run a
>> modified kernel for the purpose of collecting more debug info, perhaps
>> we can get a bit more creative in collecting more data on the problem
>> (but a stack trace there is a good start).
>>
>> BTW- you might want to place the call after the XFS_FORCED_SHUTDOWN(mp)
>> check almost halfway into the function to avoid duplicate messages.
>>
>> Brian
>>
>> >
>> > 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>> >
>> > Hi Brian,
>> > What else I'm missing? Thank you.
>> > stap -e 'probe module("xfs").function("xfs_iunlink"){}'
>> >
>> > WARNING: cannot find module xfs debuginfo: No DWARF information
>> found
>> > semantic error: no match while resolving probe point
>> > module("xfs").function("xfs_iunlink")
>> > Pass 2: analysis failed. Try again with another '--vp 01' option.
>> >
>> >
>> > 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>> >
>> > ls -l
>> >
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>> > -r--r--r-- 1 root root 21393024 Apr 12 12:08
>> >
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>> >
>> > rpm -qa|grep kernel
>> > kernel-headers-2.6.32-279.el6.x86_64
>> > kernel-devel-2.6.32-279.el6.x86_64
>> > kernel-2.6.32-358.el6.x86_64
>> > kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
>> > abrt-addon-kerneloops-2.0.8-6.el6.x86_64
>> > kernel-firmware-2.6.32-358.el6.noarch
>> > kernel-debug-2.6.32-358.el6.x86_64
>> > kernel-debuginfo-2.6.32-279.el6.x86_64
>> > dracut-kernel-004-283.el6.noarch
>> > libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
>> > kernel-devel-2.6.32-358.el6.x86_64
>> > kernel-2.6.32-279.el6.x86_64
>> >
>> > rpm -q kernel-debuginfo
>> > kernel-debuginfo-2.6.32-279.el6.x86_64
>> >
>> > rpm -q kernel
>> > kernel-2.6.32-279.el6.x86_64
>> > kernel-2.6.32-358.el6.x86_64
>> >
>> > do I need to re probe it?
>> >
>> >
>> > 2013/4/12 Eric Sandeen <sandeen@sandeen.net
>> > <mailto:sandeen@sandeen.net>>
>> >
>> > On 4/11/13 11:32 PM, 符永涛 wrote:
>> > > Hi Brian,
>> > > Sorry but when I execute the script it says:
>> > > WARNING: cannot find module xfs debuginfo: No DWARF
>> > information found
>> > > semantic error: no match while resolving probe point
>> > module("xfs").function("xfs_iunlink")
>> > >
>> > > uname -a
>> > > 2.6.32-279.el6.x86_64
>> > > kernel debuginfo has been installed.
>> > >
>> > > Where can I find the correct xfs debuginfo?
>> >
>> > it should be in the kernel-debuginfo rpm (of the same
>> > version/release as the kernel rpm you're running)
>> >
>> > You should have:
>> >
>> >
>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>> >
>> > If not, can you show:
>> >
>> > # uname -a
>> > # rpm -q kernel
>> > # rpm -q kernel-debuginfo
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>> >
>> >
>> >
>> >
>> > --
>> > 符永涛
>> >
>> >
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 2:08 ` 符永涛
@ 2013-04-15 5:04 ` 符永涛
2013-04-15 12:54 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-15 5:04 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
Also, glusterfs uses a lot of hardlinks for self-heal:
---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/998416323
---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999296624
---------T 2 root root 0 Apr 15 12:24 /mnt/xfsd/testbug/999568484
---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999956875
---------T 2 root root 0 Apr 15 11:58
/mnt/xfsd/testbug/.glusterfs/05/2f/052f4e3e-c379-4a3c-b995-a10fdaca33d0
---------T 2 root root 0 Apr 15 11:58
/mnt/xfsd/testbug/.glusterfs/05/95/0595272e-ce2b-45d5-8693-d02c00b94d9d
---------T 2 root root 0 Apr 15 11:58
/mnt/xfsd/testbug/.glusterfs/05/ca/05ca00a0-92a7-44cf-b6e3-380496aafaa4
---------T 2 root root 0 Apr 15 12:24
/mnt/xfsd/testbug/.glusterfs/0a/23/0a238ca7-3cef-4540-9c98-6bf631551b21
---------T 2 root root 0 Apr 15 11:58
/mnt/xfsd/testbug/.glusterfs/0a/4b/0a4b640b-f675-4708-bb59-e2369ffbbb9d
Is this related?
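For reference, these link files are easy to enumerate: they are zero-length
files whose mode is exactly the sticky bit (shown as ---------T above). A
sketch; BRICK is a placeholder and should point at the brick directory
(e.g. /mnt/xfsd/testbug):

```shell
# List glusterfs dht link-file candidates: zero-length files whose mode is
# exactly 01000 (sticky bit only). BRICK is an assumption, not a real path.
BRICK=${BRICK:-.}
find "$BRICK" -maxdepth 1 -type f -perm 1000 -size 0
```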
2013/4/15 符永涛 <yongtaofu@gmail.com>
> [...]
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 5:04 ` 符永涛
@ 2013-04-15 12:54 ` 符永涛
2013-04-15 13:33 ` 符永涛
2013-04-15 14:13 ` Brian Foster
0 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-15 12:54 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
Dear Brian and xfs experts,
Brian, your script works and I am able to reproduce the issue with
glusterfs rebalance on our test cluster. Two of our servers hit the XFS
shutdown during rebalance, and the userspace stack traces at shutdown are
both in pthread. See the logs below; what's your opinion? Thank you very much!
logs:
[root@10.23.72.93 ~]# cat xfs.log
--- xfs_imap --
module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
-- return=0x16
vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
imap=0xffff88100e2f7c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
mp: m_agno_log = 0x5, m_agino_log = 0x20
mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
imap: im_blkno = 0x0, im_len = 0xa078, im_boffset = 0x86ea
kernel backtrace:
Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
0xffffffff81501a69
0x0 (inexact)
user backtrace:
0x3bd1a0e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
--- xfs_iunlink_remove --
module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
-- return=0x16
vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
last_dip=0xffff882000000000 bucket_index=? offset=?
last_offset=0xffffffffffff8810 error=? __func__=[...]
ip: i_ino = 0x113, i_flags = 0x0
ip->i_d: di_nlink = 0x0, di_gen = 0x0
[root@10.23.72.93 ~]#
[root@10.23.72.94 ~]# cat xfs.log
--- xfs_imap --
module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
-- return=0x16
vars: mp=0xffff881017c6c800 tp=0xffff8801037acea0 ino=0xffffffff
imap=0xffff882017101c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
mp: m_agno_log = 0x5, m_agino_log = 0x20
mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
imap: im_blkno = 0x0, im_len = 0xd98, im_boffset = 0x547
kernel backtrace:
Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
0xffffffff81501a69
0x0 (inexact)
user backtrace:
0x30cd40e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
--- xfs_iunlink_remove --
module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
-- return=0x16
vars: tp=0xffff8801037acea0 ip=0xffff880e697c8800 next_ino=? mp=? agi=?
dip=? agibp=0xffff880d846c2d60 ibp=? agno=? agino=? next_agino=? last_ibp=?
last_dip=0xffff881017c6c800 bucket_index=? offset=?
last_offset=0xffffffffffff880e error=? __func__=[...]
ip: i_ino = 0x142, i_flags = 0x0
ip->i_d: di_nlink = 0x0, di_gen = 0x3565732e
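One observation on these traces (my interpretation, to be confirmed by the
experts): xfs_imap is entered with ino=0xffffffff in both logs, which is
NULLAGINO, the sentinel that terminates the AGI unlinked lists, so the
mapping can never succeed. Plugging the geometry values printed above into
the same bounds check xfs_imap performs shows where error 22 comes from:

```shell
# Sketch of the inode-number bounds check in xfs_imap, using the superblock
# geometry from the trace (sb_inopblog=0x4, sb_agblklog=0x1c,
# sb_agblocks=0xffffff0). ino=0xffffffff is NULLAGINO, the end-of-list
# marker for the AGI unlinked buckets, not a real inode number.
ino=$((0xffffffff))
inopblog=4
agblklog=$((0x1c))
agblocks=$((0xffffff0))
agno=$(( ino >> (agblklog + inopblog) ))
agbno=$(( (ino >> inopblog) & ((1 << agblklog) - 1) ))
echo "agno=$agno agbno=$agbno agblocks=$agblocks"
[ "$agbno" -ge "$agblocks" ] && echo "agbno out of range -> EINVAL (22)"
```

With these values agbno comes out as 268435455, past the last valid AG block
(268435439), which matches the error-22 return seen in the probe output.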
2013/4/15 符永涛 <yongtaofu@gmail.com>
> [...]
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 12:54 ` 符永涛
@ 2013-04-15 13:33 ` 符永涛
2013-04-15 13:36 ` 符永涛
2013-04-15 14:13 ` Brian Foster
1 sibling, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-15 13:33 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
and the XFS kernel trace is:
Apr 15 20:43:03 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
returned error 22.
Apr 15 20:43:03 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
error 22
Apr 15 20:43:03 10 kernel: Pid: 3093, comm: glusterfsd Not tainted
2.6.32-279.el6.x86_64 #1
Apr 15 20:43:03 10 kernel: Call Trace:
Apr 15 20:43:03 10 kernel: [<ffffffffa02d4212>] ? xfs_inactive+0x442/0x460
[xfs]
Apr 15 20:43:03 10 kernel: [<ffffffffa02e1790>] ?
xfs_fs_clear_inode+0xa0/0xd0 [xfs]
Apr 15 20:43:03 10 kernel: [<ffffffff81195adc>] ? clear_inode+0xac/0x140
Apr 15 20:43:03 10 kernel: [<ffffffff81196296>] ?
generic_delete_inode+0x196/0x1d0
Apr 15 20:43:03 10 kernel: [<ffffffff81196335>] ?
generic_drop_inode+0x65/0x80
Apr 15 20:43:03 10 kernel: [<ffffffff81195182>] ? iput+0x62/0x70
Apr 15 20:43:03 10 kernel: [<ffffffff81191ce0>] ? dentry_iput+0x90/0x100
Apr 15 20:43:03 10 kernel: [<ffffffff81191e41>] ? d_kill+0x31/0x60
Apr 15 20:43:03 10 kernel: [<ffffffff8119386c>] ? dput+0x7c/0x150
Apr 15 20:43:03 10 kernel: [<ffffffff8117c9c9>] ? __fput+0x189/0x210
Apr 15 20:43:03 10 kernel: [<ffffffff8117ca75>] ? fput+0x25/0x30
Apr 15 20:43:03 10 kernel: [<ffffffff8117849d>] ? filp_close+0x5d/0x90
Apr 15 20:43:03 10 kernel: [<ffffffff81178575>] ? sys_close+0xa5/0x100
Apr 15 20:43:03 10 kernel: [<ffffffff8100b308>] ? tracesys+0xd9/0xde
Apr 15 20:43:03 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
from line 1186 of file fs/xfs/xfs_vnodeops.c. Return address =
0xffffffffa02d422b
Apr 15 20:43:03 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
filesystem
Apr 15 20:43:03 10 kernel: XFS (sdb): Please umount the filesystem and
rectify the problem(s)
Apr 15 20:43:13 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
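As a sanity check on the numbers, the EINVAL already looks explainable from the probe data alone: xfs_imap() was handed ino=0xffffffff, and with the superblock geometry in the quoted probe output (sb_agblklog=0x1c, sb_inopblog=0x4, sb_agblocks=0xffffff0) that cannot map to a valid block. A minimal shell sketch of the decode, assuming the standard XFS inode-number layout (agno in the bits above sb_agblklog + sb_inopblog); the constants are copied from the probe output:

```shell
# Back-of-the-envelope decode of the inode number seen by xfs_imap()
# (ino=0xffffffff), using the superblock geometry from the probe output:
# sb_agblklog=0x1c, sb_inopblog=0x4, sb_agblocks=0xffffff0.
ino=$(( 0xffffffff ))
agblklog=$(( 0x1c )); inopblog=$(( 0x4 )); agblocks=$(( 0xffffff0 ))
agino_log=$(( agblklog + inopblog ))        # 0x20, matches m_agino_log in the probe output
agno=$(( ino >> agino_log ))                # allocation group index
agino=$(( ino & ((1 << agino_log) - 1) ))   # inode number within the AG
agbno=$(( agino >> inopblog ))              # block number within the AG
printf 'agno=%d agino=%#x agbno=%#x agblocks=%#x\n' "$agno" "$agino" "$agbno" "$agblocks"
# -> agno=0 agino=0xffffffff agbno=0xfffffff agblocks=0xffffff0
# agbno exceeds sb_agblocks, so xfs_imap() rejects the lookup with EINVAL
# (error 22).
```

If that reading is right, the inode number looks truncated to 32 bits by the time it reaches xfs_iunlink_remove(), and the interesting question becomes where the high bits were dropped.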
2013/4/15 符永涛 <yongtaofu@gmail.com>
> Dear Brian and xfs experts,
> Brian, your script works and I am able to reproduce it with glusterfs
> rebalance on our test cluster. XFS shut down on 2 of our servers during
> glusterfs rebalance, and in both cases the userspace stack trace at
> shutdown is related to pthread. See the logs below; what's your opinion?
> Thank you very much!
> logs:
> [root@10.23.72.93 ~]# cat xfs.log
>
> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> -- return=0x16
> vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
> imap=0xffff88100e2f7c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
> mp: m_agno_log = 0x5, m_agino_log = 0x20
> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
> sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
> imap: im_blkno = 0x0, im_len = 0xa078, im_boffset = 0x86ea
> kernel backtrace:
> Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
> Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
> 0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
> 0xffffffff81501a69
> 0x0 (inexact)
> user backtrace:
> 0x3bd1a0e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>
> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> -- return=0x16
> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
> dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff882000000000 bucket_index=? offset=?
> last_offset=0xffffffffffff8810 error=? __func__=[...]
> ip: i_ino = 0x113, i_flags = 0x0
> ip->i_d: di_nlink = 0x0, di_gen = 0x0
> [root@10.23.72.93 ~]#
> [root@10.23.72.94 ~]# cat xfs.log
>
> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> -- return=0x16
> vars: mp=0xffff881017c6c800 tp=0xffff8801037acea0 ino=0xffffffff
> imap=0xffff882017101c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
> mp: m_agno_log = 0x5, m_agino_log = 0x20
> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
> sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
> imap: im_blkno = 0x0, im_len = 0xd98, im_boffset = 0x547
> kernel backtrace:
> Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
> Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
> 0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
> 0xffffffff81501a69
> 0x0 (inexact)
> user backtrace:
> 0x30cd40e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>
> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> -- return=0x16
> vars: tp=0xffff8801037acea0 ip=0xffff880e697c8800 next_ino=? mp=? agi=?
> dip=? agibp=0xffff880d846c2d60 ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff881017c6c800 bucket_index=? offset=?
> last_offset=0xffffffffffff880e error=? __func__=[...]
> ip: i_ino = 0x142, i_flags = 0x0
> ip->i_d: di_nlink = 0x0, di_gen = 0x3565732e
>
>
>
> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>
>> Also, glusterfs uses a lot of hard links for self-heal:
>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/998416323
>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999296624
>> ---------T 2 root root 0 Apr 15 12:24 /mnt/xfsd/testbug/999568484
>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999956875
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/05/2f/052f4e3e-c379-4a3c-b995-a10fdaca33d0
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/05/95/0595272e-ce2b-45d5-8693-d02c00b94d9d
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/05/ca/05ca00a0-92a7-44cf-b6e3-380496aafaa4
>> ---------T 2 root root 0 Apr 15 12:24
>> /mnt/xfsd/testbug/.glusterfs/0a/23/0a238ca7-3cef-4540-9c98-6bf631551b21
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/0a/4b/0a4b640b-f675-4708-bb59-e2369ffbbb9d
>> Is that related?
>>
>>
>> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>>
>>> Dear xfs experts,
>>> Now I'm deploying Brian's SystemTap script in our cluster. But from last
>>> night until now, XFS on 5 of our 24 servers has shut down with the same
>>> error. I ran xfs_repair and found that all the lost inodes are glusterfs
>>> dht link files. This explains why the xfs shutdowns tend to happen during
>>> glusterfs rebalance: during the rebalance procedure a lot of dht link
>>> files may be unlinked. For example, the following inodes were found in
>>> lost+found on one of the servers:
>>> [root@* lost+found]# pwd
>>> /mnt/xfsd/lost+found
>>> [root@* lost+found]# ls -l
>>> total 740
>>> ---------T 1 root root 0 Apr 8 21:06 100119
>>> ---------T 1 root root 0 Apr 8 21:11 101123
>>> ---------T 1 root root 0 Apr 8 21:19 102659
>>> ---------T 1 root root 0 Apr 12 14:46 1040919
>>> ---------T 1 root root 0 Apr 12 14:58 1041943
>>> ---------T 1 root root 0 Apr 8 21:32 105219
>>> ---------T 1 root root 0 Apr 8 21:37 105731
>>> ---------T 1 root root 0 Apr 12 17:48 1068055
>>> ---------T 1 root root 0 Apr 12 18:38 1073943
>>> ---------T 1 root root 0 Apr 8 21:54 108035
>>> ---------T 1 root root 0 Apr 12 21:49 1091095
>>> ---------T 1 root root 0 Apr 13 00:17 1111063
>>> ---------T 1 root root 0 Apr 13 03:51 1121815
>>> ---------T 1 root root 0 Apr 8 22:25 112387
>>> ---------T 1 root root 0 Apr 13 06:39 1136151
>>> ...
>>> [root@* lost+found]# getfattr -m . -d -e hex *
>>>
>>> # file: 96007
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xa0370d8a9f104dafbebbd0e6dd7ce1f7
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x0000000049dff000
>>>
>>> # file: 97027
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xc1c1fe2ec7034442a623385f43b04c25
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006ac78000
>>>
>>> # file: 97559
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xcf7c17013c914511bda4d1c743fae118
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000519fb000
>>>
>>> # file: 98055
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xe86abc6e2c4b44c28d415fbbe34f2102
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000004c098000
>>>
>>> # file: 98567
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0x12543a2efbdf4b9fa61c6d89ca396f80
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006bc98000
>>>
>>> # file: 98583
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0x760d16d3b7974cfb9c0a665a0982c470
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006cde9000
>>>
>>> # file: 99607
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0x0849a732ea204bc3b8bae830b46881da
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000513f1000
>>> ...
>>>
>>> What do you think about it? Thank you very much.
>>>
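Incidentally, the trusted.glusterfs.dht.linkto values in the getfattr dump above are just hex-encoded NUL-terminated strings; a small bash decode of the first one (value copied verbatim from the listing):

```shell
# Decode a hex-encoded dht.linkto xattr value from the listing above.
hex=6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
# Turn "6d61..." into "\x6d\x61..." escapes, let printf '%b' expand them,
# and drop the trailing NUL terminator.
str=$(printf '%b' "$(echo "$hex" | sed 's/../\\x&/g')" | tr -d '\0')
echo "$str"   # -> mams-cq-mt-video-replicate-6
```

i.e. each zero-byte link file points at a replicate subvolume, consistent with these being dht link files left behind by rebalance.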
>>>
>>> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>>>
>>>> Hi Brian,
>>>>
>>>>> Your script works for me now after I installed all the RPMs built
>>>>> from the kernel SRPM. I'll try it. Thank you.
>>>>
>>>>
>>>> 2013/4/12 Brian Foster <bfoster@redhat.com>
>>>>
>>>>> On 04/12/2013 04:32 AM, 符永涛 wrote:
>>>>> > Dear xfs experts,
>>>>> > Can I just call xfs_stack_trace(); in the second line of
>>>>> > xfs_do_force_shutdown() to print stack and rebuild kernel to check
>>>>> > what's the error?
>>>>> >
>>>>>
>>>>> I suppose that's a start. If you're willing/able to create and run a
>>>>> modified kernel for the purpose of collecting more debug info, perhaps
>>>>> we can get a bit more creative in collecting more data on the problem
>>>>> (but a stack trace there is a good start).
>>>>>
>>>>> BTW- you might want to place the call after the XFS_FORCED_SHUTDOWN(mp)
>>>>> check almost halfway into the function to avoid duplicate messages.
>>>>>
>>>>> Brian
>>>>>
>>>>> >
>>>>> > 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>>>>> >
>>>>> > Hi Brian,
>>>>> > What else I'm missing? Thank you.
>>>>> > stap -e 'probe module("xfs").function("xfs_iunlink"){}'
>>>>> >
>>>>> > WARNING: cannot find module xfs debuginfo: No DWARF information
>>>>> found
>>>>> > semantic error: no match while resolving probe point
>>>>> > module("xfs").function("xfs_iunlink")
>>>>> > Pass 2: analysis failed. Try again with another '--vp 01'
>>>>> option.
>>>>> >
>>>>> >
>>>>> > 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>>>>> >
>>>>> > ls -l
>>>>> >
>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>> > -r--r--r-- 1 root root 21393024 Apr 12 12:08
>>>>> >
>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>> >
>>>>> > rpm -qa|grep kernel
>>>>> > kernel-headers-2.6.32-279.el6.x86_64
>>>>> > kernel-devel-2.6.32-279.el6.x86_64
>>>>> > kernel-2.6.32-358.el6.x86_64
>>>>> > kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
>>>>> > abrt-addon-kerneloops-2.0.8-6.el6.x86_64
>>>>> > kernel-firmware-2.6.32-358.el6.noarch
>>>>> > kernel-debug-2.6.32-358.el6.x86_64
>>>>> > kernel-debuginfo-2.6.32-279.el6.x86_64
>>>>> > dracut-kernel-004-283.el6.noarch
>>>>> > libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
>>>>> > kernel-devel-2.6.32-358.el6.x86_64
>>>>> > kernel-2.6.32-279.el6.x86_64
>>>>> >
>>>>> > rpm -q kernel-debuginfo
>>>>> > kernel-debuginfo-2.6.32-279.el6.x86_64
>>>>> >
>>>>> > rpm -q kernel
>>>>> > kernel-2.6.32-279.el6.x86_64
>>>>> > kernel-2.6.32-358.el6.x86_64
>>>>> >
>>>>> > do I need to re probe it?
>>>>> >
>>>>> >
>>>>> > 2013/4/12 Eric Sandeen <sandeen@sandeen.net
>>>>> > <mailto:sandeen@sandeen.net>>
>>>>> >
>>>>> > On 4/11/13 11:32 PM, 符永涛 wrote:
>>>>> > > Hi Brian,
>>>>> > > Sorry but when I execute the script it says:
>>>>> > > WARNING: cannot find module xfs debuginfo: No DWARF
>>>>> > information found
>>>>> > > semantic error: no match while resolving probe point
>>>>> > module("xfs").function("xfs_iunlink")
>>>>> > >
>>>>> > > uname -a
>>>>> > > 2.6.32-279.el6.x86_64
>>>>> > > kernel debuginfo has been installed.
>>>>> > >
>>>>> > > Where can I find the correct xfs debuginfo?
>>>>> >
>>>>> > it should be in the kernel-debuginfo rpm (of the same
>>>>> > version/release as the kernel rpm you're running)
>>>>> >
>>>>> > You should have:
>>>>> >
>>>>> >
>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>> >
>>>>> > If not, can you show:
>>>>> >
>>>>> > # uname -a
>>>>> > # rpm -q kernel
>>>>> > # rpm -q kernel-debuginfo
>>>>> >
>>>>> > -Eric
--
符永涛
[-- Attachment #1.2: Type: text/html, Size: 20155 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 13:33 ` 符永涛
@ 2013-04-15 13:36 ` 符永涛
2013-04-15 13:45 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-15 13:36 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 16350 bytes --]
More info: it happened exactly when the glusterfs rebalance completed.
2013/4/15 符永涛 <yongtaofu@gmail.com>
> and xfs kernel trace is:
> Apr 15 20:43:03 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22.
> Apr 15 20:43:03 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 15 20:43:03 10 kernel: Pid: 3093, comm: glusterfsd Not tainted
> 2.6.32-279.el6.x86_64 #1
> Apr 15 20:43:03 10 kernel: Call Trace:
> Apr 15 20:43:03 10 kernel: [<ffffffffa02d4212>] ? xfs_inactive+0x442/0x460
> [xfs]
> Apr 15 20:43:03 10 kernel: [<ffffffffa02e1790>] ?
> xfs_fs_clear_inode+0xa0/0xd0 [xfs]
> Apr 15 20:43:03 10 kernel: [<ffffffff81195adc>] ? clear_inode+0xac/0x140
> Apr 15 20:43:03 10 kernel: [<ffffffff81196296>] ?
> generic_delete_inode+0x196/0x1d0
> Apr 15 20:43:03 10 kernel: [<ffffffff81196335>] ?
> generic_drop_inode+0x65/0x80
> Apr 15 20:43:03 10 kernel: [<ffffffff81195182>] ? iput+0x62/0x70
> Apr 15 20:43:03 10 kernel: [<ffffffff81191ce0>] ? dentry_iput+0x90/0x100
> Apr 15 20:43:03 10 kernel: [<ffffffff81191e41>] ? d_kill+0x31/0x60
> Apr 15 20:43:03 10 kernel: [<ffffffff8119386c>] ? dput+0x7c/0x150
> Apr 15 20:43:03 10 kernel: [<ffffffff8117c9c9>] ? __fput+0x189/0x210
> Apr 15 20:43:03 10 kernel: [<ffffffff8117ca75>] ? fput+0x25/0x30
> Apr 15 20:43:03 10 kernel: [<ffffffff8117849d>] ? filp_close+0x5d/0x90
> Apr 15 20:43:03 10 kernel: [<ffffffff81178575>] ? sys_close+0xa5/0x100
> Apr 15 20:43:03 10 kernel: [<ffffffff8100b308>] ? tracesys+0xd9/0xde
> Apr 15 20:43:03 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1186 of file fs/xfs/xfs_vnodeops.c. Return address =
> 0xffffffffa02d422b
> Apr 15 20:43:03 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 15 20:43:03 10 kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 15 20:43:13 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.
>
--
符永涛
[-- Attachment #1.2: Type: text/html, Size: 20771 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 13:36 ` 符永涛
@ 2013-04-15 13:45 ` 符永涛
2013-04-15 13:57 ` Eric Sandeen
0 siblings, 1 reply; 60+ messages in thread
From: 符永涛 @ 2013-04-15 13:45 UTC (permalink / raw)
To: Brian Foster; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 17583 bytes --]
And at the same time we got the following glusterfs error log:
[2013-04-15 20:43:03.851163] I [dht-rebalance.c:1611:gf_defrag_status_get]
0-glusterfs: Rebalance is completed
[2013-04-15 20:43:03.851248] I [dht-rebalance.c:1614:gf_defrag_status_get]
0-glusterfs: Files migrated: 1629, size: 1582329065954, lookups: 11036,
failures: 561
[2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit]
(-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d]
(-->/lib64/libpthread.so.0() [0x3bd1a07851]
(-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-:
received signum (15), shutting down
[2013-04-15 20:43:03.887878] E
[rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not
unregister with portmap
2013/4/15 符永涛 <yongtaofu@gmail.com>
> More info: it happened exactly when the glusterfs rebalance completed.
>
>
>> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>>
>>> Dear Brian and xfs experts,
>>> Brian, your script works and I am able to reproduce it with glusterfs
>>> rebalance on our test cluster. XFS shut down on 2 of our servers during
>>> glusterfs rebalance, and both shutdown userspace stack traces point into
>>> pthread. See the logs below. What's your opinion? Thank you very much!
>>> logs:
>>> [root@10.23.72.93 ~]# cat xfs.log
>>>
>>> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
>>> -- return=0x16
>>> vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
>>> imap=0xffff88100e2f7c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
>>> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
>>> mp: m_agno_log = 0x5, m_agino_log = 0x20
>>> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
>>> sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
>>> imap: im_blkno = 0x0, im_len = 0xa078, im_boffset = 0x86ea
>>> kernel backtrace:
>>> Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
>>> Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
>>> 0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
>>> 0xffffffff81501a69
>>> 0x0 (inexact)
>>> user backtrace:
>>> 0x3bd1a0e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>>>
>>> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
>>> -- return=0x16
>>> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
>>> dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
>>> last_dip=0xffff882000000000 bucket_index=? offset=?
>>> last_offset=0xffffffffffff8810 error=? __func__=[...]
>>> ip: i_ino = 0x113, i_flags = 0x0
>>> ip->i_d: di_nlink = 0x0, di_gen = 0x0
>>> [root@10.23.72.93 ~]#
>>> [root@10.23.72.94 ~]# cat xfs.log
>>>
>>> --- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
>>> -- return=0x16
>>> vars: mp=0xffff881017c6c800 tp=0xffff8801037acea0 ino=0xffffffff
>>> imap=0xffff882017101c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
>>> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
>>> mp: m_agno_log = 0x5, m_agino_log = 0x20
>>> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
>>> sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
>>> imap: im_blkno = 0x0, im_len = 0xd98, im_boffset = 0x547
>>> kernel backtrace:
>>> Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
>>> Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
>>> 0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
>>> 0xffffffff81501a69
>>> 0x0 (inexact)
>>> user backtrace:
>>> 0x30cd40e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>>>
>>> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
>>> -- return=0x16
>>> vars: tp=0xffff8801037acea0 ip=0xffff880e697c8800 next_ino=? mp=? agi=?
>>> dip=? agibp=0xffff880d846c2d60 ibp=? agno=? agino=? next_agino=? last_ibp=?
>>> last_dip=0xffff881017c6c800 bucket_index=? offset=?
>>> last_offset=0xffffffffffff880e error=? __func__=[...]
>>> ip: i_ino = 0x142, i_flags = 0x0
>>> ip->i_d: di_nlink = 0x0, di_gen = 0x3565732e
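In both traces xfs_imap is handed ino=0xffffffff and returns 0x16 (EINVAL). 0xffffffff is also NULLAGINO in XFS, which suggests a stale unlinked-list pointer rather than a real inode number. Below is a rough sketch of the geometry check xfs_imap performs, using the superblock values logged above; the field split is inferred from the logged sb_agblklog/sb_inopblog (and agrees with the logged m_agino_log = 0x20), so it only approximates the kernel code:

```python
# Superblock values taken from the probe output above.
sb_agcount  = 0x1c         # number of allocation groups
sb_agblocks = 0xffffff0    # blocks per AG
sb_inopblog = 0x4          # log2(inodes per block)
sb_agblklog = 0x1c         # log2(blocks per AG, rounded up)

ino = 0xffffffff           # the inode number handed to xfs_imap

# XFS inode numbers encode (AG number, block within AG, inode within block).
agino_bits = sb_agblklog + sb_inopblog    # 0x20 bits for the per-AG part
agno  = ino >> agino_bits                 # AG number            -> 0
agino = ino & ((1 << agino_bits) - 1)     # inode within the AG  -> 0xffffffff
agbno = agino >> sb_inopblog              # block within the AG  -> 0xfffffff

# agbno (0xfffffff) lies past the end of the AG (sb_agblocks = 0xffffff0),
# so the mapping is rejected -- consistent with the 0x16 (EINVAL) return.
print(agno, hex(agbno), agbno < sb_agblocks)
```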
>>>
>>>
>>>
>>> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>>>
>>>> Also, glusterfs uses a lot of hardlinks for self-heal:
>>>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/998416323
>>>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999296624
>>>> ---------T 2 root root 0 Apr 15 12:24 /mnt/xfsd/testbug/999568484
>>>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999956875
>>>> ---------T 2 root root 0 Apr 15 11:58
>>>> /mnt/xfsd/testbug/.glusterfs/05/2f/052f4e3e-c379-4a3c-b995-a10fdaca33d0
>>>> ---------T 2 root root 0 Apr 15 11:58
>>>> /mnt/xfsd/testbug/.glusterfs/05/95/0595272e-ce2b-45d5-8693-d02c00b94d9d
>>>> ---------T 2 root root 0 Apr 15 11:58
>>>> /mnt/xfsd/testbug/.glusterfs/05/ca/05ca00a0-92a7-44cf-b6e3-380496aafaa4
>>>> ---------T 2 root root 0 Apr 15 12:24
>>>> /mnt/xfsd/testbug/.glusterfs/0a/23/0a238ca7-3cef-4540-9c98-6bf631551b21
>>>> ---------T 2 root root 0 Apr 15 11:58
>>>> /mnt/xfsd/testbug/.glusterfs/0a/4b/0a4b640b-f675-4708-bb59-e2369ffbbb9d
>>>> Is it related?
>>>>
>>>>
>>>> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>>>>
>>>>> Dear xfs experts,
>>>>> Now I'm deploying Brian's systemtap script in our cluster. But from last
>>>>> night till now, XFS shut down with the same error on 5 of our 24 servers.
>>>>> I ran xfs_repair and found that all the lost inodes are glusterfs dht
>>>>> link files. This explains why the xfs shutdowns tend to happen during
>>>>> glusterfs rebalance: during the rebalance procedure a lot of dht link
>>>>> files may be unlinked. For example, the following inodes were found in
>>>>> lost+found on one of the servers:
>>>>> [root@* lost+found]# pwd
>>>>> /mnt/xfsd/lost+found
>>>>> [root@* lost+found]# ls -l
>>>>> total 740
>>>>> ---------T 1 root root 0 Apr 8 21:06 100119
>>>>> ---------T 1 root root 0 Apr 8 21:11 101123
>>>>> ---------T 1 root root 0 Apr 8 21:19 102659
>>>>> ---------T 1 root root 0 Apr 12 14:46 1040919
>>>>> ---------T 1 root root 0 Apr 12 14:58 1041943
>>>>> ---------T 1 root root 0 Apr 8 21:32 105219
>>>>> ---------T 1 root root 0 Apr 8 21:37 105731
>>>>> ---------T 1 root root 0 Apr 12 17:48 1068055
>>>>> ---------T 1 root root 0 Apr 12 18:38 1073943
>>>>> ---------T 1 root root 0 Apr 8 21:54 108035
>>>>> ---------T 1 root root 0 Apr 12 21:49 1091095
>>>>> ---------T 1 root root 0 Apr 13 00:17 1111063
>>>>> ---------T 1 root root 0 Apr 13 03:51 1121815
>>>>> ---------T 1 root root 0 Apr 8 22:25 112387
>>>>> ---------T 1 root root 0 Apr 13 06:39 1136151
>>>>> ...
>>>>> [root@* lost+found]# getfattr -m . -d -e hex *
>>>>>
>>>>> # file: 96007
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0xa0370d8a9f104dafbebbd0e6dd7ce1f7
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x0000000049dff000
>>>>>
>>>>> # file: 97027
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0xc1c1fe2ec7034442a623385f43b04c25
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006ac78000
>>>>>
>>>>> # file: 97559
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0xcf7c17013c914511bda4d1c743fae118
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000519fb000
>>>>>
>>>>> # file: 98055
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0xe86abc6e2c4b44c28d415fbbe34f2102
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000004c098000
>>>>>
>>>>> # file: 98567
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0x12543a2efbdf4b9fa61c6d89ca396f80
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006bc98000
>>>>>
>>>>> # file: 98583
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0x760d16d3b7974cfb9c0a665a0982c470
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006cde9000
>>>>>
>>>>> # file: 99607
>>>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>>>> trusted.gfid=0x0849a732ea204bc3b8bae830b46881da
>>>>>
>>>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>>>
>>>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000513f1000
>>>>> ...
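Since getfattr was run with `-e hex`, the values above are raw hex strings; they can be decoded to see which replicate subvolume each dht link file points at. A small sketch decoding the trusted.glusterfs.dht.linkto value from the "# file: 96007" entry above (the trailing 0x00 byte is a C string terminator):

```python
# trusted.glusterfs.dht.linkto value from the "# file: 96007" entry above.
linkto = "0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600"

# Strip the "0x" prefix, decode the hex bytes, drop the trailing NUL.
target = bytes.fromhex(linkto[2:]).decode("ascii").rstrip("\x00")
print(target)  # mams-cq-mt-video-replicate-6
```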
>>>>>
>>>>> What do you think about it? Thank you very much.
>>>>>
>>>>>
>>>>> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>>>>>
>>>>>> Hi Brian,
>>>>>>
>>>>>> Your script works for me now after I installed all the rpms built
>>>>>> from the kernel srpm. I'll try it. Thank you.
>>>>>>
>>>>>>
>>>>>> 2013/4/12 Brian Foster <bfoster@redhat.com>
>>>>>>
>>>>>>> On 04/12/2013 04:32 AM, 符永涛 wrote:
>>>>>>> > Dear xfs experts,
>>>>>>> > Can I just call xfs_stack_trace(); in the second line of
>>>>>>> > xfs_do_force_shutdown() to print stack and rebuild kernel to check
>>>>>>> > what's the error?
>>>>>>> >
>>>>>>>
>>>>>>> I suppose that's a start. If you're willing/able to create and run a
>>>>>>> modified kernel for the purpose of collecting more debug info,
>>>>>>> perhaps
>>>>>>> we can get a bit more creative in collecting more data on the problem
>>>>>>> (but a stack trace there is a good start).
>>>>>>>
>>>>>>> BTW- you might want to place the call after the
>>>>>>> XFS_FORCED_SHUTDOWN(mp)
>>>>>>> check almost halfway into the function to avoid duplicate messages.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> >
>>>>>>> > 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>>>>>>> >
>>>>>>> > Hi Brian,
>>>>>>> > What else I'm missing? Thank you.
>>>>>>> > stap -e 'probe module("xfs").function("xfs_iunlink"){}'
>>>>>>> >
>>>>>>> > WARNING: cannot find module xfs debuginfo: No DWARF
>>>>>>> information found
>>>>>>> > semantic error: no match while resolving probe point
>>>>>>> > module("xfs").function("xfs_iunlink")
>>>>>>> > Pass 2: analysis failed. Try again with another '--vp 01'
>>>>>>> option.
>>>>>>> >
>>>>>>> >
>>>>>>> > 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com
>>>>>>> >>
>>>>>>> >
>>>>>>> > ls -l
>>>>>>> >
>>>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>>>> > -r--r--r-- 1 root root 21393024 Apr 12 12:08
>>>>>>> >
>>>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>>>> >
>>>>>>> > rpm -qa|grep kernel
>>>>>>> > kernel-headers-2.6.32-279.el6.x86_64
>>>>>>> > kernel-devel-2.6.32-279.el6.x86_64
>>>>>>> > kernel-2.6.32-358.el6.x86_64
>>>>>>> > kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
>>>>>>> > abrt-addon-kerneloops-2.0.8-6.el6.x86_64
>>>>>>> > kernel-firmware-2.6.32-358.el6.noarch
>>>>>>> > kernel-debug-2.6.32-358.el6.x86_64
>>>>>>> > kernel-debuginfo-2.6.32-279.el6.x86_64
>>>>>>> > dracut-kernel-004-283.el6.noarch
>>>>>>> > libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
>>>>>>> > kernel-devel-2.6.32-358.el6.x86_64
>>>>>>> > kernel-2.6.32-279.el6.x86_64
>>>>>>> >
>>>>>>> > rpm -q kernel-debuginfo
>>>>>>> > kernel-debuginfo-2.6.32-279.el6.x86_64
>>>>>>> >
>>>>>>> > rpm -q kernel
>>>>>>> > kernel-2.6.32-279.el6.x86_64
>>>>>>> > kernel-2.6.32-358.el6.x86_64
>>>>>>> >
>>>>>>> > do I need to re probe it?
>>>>>>> >
>>>>>>> >
>>>>>>> > 2013/4/12 Eric Sandeen <sandeen@sandeen.net
>>>>>>> > <mailto:sandeen@sandeen.net>>
>>>>>>> >
>>>>>>> > On 4/11/13 11:32 PM, 符永涛 wrote:
>>>>>>> > > Hi Brian,
>>>>>>> > > Sorry but when I execute the script it says:
>>>>>>> > > WARNING: cannot find module xfs debuginfo: No DWARF
>>>>>>> > information found
>>>>>>> > > semantic error: no match while resolving probe point
>>>>>>> > module("xfs").function("xfs_iunlink")
>>>>>>> > >
>>>>>>> > > uname -a
>>>>>>> > > 2.6.32-279.el6.x86_64
>>>>>>> > > kernel debuginfo has been installed.
>>>>>>> > >
>>>>>>> > > Where can I find the correct xfs debuginfo?
>>>>>>> >
>>>>>>> > it should be in the kernel-debuginfo rpm (of the same
>>>>>>> > version/release as the kernel rpm you're running)
>>>>>>> >
>>>>>>> > You should have:
>>>>>>> >
>>>>>>> >
>>>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>>>> >
>>>>>>> > If not, can you show:
>>>>>>> >
>>>>>>> > # uname -a
>>>>>>> > # rpm -q kernel
>>>>>>> > # rpm -q kernel-debuginfo
>>>>>>> >
>>>>>>> > -Eric
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > 符永涛
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > 符永涛
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > 符永涛
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> 符永涛
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> 符永涛
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> 符永涛
>>>>
>>>
>>>
>>>
>>> --
>>> 符永涛
>>>
>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
> --
> 符永涛
>
--
符永涛
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 13:45 ` 符永涛
@ 2013-04-15 13:57 ` Eric Sandeen
2013-04-15 14:21 ` 符永涛
0 siblings, 1 reply; 60+ messages in thread
From: Eric Sandeen @ 2013-04-15 13:57 UTC (permalink / raw)
To: 符永涛; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
On 4/15/13 8:45 AM, 符永涛 wrote:
> And at the same time we got the following error log of glusterfs:
> [2013-04-15 20:43:03.851163] I [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
> [2013-04-15 20:43:03.851248] I [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated: 1629, size: 1582329065954, lookups: 11036, failures: 561
> [2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d] (-->/lib64/libpthread.so.0() [0x3bd1a07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-: received signum (15), shutting down
> [2013-04-15 20:43:03.887878] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
>
We'll take a look, thanks.
Going forward, could I ask that you take a few minutes to batch up the information, rather than sending several emails in a row? It makes it much harder to collect the information when it's spread across so many emails.
Thanks,
-Eric
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 12:54 ` 符永涛
2013-04-15 13:33 ` 符永涛
@ 2013-04-15 14:13 ` Brian Foster
1 sibling, 0 replies; 60+ messages in thread
From: Brian Foster @ 2013-04-15 14:13 UTC (permalink / raw)
To: 符永涛; +Cc: Ben Myers, Eric Sandeen, xfs@oss.sgi.com
On 04/15/2013 08:54 AM, 符永涛 wrote:
> Dear Brian and xfs experts,
> Brian, your script works and I am able to reproduce it with glusterfs
> rebalance on our test cluster. XFS shut down on 2 of our servers during
> glusterfs rebalance, and both shutdown userspace stack traces point into
> pthread. See the logs below. What's your opinion? Thank you very much!
> logs:
Thanks for the data. Can you also create a metadump for the
filesystem(s) associated with this output?
Brian
> [root@10.23.72.93 ~]# cat xfs.log
>
> --- xfs_imap --
> module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> -- return=0x16
> vars: mp=0xffff882017a50800 tp=0xffff881c81797c70 ino=0xffffffff
> imap=0xffff88100e2f7c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
> mp: m_agno_log = 0x5, m_agino_log = 0x20
> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
> sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
> imap: im_blkno = 0x0, im_len = 0xa078, im_boffset = 0x86ea
> kernel backtrace:
> Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
> Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
> 0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
> 0xffffffff81501a69
> 0x0 (inexact)
> user backtrace:
> 0x3bd1a0e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>
> --- xfs_iunlink_remove --
> module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> -- return=0x16
> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
> dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff882000000000 bucket_index=? offset=?
> last_offset=0xffffffffffff8810 error=? __func__=[...]
> ip: i_ino = 0x113, i_flags = 0x0
> ip->i_d: di_nlink = 0x0, di_gen = 0x0
> [root@10.23.72.93 ~]#
> [root@10.23.72.94 ~]# cat xfs.log
>
> --- xfs_imap --
> module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return
> -- return=0x16
> vars: mp=0xffff881017c6c800 tp=0xffff8801037acea0 ino=0xffffffff
> imap=0xffff882017101c08 flags=0x0 agbno=? agino=? agno=? blks_per_cluster=?
> chunk_agbno=? cluster_agbno=? error=? offset=? offset_agbno=? __func__=[...]
> mp: m_agno_log = 0x5, m_agino_log = 0x20
> mp->m_sb: sb_agcount = 0x1c, sb_agblocks = 0xffffff0, sb_inopblog = 0x4,
> sb_agblklog = 0x1c, sb_dblocks = 0x1b4900000
> imap: im_blkno = 0x0, im_len = 0xd98, im_boffset = 0x547
> kernel backtrace:
> Returning from: 0xffffffffa02b3ab0 : xfs_imap+0x0/0x280 [xfs]
> Returning to : 0xffffffffa02b9599 : xfs_inotobp+0x49/0xc0 [xfs]
> 0xffffffffa02b96f1 : xfs_iunlink_remove+0xe1/0x320 [xfs]
> 0xffffffff81501a69
> 0x0 (inexact)
> user backtrace:
> 0x30cd40e5ad [/lib64/libpthread-2.12.so+0xe5ad/0x219000]
>
> --- xfs_iunlink_remove --
> module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> -- return=0x16
> vars: tp=0xffff8801037acea0 ip=0xffff880e697c8800 next_ino=? mp=? agi=?
> dip=? agibp=0xffff880d846c2d60 ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff881017c6c800 bucket_index=? offset=?
> last_offset=0xffffffffffff880e error=? __func__=[...]
> ip: i_ino = 0x142, i_flags = 0x0
> ip->i_d: di_nlink = 0x0, di_gen = 0x3565732e
>
>
>
> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>
>> Also, glusterfs uses a lot of hardlinks for self-heal:
>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/998416323
>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999296624
>> ---------T 2 root root 0 Apr 15 12:24 /mnt/xfsd/testbug/999568484
>> ---------T 2 root root 0 Apr 15 11:58 /mnt/xfsd/testbug/999956875
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/05/2f/052f4e3e-c379-4a3c-b995-a10fdaca33d0
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/05/95/0595272e-ce2b-45d5-8693-d02c00b94d9d
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/05/ca/05ca00a0-92a7-44cf-b6e3-380496aafaa4
>> ---------T 2 root root 0 Apr 15 12:24
>> /mnt/xfsd/testbug/.glusterfs/0a/23/0a238ca7-3cef-4540-9c98-6bf631551b21
>> ---------T 2 root root 0 Apr 15 11:58
>> /mnt/xfsd/testbug/.glusterfs/0a/4b/0a4b640b-f675-4708-bb59-e2369ffbbb9d
>> Is it related?
>>
>>
>> 2013/4/15 符永涛 <yongtaofu@gmail.com>
>>
>>> Dear xfs experts,
>>> Now I'm deploying Brian's systemtap script in our cluster. But from last
>>> night till now, XFS shut down with the same error on 5 of our 24 servers.
>>> I ran xfs_repair and found that all the lost inodes are glusterfs dht
>>> link files. This explains why the xfs shutdowns tend to happen during
>>> glusterfs rebalance: during the rebalance procedure a lot of dht link
>>> files may be unlinked. For example, the following inodes were found in
>>> lost+found on one of the servers:
>>> [root@* lost+found]# pwd
>>> /mnt/xfsd/lost+found
>>> [root@* lost+found]# ls -l
>>> total 740
>>> ---------T 1 root root 0 Apr 8 21:06 100119
>>> ---------T 1 root root 0 Apr 8 21:11 101123
>>> ---------T 1 root root 0 Apr 8 21:19 102659
>>> ---------T 1 root root 0 Apr 12 14:46 1040919
>>> ---------T 1 root root 0 Apr 12 14:58 1041943
>>> ---------T 1 root root 0 Apr 8 21:32 105219
>>> ---------T 1 root root 0 Apr 8 21:37 105731
>>> ---------T 1 root root 0 Apr 12 17:48 1068055
>>> ---------T 1 root root 0 Apr 12 18:38 1073943
>>> ---------T 1 root root 0 Apr 8 21:54 108035
>>> ---------T 1 root root 0 Apr 12 21:49 1091095
>>> ---------T 1 root root 0 Apr 13 00:17 1111063
>>> ---------T 1 root root 0 Apr 13 03:51 1121815
>>> ---------T 1 root root 0 Apr 8 22:25 112387
>>> ---------T 1 root root 0 Apr 13 06:39 1136151
>>> ...
>>> [root@* lost+found]# getfattr -m . -d -e hex *
>>>
>>> # file: 96007
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xa0370d8a9f104dafbebbd0e6dd7ce1f7
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x0000000049dff000
>>>
>>> # file: 97027
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xc1c1fe2ec7034442a623385f43b04c25
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006ac78000
>>>
>>> # file: 97559
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xcf7c17013c914511bda4d1c743fae118
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000519fb000
>>>
>>> # file: 98055
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0xe86abc6e2c4b44c28d415fbbe34f2102
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3600
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000004c098000
>>>
>>> # file: 98567
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0x12543a2efbdf4b9fa61c6d89ca396f80
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006bc98000
>>>
>>> # file: 98583
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0x760d16d3b7974cfb9c0a665a0982c470
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x000000006cde9000
>>>
>>> # file: 99607
>>> trusted.afr.mams-cq-mt-video-client-3=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-4=0x000000000000000000000000
>>> trusted.afr.mams-cq-mt-video-client-5=0x000000000000000000000000
>>> trusted.gfid=0x0849a732ea204bc3b8bae830b46881da
>>>
>>> trusted.glusterfs.dht.linkto=0x6d616d732d63712d6d742d766964656f2d7265706c69636174652d3500
>>>
>>> trusted.glusterfs.quota.ca34e1ce-f046-4ed4-bbd1-261b21bfe0b8.contri=0x00000000513f1000
>>> ...
>>>
>>> What do you think about it? Thank you very much.
>>>
>>>
>>> 2013/4/12 符永涛 <yongtaofu@gmail.com>
>>>
>>>> Hi Brian,
>>>>
>>>> Your script works for me now after I installed all the rpms built
>>>> from the kernel srpm. I'll try it. Thank you.
>>>>
>>>>
>>>> 2013/4/12 Brian Foster <bfoster@redhat.com>
>>>>
>>>>> On 04/12/2013 04:32 AM, 符永涛 wrote:
>>>>>> Dear xfs experts,
>>>>>> Can I just call xfs_stack_trace(); in the second line of
>>>>>> xfs_do_force_shutdown() to print stack and rebuild kernel to check
>>>>>> what's the error?
>>>>>>
>>>>>
>>>>> I suppose that's a start. If you're willing/able to create and run a
>>>>> modified kernel for the purpose of collecting more debug info, perhaps
>>>>> we can get a bit more creative in collecting more data on the problem
>>>>> (but a stack trace there is a good start).
>>>>>
>>>>> BTW- you might want to place the call after the XFS_FORCED_SHUTDOWN(mp)
>>>>> check almost halfway into the function to avoid duplicate messages.
>>>>>
>>>>> Brian
>>>>>
>>>>>>
>>>>>> 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>>>>>>
>>>>>> Hi Brian,
>>>>>> What else I'm missing? Thank you.
>>>>>> stap -e 'probe module("xfs").function("xfs_iunlink"){}'
>>>>>>
>>>>>> WARNING: cannot find module xfs debuginfo: No DWARF information
>>>>> found
>>>>>> semantic error: no match while resolving probe point
>>>>>> module("xfs").function("xfs_iunlink")
>>>>>> Pass 2: analysis failed. Try again with another '--vp 01' option.
>>>>>>
>>>>>>
>>>>>> 2013/4/12 符永涛 <yongtaofu@gmail.com <mailto:yongtaofu@gmail.com>>
>>>>>>
>>>>>> ls -l
>>>>>>
>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>>> -r--r--r-- 1 root root 21393024 Apr 12 12:08
>>>>>>
>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>>>
>>>>>> rpm -qa|grep kernel
>>>>>> kernel-headers-2.6.32-279.el6.x86_64
>>>>>> kernel-devel-2.6.32-279.el6.x86_64
>>>>>> kernel-2.6.32-358.el6.x86_64
>>>>>> kernel-debuginfo-common-x86_64-2.6.32-279.el6.x86_64
>>>>>> abrt-addon-kerneloops-2.0.8-6.el6.x86_64
>>>>>> kernel-firmware-2.6.32-358.el6.noarch
>>>>>> kernel-debug-2.6.32-358.el6.x86_64
>>>>>> kernel-debuginfo-2.6.32-279.el6.x86_64
>>>>>> dracut-kernel-004-283.el6.noarch
>>>>>> libreport-plugin-kerneloops-2.0.9-5.el6.x86_64
>>>>>> kernel-devel-2.6.32-358.el6.x86_64
>>>>>> kernel-2.6.32-279.el6.x86_64
>>>>>>
>>>>>> rpm -q kernel-debuginfo
>>>>>> kernel-debuginfo-2.6.32-279.el6.x86_64
>>>>>>
>>>>>> rpm -q kernel
>>>>>> kernel-2.6.32-279.el6.x86_64
>>>>>> kernel-2.6.32-358.el6.x86_64
>>>>>>
>>>>>> do I need to re probe it?
>>>>>>
>>>>>>
>>>>>> 2013/4/12 Eric Sandeen <sandeen@sandeen.net
>>>>>> <mailto:sandeen@sandeen.net>>
>>>>>>
>>>>>> On 4/11/13 11:32 PM, 符永涛 wrote:
>>>>>> > Hi Brian,
>>>>>> > Sorry but when I execute the script it says:
>>>>>> > WARNING: cannot find module xfs debuginfo: No DWARF
>>>>>> information found
>>>>>> > semantic error: no match while resolving probe point
>>>>>> module("xfs").function("xfs_iunlink")
>>>>>> >
>>>>>> > uname -a
>>>>>> > 2.6.32-279.el6.x86_64
>>>>>> > kernel debuginfo has been installed.
>>>>>> >
>>>>>> > Where can I find the correct xfs debuginfo?
>>>>>>
>>>>>> it should be in the kernel-debuginfo rpm (of the same
>>>>>> version/release as the kernel rpm you're running)
>>>>>>
>>>>>> You should have:
>>>>>>
>>>>>>
>>>>> /usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/kernel/fs/xfs/xfs.ko.debug
>>>>>>
>>>>>> If not, can you show:
>>>>>>
>>>>>> # uname -a
>>>>>> # rpm -q kernel
>>>>>> # rpm -q kernel-debuginfo
>>>>>>
>>>>>> -Eric
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> 符永涛
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> 符永涛
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> 符永涛
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> 符永涛
>>>>
>>>
>>>
>>>
>>> --
>>> 符永涛
>>>
>>
>>
>>
>> --
>> 符永涛
>>
>
>
>
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 13:57 ` Eric Sandeen
@ 2013-04-15 14:21 ` 符永涛
2013-04-15 15:24 ` 符永涛
2013-04-15 19:34 ` Eric Sandeen
0 siblings, 2 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-15 14:21 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
Hi Eric,
I'm sorry for spamming.
I have some more info that I hope you'll find interesting.
In glusterfs 3.3, glusterfsd/src/glusterfsd.c line 1332 contains an unlink
operation:
        if (ctx->cmd_args.pid_file) {
                unlink (ctx->cmd_args.pid_file);
                ctx->cmd_args.pid_file = NULL;
        }
Glusterfs tries to unlink the rebalance pid file after completion, and this
may be where the issue happens.
See the logs below:
1.
/var/log/secure indicates I started the rebalance on Apr 15 11:58:11
Apr 15 11:58:11 10 sudo: root : TTY=pts/2 ; PWD=/root ; USER=root ;
COMMAND=/usr/sbin/gluster volume rebalance testbug start
2.
After xfs shutdown I got the following log:
--- xfs_iunlink_remove --
module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
-- return=0x16
vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
last_dip=0xffff882000000000 bucket_index=? offset=?
last_offset=0xffffffffffff8810 error=? __func__=[...]
ip: i_ino = 0x113, i_flags = 0x0
The inode that led to the xfs shutdown is 0x113.
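For reference, the trace above looks like output from a SystemTap return probe. A sketch of such a probe, reconstructed from the probe-point string in the log (the exact printf formatting here is my guess, not necessarily the script that produced the log):

```systemtap
# Probe the return of xfs_iunlink_remove in the xfs module and print
# the return value plus local variables; needs xfs/kernel debuginfo.
probe module("xfs").function("xfs_iunlink_remove").return
{
    printf("--- xfs_iunlink_remove --\n%s -- return=0x%x\n", pp(), $return)
    printf("vars: %s\n", $$vars)
    printf("ip: i_ino = 0x%x, i_flags = 0x%x\n", $ip->i_ino, $ip->i_flags)
}
```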
3.
I repaired the xfs filesystem, and in lost+found I found the inode:
[root@10.23.72.93 lost+found]# pwd
/mnt/xfsd/lost+found
[root@10.23.72.93 lost+found]# ls -l 275
---------T 1 root root 0 Apr 15 11:58 275
[root@10.23.72.93 lost+found]# stat 275
File: `275'
Size: 0 Blocks: 0 IO Block: 4096 regular empty
file
Device: 810h/2064d Inode: 275 Links: 1
Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2013-04-15 11:58:25.833443445 +0800
Modify: 2013-04-15 11:58:25.912461256 +0800
Change: 2013-04-15 11:58:25.915442091 +0800
This file was created around 2013-04-15 11:58.
The other files in lost+found have extended attributes, but this file
doesn't, which means it is not one of the glusterfs backend files. It should
be the rebalance pid file.
So maybe unlinking the rebalance pid file leads to the xfs shutdown.
Thank you.
2013/4/15 Eric Sandeen <sandeen@sandeen.net>
> On 4/15/13 8:45 AM, 符永涛 wrote:
> > And at the same time we got the following error log of glusterfs:
> > [2013-04-15 20:43:03.851163] I
> [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is
> completed
> > [2013-04-15 20:43:03.851248] I
> [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated:
> 1629, size: 1582329065954, lookups: 11036, failures: 561
> > [2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit]
> (-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d]
> (-->/lib64/libpthread.so.0() [0x3bd1a07851]
> (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-:
> received signum (15), shutting down
> > [2013-04-15 20:43:03.887878] E
> [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not
> unregister with portmap
> >
>
> We'll take a look, thanks.
>
> Going forward, could I ask that you take a few minutes to batch up the
> information, rather than sending several emails in a row? It makes it much
> harder to collect the information when it's spread across so many emails.
>
> Thanks,
> -Eric
>
>
--
符永涛
[-- Attachment #1.2: Type: text/html, Size: 5366 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 14:21 ` 符永涛
@ 2013-04-15 15:24 ` 符永涛
2013-04-15 19:34 ` Eric Sandeen
1 sibling, 0 replies; 60+ messages in thread
From: 符永涛 @ 2013-04-15 15:24 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
[-- Attachment #1.1: Type: text/plain, Size: 3577 bytes --]
Hi Brian,
Here's the meta_dump file:
https://docs.google.com/file/d/0B7n2C4T5tfNCRGpoUWIzaTlvM0E/edit?usp=sharing
Thank you.
2013/4/15 符永涛 <yongtaofu@gmail.com>
> Hi Eric,
> I'm sorry for spamming.
> And I got some more info and hope you're interested.
> In glusterfs 3.3, glusterfsd/src/glusterfsd.c line 1332 has an unlink operation:
> if (ctx->cmd_args.pid_file) {
> unlink (ctx->cmd_args.pid_file);
> ctx->cmd_args.pid_file = NULL;
> }
> Glusterfs tries to unlink the rebalance pid file after the rebalance
> completes, and maybe this is where the issue happens.
> See the logs below:
> 1.
> /var/log/secure indicates I start rebalance on Apr 15 11:58:11
> Apr 15 11:58:11 10 sudo: root : TTY=pts/2 ; PWD=/root ; USER=root ;
> COMMAND=/usr/sbin/gluster volume rebalance testbug start
> 2.
> After xfs shutdown I got the following log:
>
> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> -- return=0x16
> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
> dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff882000000000 bucket_index=? offset=?
> last_offset=0xffffffffffff8810 error=? __func__=[...]
> ip: i_ino = 0x113, i_flags = 0x0
> The inode that led to the xfs shutdown is 0x113.
> 3.
> I repaired the xfs filesystem, and in lost+found I found the inode:
> [root@10.23.72.93 lost+found]# pwd
> /mnt/xfsd/lost+found
> [root@10.23.72.93 lost+found]# ls -l 275
> ---------T 1 root root 0 Apr 15 11:58 275
> [root@10.23.72.93 lost+found]# stat 275
> File: `275'
> Size: 0 Blocks: 0 IO Block: 4096 regular empty
> file
> Device: 810h/2064d Inode: 275 Links: 1
> Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2013-04-15 11:58:25.833443445 +0800
> Modify: 2013-04-15 11:58:25.912461256 +0800
> Change: 2013-04-15 11:58:25.915442091 +0800
> This file was created around 2013-04-15 11:58.
> The other files in lost+found have extended attributes, but this file
> doesn't, which means it is not one of the glusterfs backend files. It should
> be the rebalance pid file.
>
> So maybe unlinking the rebalance pid file leads to the xfs shutdown.
>
> Thank you.
>
>
>
> 2013/4/15 Eric Sandeen <sandeen@sandeen.net>
>
>> On 4/15/13 8:45 AM, 符永涛 wrote:
>> > And at the same time we got the following error log of glusterfs:
>> > [2013-04-15 20:43:03.851163] I
>> [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is
>> completed
>> > [2013-04-15 20:43:03.851248] I
>> [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated:
>> 1629, size: 1582329065954, lookups: 11036, failures: 561
>> > [2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit]
>> (-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d]
>> (-->/lib64/libpthread.so.0() [0x3bd1a07851]
>> (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-:
>> received signum (15), shutting down
>> > [2013-04-15 20:43:03.887878] E
>> [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not
>> unregister with portmap
>> >
>>
>> We'll take a look, thanks.
>>
>> Going forward, could I ask that you take a few minutes to batch up the
>> information, rather than sending several emails in a row? It makes it much
>> harder to collect the information when it's spread across so many emails.
>>
>> Thanks,
>> -Eric
>>
>>
>
>
> --
> 符永涛
>
--
符永涛
[-- Attachment #1.2: Type: text/html, Size: 6023 bytes --]
[-- Attachment #2: Type: text/plain, Size: 121 bytes --]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
2013-04-15 14:21 ` 符永涛
2013-04-15 15:24 ` 符永涛
@ 2013-04-15 19:34 ` Eric Sandeen
1 sibling, 0 replies; 60+ messages in thread
From: Eric Sandeen @ 2013-04-15 19:34 UTC (permalink / raw)
To: 符永涛; +Cc: Brian Foster, Ben Myers, xfs@oss.sgi.com
On 4/15/13 9:21 AM, 符永涛 wrote:
> Hi Eric,
> I'm sorry for spamming.
> And I got some more info and hope you're interested.
We are interested; TBH, Brian and I are spending more time on this one because
we have a mutual interest in fixing it for someone who helps pay our salaries.
We really appreciate your willingness to test & debug, since we've been
unable to reproduce this locally so far, so as long as you're willing to
try new things we're willing to keep suggesting them. :)
I'm going to take some time to try to digest the new information, and Brian
or I will let you know if we have more things to try.
Thanks,
-Eric
> In glusterfs 3.3, glusterfsd/src/glusterfsd.c line 1332 has an unlink operation:
> if (ctx->cmd_args.pid_file) {
> unlink (ctx->cmd_args.pid_file);
> ctx->cmd_args.pid_file = NULL;
> }
> Glusterfs tries to unlink the rebalance pid file after the rebalance completes, and maybe this is where the issue happens.
> See the logs below:
> 1.
> /var/log/secure indicates I start rebalance on Apr 15 11:58:11
> Apr 15 11:58:11 10 sudo: root : TTY=pts/2 ; PWD=/root ; USER=root ; COMMAND=/usr/sbin/gluster volume rebalance testbug start
> 2.
> After xfs shutdown I got the following log:
> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return -- return=0x16
> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=? dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=? last_dip=0xffff882000000000 bucket_index=? offset=? last_offset=0xffffffffffff8810 error=? __func__=[...]
> ip: i_ino = 0x113, i_flags = 0x0
> The inode that led to the xfs shutdown is 0x113.
> 3.
> I repaired the xfs filesystem, and in lost+found I found the inode:
> [root@10.23.72.93 <mailto:root@10.23.72.93> lost+found]# pwd
> /mnt/xfsd/lost+found
> [root@10.23.72.93 <mailto:root@10.23.72.93> lost+found]# ls -l 275
> ---------T 1 root root 0 Apr 15 11:58 275
> [root@10.23.72.93 <mailto:root@10.23.72.93> lost+found]# stat 275
> File: `275'
> Size: 0 Blocks: 0 IO Block: 4096 regular empty file
> Device: 810h/2064d Inode: 275 Links: 1
> Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 2013-04-15 11:58:25.833443445 +0800
> Modify: 2013-04-15 11:58:25.912461256 +0800
> Change: 2013-04-15 11:58:25.915442091 +0800
> This file was created around 2013-04-15 11:58.
> The other files in lost+found have extended attributes, but this file doesn't, which means it is not one of the glusterfs backend files. It should be the rebalance pid file.
>
> So maybe unlinking the rebalance pid file leads to the xfs shutdown.
>
> Thank you.
>
>
>
> 2013/4/15 Eric Sandeen <sandeen@sandeen.net <mailto:sandeen@sandeen.net>>
>
> On 4/15/13 8:45 AM, 符永涛 wrote:
> > And at the same time we got the following error log of glusterfs:
> > [2013-04-15 20:43:03.851163] I [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
> > [2013-04-15 20:43:03.851248] I [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated: 1629, size: 1582329065954, lookups: 11036, failures: 561
> > [2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d] (-->/lib64/libpthread.so.0() [0x3bd1a07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-: received signum (15), shutting down
> > [2013-04-15 20:43:03.887878] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
> >
>
> We'll take a look, thanks.
>
> Going forward, could I ask that you take a few minutes to batch up the information, rather than sending several emails in a row? It makes it much harder to collect the information when it's spread across so many emails.
>
> Thanks,
> -Eric
>
>
>
>
> --
> 符永涛
>
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 60+ messages in thread
end of thread, other threads:[~2013-04-15 19:34 UTC | newest]
Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-09 12:53 need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22 符永涛
2013-04-09 13:03 ` 符永涛
2013-04-09 13:05 ` 符永涛
2013-04-09 14:52 ` Ben Myers
2013-04-09 15:00 ` 符永涛
2013-04-09 15:07 ` 符永涛
2013-04-09 15:10 ` 符永涛
2013-04-10 10:10 ` Emmanuel Florac
2013-04-10 12:52 ` Dave Chinner
2013-04-10 13:52 ` 符永涛
2013-04-11 19:11 ` 符永涛
2013-04-11 19:55 ` 符永涛
2013-04-11 23:26 ` Brian Foster
2013-04-12 0:45 ` 符永涛
2013-04-12 12:50 ` Brian Foster
2013-04-12 13:42 ` 符永涛
2013-04-12 13:48 ` 符永涛
2013-04-12 13:51 ` 符永涛
2013-04-12 13:59 ` 符永涛
2013-04-12 1:07 ` Eric Sandeen
2013-04-12 1:36 ` 符永涛
2013-04-12 1:38 ` 符永涛
2013-04-12 6:15 ` 符永涛
2013-04-12 4:32 ` 符永涛
2013-04-12 5:16 ` Eric Sandeen
2013-04-12 5:40 ` 符永涛
2013-04-12 6:00 ` 符永涛
2013-04-12 12:11 ` Brian Foster
2013-04-12 7:44 ` 符永涛
2013-04-12 8:32 ` 符永涛
2013-04-12 12:41 ` Brian Foster
2013-04-12 14:48 ` 符永涛
2013-04-15 2:08 ` 符永涛
2013-04-15 5:04 ` 符永涛
2013-04-15 12:54 ` 符永涛
2013-04-15 13:33 ` 符永涛
2013-04-15 13:36 ` 符永涛
2013-04-15 13:45 ` 符永涛
2013-04-15 13:57 ` Eric Sandeen
2013-04-15 14:21 ` 符永涛
2013-04-15 15:24 ` 符永涛
2013-04-15 19:34 ` Eric Sandeen
2013-04-15 14:13 ` Brian Foster
2013-04-12 5:23 ` 符永涛
2013-04-09 22:16 ` Michael L. Semon
2013-04-09 22:18 ` Eric Sandeen
2013-04-09 22:48 ` Ben Myers
2013-04-09 23:30 ` Dave Chinner
2013-04-09 15:06 ` Eric Sandeen
2013-04-09 15:18 ` 符永涛
2013-04-09 15:23 ` Eric Sandeen
2013-04-09 15:25 ` 符永涛
2013-04-09 15:23 ` 符永涛
2013-04-09 15:44 ` Eric Sandeen
2013-04-09 15:48 ` 符永涛
2013-04-09 15:49 ` 符永涛
2013-04-09 15:58 ` Brian Foster
2013-04-09 17:10 ` Eric Sandeen
2013-04-10 5:34 ` 符永涛
2013-04-10 5:36 ` 符永涛
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox