* xfs_admin -c 1 + xfs_repair problem
@ 2008-04-28 18:30 Daniel Bast
2008-04-29 0:48 ` Barry Naujok
0 siblings, 1 reply; 4+ messages in thread
From: Daniel Bast @ 2008-04-28 18:30 UTC (permalink / raw)
To: xfs
Hi,
I tried to enable lazy counts with "xfs_admin -c 1 device" with
xfs_admin from xfsprogs 2.9.8. Unfortunately that process got stuck
without any message. After several hours without any IO or CPU workload
I killed the process and started xfs_repair, but that also got stuck (in
"Phase 6") without any IO or CPU workload or any extra message. The
xfs_repair hang in "Phase 6" is reproducible with a
metadump image of the filesystem.
I was able to mount the device but don't want to use it because I'm not
sure if everything is OK.
How can I resolve that problem? What information do you need? I can
provide the metadump image (bzip compressed: 28MB) if necessary.
Here is some information that may be useful:
xfs_repair -v /dev/sda7
Phase 1 - find and verify superblock...
        - block cache size set to 11472 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2 tail block 2
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
After killing the xfs_admin -c 1 and xfs_repair processes:
xfs_info /dev/sda7
meta-data=/dev/sda7              isize=256    agcount=4, agsize=24719013 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=98876050, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=65536  blocks=0, rtextents=0
strace of a new 'xfs_repair -v /dev/sda7' run:
strace -ff -p 6364
Process 6409 attached with 6 threads - interrupt to quit
[pid 6364] futex(0x851e2cc, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 6405] futex(0xb146e3d8, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 6406] futex(0xb146e358, FUTEX_WAIT, 1, NULL <unfinished ...>
[pid 6407] futex(0xb146e358, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 6408] futex(0xb146e358, FUTEX_WAIT, 3, NULL <unfinished ...>
[pid 6409] futex(0xb146e358, FUTEX_WAIT, 4, NULL <unfinished ...>
[pid 6406] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 6407] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 6408] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 6406] futex(0xb146e358, FUTEX_WAIT, 4, NULL <unfinished ...>
[pid 6407] futex(0xb146e358, FUTEX_WAIT, 4, NULL <unfinished ...>
[pid 6408] futex(0xb146e358, FUTEX_WAIT, 4, NULL
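The strace output above is easier to read when reduced to the last system call each thread entered. The following is a minimal sketch of that reduction, assuming the trace is available as text with wrapped lines rejoined; the function name last_call_per_thread and the variable strace_sample are made up for this illustration.

```shell
# The strace lines from the report, with wrapped lines rejoined.
strace_sample='[pid 6364] futex(0x851e2cc, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 6405] futex(0xb146e3d8, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 6406] futex(0xb146e358, FUTEX_WAIT, 1, NULL <unfinished ...>
[pid 6407] futex(0xb146e358, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 6408] futex(0xb146e358, FUTEX_WAIT, 3, NULL <unfinished ...>
[pid 6409] futex(0xb146e358, FUTEX_WAIT, 4, NULL <unfinished ...>
[pid 6406] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 6407] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 6408] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 6406] futex(0xb146e358, FUTEX_WAIT, 4, NULL <unfinished ...>
[pid 6407] futex(0xb146e358, FUTEX_WAIT, 4, NULL <unfinished ...>
[pid 6408] futex(0xb146e358, FUTEX_WAIT, 4, NULL'

# Read strace text on stdin and print "pid syscall" for the last
# system call each thread entered ("resumed" lines are skipped).
last_call_per_thread() {
    awk '
        /^\[pid +[0-9]+\] [a-z_0-9]+\(/ {
            pid = $2; sub(/\]/, "", pid)      # "6364]" becomes "6364"
            call = $3; sub(/\(.*/, "", call)  # "futex(0x..." becomes "futex"
            last[pid] = call
        }
        END { for (p in last) print p, last[p] }
    ' | sort -n
}

printf '%s\n' "$strace_sample" | last_call_per_thread
```

For this trace the summary lists six threads, each last seen entering futex, which is consistent with the reported lack of IO and CPU activity: every thread appears to be waiting on a lock or condition rather than doing work.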
Thanks
Daniel
P.S. Please CC me, because i'm not subscribed to the list.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: xfs_admin -c 1 + xfs_repair problem
2008-04-28 18:30 xfs_admin -c 1 + xfs_repair problem Daniel Bast
@ 2008-04-29 0:48 ` Barry Naujok
2008-04-29 6:34 ` Daniel Bast
0 siblings, 1 reply; 4+ messages in thread
From: Barry Naujok @ 2008-04-29 0:48 UTC (permalink / raw)
To: Daniel Bast, xfs
On Tue, 29 Apr 2008 04:30:56 +1000, Daniel Bast <daniel.bast@gmx.net>
wrote:
> Hi,
>
> I tried to enable lazy counts with "xfs_admin -c 1 device" with
> xfs_admin from xfsprogs 2.9.8. Unfortunately that process got stuck
> without any message. After several hours without any IO or CPU workload
> I killed the process and started xfs_repair, but that also got stuck (in
> "Phase 6") without any IO or CPU workload or any extra message. The
> xfs_repair hang in "Phase 6" is reproducible with a
> metadump image of the filesystem.
>
> I was able to mount the device but don't want to use it because I'm not
> sure if everything is OK.
"xfs_admin -c 1" runs xfs_repair internally, which is why it got stuck
too. Your filesystem is fine: the only changes made to enable
lazy-counters happen in Phase 5, but they may not have been written to disk.
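The relationship between the two commands can be sketched as a small option translation. This is a simplified illustration only, not the actual xfs_admin script: the function name admin_to_repair is invented here, it handles only the -c option, and it prints the xfs_repair command line instead of running it.

```shell
# Simplified illustration, NOT the real xfs_admin script.
# Translate "-c 0|1 <device>" into the xfs_repair command line
# that would enable or disable lazy counters, and print it.
admin_to_repair() {
    repair_opts=""
    OPTIND=1
    while getopts "c:" opt; do
        case "$opt" in
            c) repair_opts="$repair_opts -c lazycount=$OPTARG" ;;
            *) echo "unsupported option" >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    echo "xfs_repair$repair_opts $1"
}

admin_to_repair -c 1 /dev/sda7
```

Because the work is delegated to xfs_repair, any hang in xfs_repair shows up as a hang in "xfs_admin -c 1" as well.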
> How can I resolve that problem? What information do you need? I can
> provide the metadump image (bzip compressed: 28MB) if necessary.
Run xfs_repair -P <device> to disable prefetch.
The metadump would be very useful in finding out why xfs_repair got stuck.
Regards,
Barry.
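Since the hang is reproducible on a metadump image, the runs with and without -P can be compared on a copy instead of the real device, with a time limit so that a hang does not go unnoticed for hours. The following is a minimal sketch, assuming GNU coreutils timeout is installed; the helper name run_with_limit, the one-hour limit and the file names sda7.metadump and sda7.img are placeholders chosen for this example.

```shell
# Run a command, but give up after LIMIT seconds.
# Prints one summary line and returns the command's exit status
# (GNU timeout returns 124 when the limit was hit).
run_with_limit() {
    limit=$1; shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*"
    else
        echo "finished with status $status: $*"
    fi
    return "$status"
}

# Harmless demonstration of both outcomes.
run_with_limit 5 true
run_with_limit 1 sleep 3 || true

# Hypothetical use on a restored metadump image (placeholder file names):
#   xfs_mdrestore sda7.metadump sda7.img
#   run_with_limit 3600 xfs_repair -f sda7.img      # default, prefetch on
#   run_with_limit 3600 xfs_repair -f -P sda7.img   # prefetch disabled
```

If the first xfs_repair run times out while the -P run finishes, that points at the prefetch code path, which matches the workaround suggested above.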
* Re: xfs_admin -c 1 + xfs_repair problem
2008-04-29 0:48 ` Barry Naujok
@ 2008-04-29 6:34 ` Daniel Bast
2008-04-29 6:48 ` Barry Naujok
0 siblings, 1 reply; 4+ messages in thread
From: Daniel Bast @ 2008-04-29 6:34 UTC (permalink / raw)
To: xfs
Hi Barry,
'xfs_repair -P device' ran through and finished without any problem. So
everything should be fine?
Or should I also run something like 'xfs_repair -P -c lazy-counts=1
device' to make sure that the lazy-count enable command got through?
After one '-P' run, another run without '-P' still doesn't finish, so I'll send
you the metadump later, after finding out how to send a 28MB email
attachment.
Thanks
Daniel
* Re: xfs_admin -c 1 + xfs_repair problem
2008-04-29 6:34 ` Daniel Bast
@ 2008-04-29 6:48 ` Barry Naujok
0 siblings, 0 replies; 4+ messages in thread
From: Barry Naujok @ 2008-04-29 6:48 UTC (permalink / raw)
To: Daniel Bast; +Cc: xfs@oss.sgi.com
On Tue, 29 Apr 2008 16:34:29 +1000, Daniel Bast <daniel.bast@gmx.net>
wrote:
> Hi Barry,
>
> 'xfs_repair -P device' ran through and finished without any problem. So
> everything should be fine?
> Or should I also run something like 'xfs_repair -P -c lazy-counts=1
> device' to make sure that the lazy-count enable command got through?
Yes, everything is fine. Once the filesystem is mounted, xfs_info will
tell you whether lazy-counters was enabled.
If it wasn't, xfs_repair -P -c lazycount=1 <device> will enable it.
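The check described above can be scripted by pulling the lazy-count field out of the xfs_info output. The following is a minimal sketch, assuming the output format shown earlier in this thread; the function name lazy_count is made up for this example, and the sample text is the log section from the xfs_info output posted above.

```shell
# Read xfs_info output on stdin and print the lazy-count value (0 or 1).
# Prints nothing if the field is absent.
lazy_count() {
    sed -n 's/.*lazy-count=\([0-9]*\).*/\1/p' | head -n 1
}

# Log section of the xfs_info output posted earlier in the thread.
xfs_info_sample='log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1'

if [ "$(printf '%s\n' "$xfs_info_sample" | lazy_count)" = "1" ]; then
    echo "lazy-count is enabled"
else
    echo "lazy-count is not enabled"
fi
```

On a live system the input would come from running xfs_info on the mounted filesystem instead of the sample variable.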
> After one '-P' run, another run without '-P' still doesn't finish, so I'll send
> you the metadump later, after finding out how to send a 28MB email
> attachment.
That is a bit big for email. Somewhere you can stash a binary
file would be better (an FTP server, or a web site like yousendit).
You can email me privately with the details.
Regards,
Barry.