public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* segmentation fault during mount
@ 2011-02-07 11:01 Ryan Roh
  2011-02-08  4:44 ` Eric Sandeen
  0 siblings, 1 reply; 6+ messages in thread
From: Ryan Roh @ 2011-02-07 11:01 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 13261 bytes --]

Dear Members,

 

I'm using XFS on an STMicroelectronics SH4-based chip (STi7105), and I have
an issue with XFS log recovery during mount.

 

1. chip : sh4 STi7105

2. HDD : 320GB USB HDD USB 2.0 port.

3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.

4. XFSProgs version : 3.1.1

 

 

The mount and log-recovery output is shown below. The segmentation fault is
caused by the assert in the xfs_alloc_increment() function of
xfs_alloc_btree.c: the btree level is equal to the root level in the code
below.

 

    /*

     * If we went off the root then we are seriously confused.

     */

    ASSERT(lev < cur->bc_nlevels);

 

I would like to know why and how this error can happen. Would you please
explain, or give any hints?

 

Thank you in advance.

 

 

===============================================================================

XFS mounting filesystem sdb3

Starting XFS recovery on filesystem: sdb3 (logdev: internal)

Unable to handle kernel NULL pointer dereference at virtual address 00000058

pc = 80830d8c

*pde = 00000000

Oops: 0000 [#1]

Modules linked in: stsectoolfuse_ioctl(P) adf(P) asf(P) ath_hif_usb

embxloopback(P) embxmailbox(P) sec_nvram ath_htc_hst ath_dfs(P) embxshell(P)

sttkdma_ioctl(P) ath_rate_atheros(P) mme_host(P) embxshm(P) sttkdma_core(P)

stpti4_ioctl(P) stsectoolfuse_core(P) MHFSWRAPGPL_KERN ath_dev(P) xsr(P)

ath_hal(P) xsr_stl(P) stuart_core(P) MHFSWRAP_KERN(P) stsmart_core(P)
wifitt(P)

umac stapi_core(P) CDIDVB_ST_REF(P)

 

Pid : 885, Comm:                mount

PC is at xfs_alloc_increment+0x1ec/0x220

PC  : 80830d8c SP  : 8aeb5ad0 SR  : 40008000 TEA : c15f3524    Tainted: P


R0  : 00000003 R1  : 00000040 R2  : 00000008 R3  : 000001ff

R4  : 8354c84c R5  : 00000001 R6  : 00000002 R7  : 00000001

R8  : 8354c894 R9  : 00000002 R10 : 8354c84c R11 : 8354c8ae

R12 : 00000000 R13 : 8aeb5a98 R14 : 8354c8b4

MACH: 00000000 MACL: 8aa92322 GBR : 004d2290 PR  : 80830bd0

 

Call trace: 

[<8082ea66>] xfs_free_ag_extent+0xe6/0x5c0

[<80a45f16>] __down_read+0x136/0x180

[<80830560>] xfs_free_extent+0x0/0xc0

[<808305e2>] xfs_free_extent+0x82/0xc0

[<80830560>] xfs_free_extent+0x0/0xc0

[<8087be00>] xfs_trans_log_efd_extent+0x0/0x60

[<8086eb1a>] xlog_recover_process_efi+0x25a/0x320

[<808706c8>] xlog_recover_process_efis+0x68/0xe0

[<8086e8c0>] xlog_recover_process_efi+0x0/0x320

[<8087ac20>] xfs_trans_next_ail+0x0/0x60

[<80870766>] xlog_recover_finish+0x26/0xe0

[<8087625c>] xfs_mountfs+0xc1c/0xe40

[<808876d2>] kmem_alloc+0x52/0x100

[<808877ec>] kmem_zalloc+0xc/0x40

[<808584c0>] xfs_fstrm_free_func+0x0/0xc0

[<8087f30a>] xfs_mount+0x3ca/0x420

[<80893318>] xfs_fs_fill_super+0x58/0x240

[<80774d20>] path_lookup+0x0/0x20

[<808ba576>] snprintf+0x16/0x40

[<80774d20>] path_lookup+0x0/0x20

[<807aab2e>] disk_name+0x8e/0xc0

[<80794f54>] sb_set_blocksize+0x14/0x40

[<8076ae32>] get_sb_bdev+0x152/0x1e0

[<807644a8>] __kmalloc+0x48/0x100

[<80892250>] xfs_fs_get_sb+0x10/0x20

[<808932c0>] xfs_fs_fill_super+0x0/0x240

[<8076a596>] vfs_kern_mount+0x36/0xc0

[<8076a66e>] do_kern_mount+0x2e/0xe0

[<80783390>] do_mount+0x130/0x6c0

[<80770900>] path_release+0x0/0x40

[<80712b3a>] enqueue_entity+0x9a/0x2a0

[<80712ee0>] sub_preempt_count+0x0/0xa0

[<80a45f16>] __down_read+0x136/0x180

[<80712a58>] dequeue_entity+0x58/0xa0

[<80712d66>] task_tick_fair+0x26/0x80

[<8070faa0>] do_page_fault+0x60/0x380

[<808b852e>] __up_read+0x4e/0xe0

[<8070e9ce>] fixup_exception+0xe/0x40

[<8070fb30>] do_page_fault+0xf0/0x380

[<80722b3a>] run_timer_softirq+0x1a/0x200

[<80763870>] kmem_cache_free+0x50/0xe0

[<8072c760>] __rcu_process_callbacks+0x60/0x2c0

[<8072c9ce>] rcu_process_callbacks+0xe/0x40

[<8071ea7a>] tasklet_action+0x7a/0xe0

[<8071e8e4>] __do_softirq+0x64/0x100

[<808bca60>] debug_smp_processor_id+0x0/0xc0

[<8071e8f6>] __do_softirq+0x76/0x100

[<8071e9e4>] do_softirq+0x64/0x80

[<80749020>] get_page_from_freelist+0x160/0x3e0

[<80748fc2>] get_page_from_freelist+0x102/0x3e0

[<8074980e>] __alloc_pages+0x6e/0x2e0

[<80749aa0>] __get_free_pages+0x20/0x60

[<80783986>] sys_mount+0x66/0xe0

[<80749740>] free_pages+0x0/0x60

[<807078f8>] syscall_call+0xa/0xe

[<80783920>] sys_mount+0x0/0xe0

 

Process: mount (pid: 885, stack limit = 8aeb4001)

Stack: (0x8aeb5ad0 to 0x8aeb6000)

5ac0:                                     8aeb5b14 00000000 8082ea66 83853400
5ae0: 8aeb5ae4 006bfa43 8adff680 8354c84c 8a5f8c78 00000000 00000002 00000000
5b00: 00020250 00000001 000000bc 004f8880 00000001 8a5ecc80 00000000 80a45f16
5b20: 80830560 808305e2 80830560 8087be00 000002f0 000001cc 8aeb5b4c 8a5f8c78
5b40: 00000000 000002f0 00000000 8a5f8c78 83853400 8adff680 83c37f78 00000000
5b60: 00000002 006bfa43 00000000 00000000 00000000 00000000 00000000 00000000
5b80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 8086eb1a
5ba0: 83c43e90 00000000 8a5f8c78 83c36edc 83c36ed4 0000001d 0000001c 00000007
5bc0: 00000006 ffffffe7 fffffff9 83c36ed0 02bc2417 00000000 00000002 00000000
5be0: 83c36e90 83853400 012814ea 01ffffff 00000003 00000002 04a053a7 00000000
5c00: 808706c8 8086e8c0 00001236 8087ac20 8aeb4000 8382631c 83853400 83c36e90
5c20: 00000001 80870766 8aeb5c90 83853440 8386ce20 838262e0 8382631c 00000000
5c40: 838262a0 8087625c 83853400 00000000 00000000 838536bc 00000002 00000000
5c60: 00000000 00000000 00000002 00000000 00000040 83fea8c0 25029d38 00000000
5c80: 00000000 00000000 00000001 00000000 808876d2 00000000 000002d0 00000058
5ca0: 00000001 808877ec 816f01e0 808584c0 1fef00fc dfdd5f0f 83853754 00000058
5cc0: 00000000 00000000 8a6acde0 8adffb00 8087f30a 816f01e0 83fea8c0 836ea000
5ce0: 8aeb5cc0 83853400 00000000 00000000 00000000 00000040 00000000 00000000
5d00: 80893318 80774d20 8adfd000 836ea000 83fea8c0 8ae55200 8adfd000 00000001
5d20: 00000002 8aeb5d93 0000000a ffffffff ffffffff 00000002 00000020 8aeb5d90
5d40: 8116028c ffffffff 80ab6bdc 808ba576 80774d20 8adfd000 816f0220 8aeb5d90
5d60: 8116028c 00000003 81160280 8aeb5d74 807aab2e 80794f54 8076ae32 816f0220
5d80: 8ae55200 816f01f8 00000001 8ae55200 33626473 00000000 00000001 00000000
5da0: 00000080 8381115c 807644a8 00000000 80892250 8a733000 8adfd000 00008000
5dc0: 8a4f7000 816f58a0 80b2b2e4 808932c0 816f58a0 8076a596 816f58a0 8076a66e
5de0: ffffffed 80b2b2e4 8adfd000 8a4f7000 00008000 80783390 00008000 00000000
5e00: 8a733000 80770900 00000000 00000000 80712b3a 8aeb5e30 83810e60 80712ee0
5e20: 8132d758 8132d3a8 80a45f16 83d718b4 83810e60 80712a58 8132d758 80712d66
5e40: 8070faa0 808b852e 8070e9ce 8aeb5f44 8070fb30 83810e60 000541a4 8aeb4000
5e60: 00000000 00000002 00000050 80722b3a 80763870 00000000 00000000 8072c760
5e80: 000000f0 8133335c 00000002 00000000 8072c9ce 813320e4 8071ea7a 813320e4
5ea0: 00000000 ffffff0f 8071e8e4 813320c0 808bca60 0000000a 8071e8f6 00000000
5ec0: 8133212c 8071e9e4 00008000 80749020 80748fc2 00000000 00000000 80b267bc
5ee0: 00000044 00000000 000200d0 80b26944 00000000 00000000 8074980e 80b26940
5f00: 80d330b8 816f57a0 83810e60 8ade8000 8aeb5f84 00000001 00000001 00000000
5f20: 00000080 8381115c 00000010 00000eb1 814abfa0 8a4f7000 000200d0 00008000
5f40: 80749aa0 00008000 00402c9c 00008000 7bc48eb1 8ade8000 80783986 00008000
5f60: 00402c9c 00008000 80749740 8ade8000 8aeb5f48 00000000 8adfd000 8a4f7000
5f80: 8a733000 8adfd000 807078f8 00000514 00000000 00000021 00000054 80783920
5fa0: 004d22c0 0044b4b4 ffffffff 00000015 7bc48eb1 7bc48ebb 004d3518 00008000
5fc0: 00008000 7bc48aa4 004d2298 004d22c0 004736cc 00402c9c 00008000 7bc489e0
5fe0: 0044b4bc 00423268 00008000 004d2290 00000000 0000004e 00000054 00000160

Segmentation fault

===============================================================================

 

 

===============================================================================

# xfs_repair -n /dev/sdb3

Phase 1 - find and verify superblock...

Phase 2 - using internal log

        - scan filesystem freespace and inode maps...

out-of-order bno btree record 416 (3934652 1) block 2/3

block (2,3934652-3934652) multiply claimed by bno space tree, state - 1

out-of-order bno btree record 417 (3938602 187) block 2/3

block (2,3938602-3938602) multiply claimed by bno space tree, state - 1

bno freespace btree block claimed (state 2), agno 2, bno 37902, suspect 0

cnt freespace btree block claimed (state 2), agno 2, bno 193058, suspect 0

block (2,924189-924189) multiply claimed by cnt space tree, state - 2

block (2,930675-930675) multiply claimed by cnt space tree, state - 2

block (2,930957-930957) multiply claimed by cnt space tree, state - 2

block (2,934623-934623) multiply claimed by cnt space tree, state - 2

block (2,934905-934905) multiply claimed by cnt space tree, state - 2

block (2,936315-936315) multiply claimed by cnt space tree, state - 2

block (2,936597-936597) multiply claimed by cnt space tree, state - 2

block (2,939135-939135) multiply claimed by cnt space tree, state - 2

block (2,939417-939417) multiply claimed by cnt space tree, state - 2

block (2,947595-947595) multiply claimed by cnt space tree, state - 2

block (2,948065-948065) multiply claimed by cnt space tree, state - 2

block (2,1028435-1028435) multiply claimed by cnt space tree, state - 2

block (2,1028906-1028906) multiply claimed by cnt space tree, state - 2

block (2,1031443-1031443) multiply claimed by cnt space tree, state - 2

out-of-order cnt btree record 238 (5971248 187) block 2/193054

out-of-order cnt btree record 239 (6003209 187) block 2/193054

out-of-order cnt btree record 240 (6100543 187) block 2/193054

out-of-order cnt btree record 241 (6192053 187) block 2/193054

out-of-order cnt btree record 242 (6237549 187) block 2/193054

out-of-order cnt btree record 243 (6292821 187) block 2/193054

out-of-order cnt btree record 244 (6388878 187) block 2/193054

cnt freespace btree block claimed (state 2), agno 2, bno 37955, suspect 0

agf_freeblks 12415209, counted 12444225 in ag 2

agi unlinked bucket 18 is 6331218 in ag 2 (inode=1080073042)

sb_ifree 162, counted 166

sb_fdblocks 69539078, counted 69568099

        - found root inode chunk

Phase 3 - for each AG...

        - scan (but don't clear) agi unlinked lists...

        - process known inodes and perform inode discovery...

        - agno = 0

        - agno = 1

        - agno = 2

29560000: Badness in key lookup (length)

bp=(bno 313629760, len 16384 bytes) key=(bno 313629760, len 8192 bytes)

bmap rec out of order, inode 1080073043 entry 105 [o s c] [21592 71839253 188], 104 [19592 101671866 22184]

bad data fork in inode 1080073043

would have cleared inode 1080073043

        - agno = 3

        - process newly discovered inodes...

Phase 4 - check for duplicate blocks...

        - setting up duplicate extent list...

        - check for inodes claiming duplicate blocks...

        - agno = 0

        - agno = 1

        - agno = 2

entry "00000123" at block 0 offset 1104 in directory inode 1074348320 references free inode 1080073043

        would clear inode number in entry at offset 1104...

bmap rec out of order, inode 1080073043 entry 105 [o s c] [21592 71839253 188], 104 [19592 101671866 22184]

bad data fork in inode 1080073043

would have cleared inode 1080073043

        - agno = 3

No modify flag set, skipping phase 5

Phase 6 - check inode connectivity...

        - traversing filesystem ...

entry "00000123" in directory inode 1074348320 points to free inode 1080073043, would junk entry

bad hash table for directory inode 1074348320 (no data entry): would rebuild

        - traversal finished ...

        - moving disconnected inodes to lost+found ...

disconnected inode 1080073042, would move to lost+found

Phase 7 - verify link counts...

would have reset inode 1080073042 nlinks from 0 to 1

No modify flag set, skipping filesystem flush and exiting.

===============================================================================

 

 

===============================================================================

# xfs_logprint -t /dev/sdb3

xfs_logprint:

    data device: 0x813

    log device: 0x813 daddr: 310464192 length: 303184

 

    log tail: 303165 head: 303169 state: <DIRTY>

 

 

LOG REC AT LSN cycle 4 block 303165 (0x4, 0x4a03d)

============================================================================

TRANS: tid:0x83fa10ac  type:  #items:1  trans:0x0  q:0x434c08

EFD: cnt:1 total:1 a:0x42c6f8 len:52 

        EFD:  #regs: 1    num_extents: 3  id: 0xffffffff81eaae90

 

LOG REC AT LSN cycle 4 block 303167 (0x4, 0x4a03f)

============================================================================

TRANS: tid:0x83fa1158  type:INACTIVE  #items:3  trans:0x0  q:0x434c08

INO: cnt:2 total:2 a:0x42c6f8 len:52 a:0x42c760 len:96 

        INODE: #regs:2   ino:0x40609b52  flags:0x1   dsize:0

        CORE inode:

BUF: cnt:2 total:2 a:0x42c7c8 len:28 a:0x42ca48 len:128 

        BUF:  #regs:2   start blkno:0x157aba00   len:8   bmap size:2  

flags:0x0

        BUF DATA

EFI: cnt:1 total:1 a:0x42cad0 len:40 

        EFI:  #regs:1    num_extents:2  id:0xffffffff8239e058

        (s: 0x46bfa43, l: 752) (s: 0x46bfdef, l: 1128) 

 

===============================================================================

 

 

Regards,

Ryan. 

 

 


[-- Attachment #1.2: Type: text/html, Size: 31561 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: segmentation fault during mount
  2011-02-07 11:01 segmentation fault during mount Ryan Roh
@ 2011-02-08  4:44 ` Eric Sandeen
  2011-02-08  5:12   ` Ryan Roh
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Sandeen @ 2011-02-08  4:44 UTC (permalink / raw)
  To: Ryan Roh; +Cc: xfs

On 2/7/11 5:01 AM, Ryan Roh wrote:
> Dear Members,
> 
> I'm using XFS based on STMicro SH4 based chip (STi7105).
> 
> and I have some issue on xfs log mounting.
> 

Were the errors after any sort of harsh testing of the filesystem, such 
as usb disconnects or power off?

Or was this after a clean unmount?

-Eric

> 
> 1. chip : sh4 STi7105
> 
> 2. HDD : 320GB USB HDD USB 2.0 port.
> 
> 3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.
> 
> 4. XFSProgs version : 3.1.1
> 
>  
> 
> mounting and repairing log in the below. This segmentation fault is caused by
> 
> the assert in xfs_alloc_increment function of xfs_alloc_btree.c file. The btree
> 
> level is equal to root level in the below code.
> 
>  
> 
>     /*
> 
>      * If we went off the root then we are seriously confused.
> 
>      */
> 
>     ASSERT(lev < cur->bc_nlevels);
> 
>  

...

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: segmentation fault during mount
  2011-02-08  4:44 ` Eric Sandeen
@ 2011-02-08  5:12   ` Ryan Roh
  2011-02-08  5:40     ` Eric Sandeen
  0 siblings, 1 reply; 6+ messages in thread
From: Ryan Roh @ 2011-02-08  5:12 UTC (permalink / raw)
  To: 'Eric Sandeen'; +Cc: xfs

Dear Eric,

I'm not sure of the correct way to reply in this thread, as I'm a newbie
here. Sorry.

Anyway, this issue happened on an HDD returned by a customer; it was used in
our PVR STB (set-top box). Our STB has a toggle power switch, so I think the
user turned off the power while it was recording something.

Thanks,
Ryan.
  

-----Original Message-----
From: Eric Sandeen [mailto:sandeen@sandeen.net] 
Sent: Tuesday, February 08, 2011 1:45 PM
To: Ryan Roh
Cc: xfs@oss.sgi.com
Subject: Re: segmentation fault during mount

On 2/7/11 5:01 AM, Ryan Roh wrote:
> Dear Members,
> 
> I'm using XFS based on STMicro SH4 based chip (STi7105).
> 
> and I have some issue on xfs log mounting.
> 

Were the errors after any sort of harsh testing of the filesystem, such as
usb disconnects or power off?

Or was this after a clean unmount?

-Eric

> 
> 1. chip : sh4 STi7105
> 
> 2. HDD : 320GB USB HDD USB 2.0 port.
> 
> 3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.
> 
> 4. XFSProgs version : 3.1.1
> 
>  
> 
> mounting and repairing log in the below. This segmentation fault is 
> caused by
> 
> the assert in xfs_alloc_increment function of xfs_alloc_btree.c file. 
> The btree
> 
> level is equal to root level in the below code.
> 
>  
> 
>     /*
> 
>      * If we went off the root then we are seriously confused.
> 
>      */
> 
>     ASSERT(lev < cur->bc_nlevels);
> 
>  

...

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: segmentation fault during mount
  2011-02-08  5:12   ` Ryan Roh
@ 2011-02-08  5:40     ` Eric Sandeen
  2011-02-08  6:06       ` Ryan Roh
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Sandeen @ 2011-02-08  5:40 UTC (permalink / raw)
  To: Ryan Roh; +Cc: xfs

On 2/7/11 11:12 PM, Ryan Roh wrote:
> Dear Eric,
> 
> I don't know how I can make correct form to answer for this thread because
> I'm newbie here. Sorry.
> 
> Anyway, this issue was happened from returned HDD from customer which was
> used our PVR STB. And our STB has toggle power switch so I think user turned
> off the power during recording something.

Ok, so you're not sure what happened to the hard drive before this, then.

Other Samsung folks have reported problems after intentionally testing the filesystem under harsh conditions such as poweroff or USB unplugs, so I just wondered...

It seems plausible to me that this could be corruption from lack of proper barrier support, and a poweroff or usb unplug (without barrier support) could cause that.

Mounting a corrupted filesystem should never oops the kernel though, so that is a bug.  If you can provide an xfs_metadump image of the filesystem, someone might be able to investigate further.

Does the mount failure persist after an xfs_repair (without using -n)?

If you wish to keep the original filesystem intact, you can make an xfs_metadump image of the filesystem, run xfs_mdrestore to create a new metadata image from that dump, run xfs_repair against that, and try to mount the result.
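As a concrete sketch of that workflow (the device and file paths below are placeholders matching this report, not commands from the thread):

```shell
# Capture only the metadata (no file contents) from the damaged filesystem.
xfs_metadump /dev/sdb3 /tmp/sdb3.metadump

# Rebuild a sparse filesystem image from the dump; /dev/sdb3 stays untouched.
xfs_mdrestore /tmp/sdb3.metadump /tmp/sdb3.img

# Repair the image, then check whether the repaired copy mounts.
xfs_repair /tmp/sdb3.img
mount -o loop /tmp/sdb3.img /mnt/test
```
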

Does samsung run with CONFIG_XFS_DEBUG enabled?  Otherwise, this:

    /*
     * If we went off the root then we are seriously confused.
     */

    ASSERT(lev < cur->bc_nlevels);

would be a no-op:

#ifndef DEBUG
#define ASSERT(expr)    ((void)0)
...

(As a side note, running with CONFIG_XFS_DEBUG in production is not recommended.)
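As a tiny standalone demonstration (not XFS code; every name here is invented for the demo) of why the check vanishes on production builds: with DEBUG undefined, ASSERT(expr) expands to ((void)0), so a violated invariant is never even evaluated.

```c
#include <stdio.h>

static int failed_checks;   /* counts tripped assertions in this demo */

static void assert_failed(const char *expr)
{
    /* The real kernel macro would report and panic; we just count. */
    fprintf(stderr, "Assertion failed: %s\n", expr);
    failed_checks++;
}

/* Same shape as the XFS macro quoted above: without DEBUG,
 * ASSERT(expr) compiles to nothing at all. */
#ifdef DEBUG
#define ASSERT(expr)  ((expr) ? (void)0 : assert_failed(#expr))
#else
#define ASSERT(expr)  ((void)0)
#endif

/* Walk "off the root": lev == nlevels violates the invariant,
 * but on a non-DEBUG build the ASSERT silently does nothing. */
static int bad_walk(int lev, int nlevels)
{
    ASSERT(lev < nlevels);
    return failed_checks;
}
```
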

However, I'm not quite sure that's what you are hitting; if you had tripped an ASSERT you should have seen "Assertion failed" in the messages.  This appears to be a null pointer dereference in xfs_free_ag_extent().

-Eric


> Thanks,
> Ryan.
>   
> 
> -----Original Message-----
> From: Eric Sandeen [mailto:sandeen@sandeen.net] 
> Sent: Tuesday, February 08, 2011 1:45 PM
> To: Ryan Roh
> Cc: xfs@oss.sgi.com
> Subject: Re: segmentation fault during mount
> 
> On 2/7/11 5:01 AM, Ryan Roh wrote:
>> Dear Members,
>>
>> I'm using XFS based on STMicro SH4 based chip (STi7105).
>>
>> and I have some issue on xfs log mounting.
>>
> 
> Were the errors after any sort of harsh testing of the filesystem, such as
> usb disconnects or power off?
> 
> Or was this after a clean unmount?
> 
> -Eric
> 
>>
>> 1. chip : sh4 STi7105
>>
>> 2. HDD : 320GB USB HDD USB 2.0 port.
>>
>> 3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.
>>
>> 4. XFSProgs version : 3.1.1
>>
>>  
>>
>> mounting and repairing log in the below. This segmentation fault is 
>> caused by
>>
>> the assert in xfs_alloc_increment function of xfs_alloc_btree.c file. 
>> The btree
>>
>> level is equal to root level in the below code.
>>
>>  
>>
>>     /*
>>
>>      * If we went off the root then we are seriously confused.
>>
>>      */
>>
>>     ASSERT(lev < cur->bc_nlevels);
>>
>>  
> 
> ...
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: segmentation fault during mount
  2011-02-08  5:40     ` Eric Sandeen
@ 2011-02-08  6:06       ` Ryan Roh
  2011-02-08 15:00         ` Eric Sandeen
  0 siblings, 1 reply; 6+ messages in thread
From: Ryan Roh @ 2011-02-08  6:06 UTC (permalink / raw)
  To: 'Eric Sandeen'; +Cc: xfs

Dear Eric,

Thank you for kind reply.

The XFS partition can be mounted after running xfs_repair with the -L option.
Actually, the debug option was turned off, so the assert is not compiled in.
Anyway, the level was equal to the root level of the btree, so if I change the
code as below, the XFS mount instead displays a message telling me to repair
the partition with xfs_repair.

    /*
     * If we went off the root then we are seriously confused.
     */
    if (lev >= cur->bc_nlevels)
        return EFSCORRUPTED;
    /* was: ASSERT(lev < cur->bc_nlevels); */
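For illustration, the corrected check can be modeled as a small standalone sketch. The struct, function name, and error constant below are simplified stand-ins, not the real XFS definitions (Linux XFS maps EFSCORRUPTED to EUCLEAN, 117 on most architectures). Note the condition is inverted relative to the ASSERT: the error path is taken when lev reaches bc_nlevels, i.e. when the walk has gone off the root.

```c
#define EFSCORRUPTED 117   /* stand-in; Linux XFS defines EFSCORRUPTED as EUCLEAN */

/* Simplified stand-in for struct xfs_btree_cur: only the field we need. */
struct btree_cur {
    int bc_nlevels;        /* number of levels in the btree */
};

/*
 * ASSERT(lev < cur->bc_nlevels) compiles out on non-DEBUG kernels, so a
 * corrupted btree silently walks off the root and dereferences garbage.
 * Returning a corruption error instead lets mount fail cleanly and ask
 * the user to run xfs_repair.
 */
static int check_off_root(const struct btree_cur *cur, int lev)
{
    if (lev >= cur->bc_nlevels)    /* went off the root: tree is corrupted */
        return EFSCORRUPTED;
    return 0;                      /* level is valid */
}
```
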


The kernel oops can be reproduced with the metadump and restored image. But
when I tested it with 2.6.33.6 (FC13), the mount failed with a "mount:
Structure needs cleaning" message.
Also, would you please let me know how I can share the metadump file with
others? It is too big to send through e-mail. Can I use an FTP server to
share it?

I also got a hint from Dave Chinner about the patch for the vmap
cache-aliasing issue, and I am trying to apply it:
"[GIT PATCH] Fix XFS to work with Virtually indexed architectures" :
http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-02/msg10227.html

Thanks,
Ryan.


-----Original Message-----
From: Eric Sandeen [mailto:sandeen@sandeen.net] 
Sent: Tuesday, February 08, 2011 2:40 PM
To: Ryan Roh
Cc: xfs@oss.sgi.com
Subject: Re: segmentation fault during mount

On 2/7/11 11:12 PM, Ryan Roh wrote:
> Dear Eric,
> 
> I don't know how I can make correct form to answer for this thread 
> because I'm newbie here. Sorry.
> 
> Anyway, this issue was happened from returned HDD from customer which 
> was used our PVR STB. And our STB has toggle power switch so I think 
> user turned off the power during recording something.

Ok, so you're not sure what happened to the hard drive before this, then.

Other Samsung folks have reported problems after intentionally testing the
filesystem under harsh conditions such as poweroff or USB unplugs, so I just
wondered...

It seems plausible to me that this could be corruption from lack of proper
barrier support, and a poweroff or usb unplug (without barrier support)
could cause that.

Mounting a corrupted filesystem should never oops the kernel though, so that
is a bug.  If you can provide an xfs_metadump image of the filesystem,
someone might be able to investigate further.

Does the mount failure persist after an xfs_repair (without using -n?)

If you wish to keep the original filesystem intact, you can make an
xfs_metadump image of the filesystem, run xfs_mdrestore to create a new
metadata image from that dump, run xfs_repair against that, and try to mount
the result.

Does samsung run with CONFIG_XFS_DEBUG enabled?  Otherwise, this:

    /*
     * If we went off the root then we are seriously confused.
     */

    ASSERT(lev < cur->bc_nlevels);

would be a no-op:

#ifndef DEBUG
#define ASSERT(expr)    ((void)0)
...

(As a side note, running with CONFIG_XFS_DEBUG in production is not
recommended.)

However, I'm not quite sure that's what you are hitting, if you tripped an
ASSERT you should have seen "Assertion failed" in the messages.  This
appears to be a null pointer dereference in xfs_free_ag_extent().

-Eric


> Thanks,
> Ryan.
>   
> 
> -----Original Message-----
> From: Eric Sandeen [mailto:sandeen@sandeen.net]
> Sent: Tuesday, February 08, 2011 1:45 PM
> To: Ryan Roh
> Cc: xfs@oss.sgi.com
> Subject: Re: segmentation fault during mount
> 
> On 2/7/11 5:01 AM, Ryan Roh wrote:
>> Dear Members,
>>
>> I'm using XFS based on STMicro SH4 based chip (STi7105).
>>
>> and I have some issue on xfs log mounting.
>>
> 
> Were the errors after any sort of harsh testing of the filesystem, 
> such as usb disconnects or power off?
> 
> Or was this after a clean unmount?
> 
> -Eric
> 
>>
>> 1. chip : sh4 STi7105
>>
>> 2. HDD : 320GB USB HDD USB 2.0 port.
>>
>> 3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.
>>
>> 4. XFSProgs version : 3.1.1
>>
>>  
>>
>> mounting and repairing log in the below. This segmentation fault is 
>> caused by
>>
>> the assert in xfs_alloc_increment function of xfs_alloc_btree.c file. 
>> The btree
>>
>> level is equal to root level in the below code.
>>
>>  
>>
>>     /*
>>
>>      * If we went off the root then we are seriously confused.
>>
>>      */
>>
>>     ASSERT(lev < cur->bc_nlevels);
>>
>>  
> 
> ...
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: segmentation fault during mount
  2011-02-08  6:06       ` Ryan Roh
@ 2011-02-08 15:00         ` Eric Sandeen
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Sandeen @ 2011-02-08 15:00 UTC (permalink / raw)
  To: Ryan Roh; +Cc: xfs

On 2/8/11 12:06 AM, Ryan Roh wrote:
> Dear Eric,
> 
> Thank you for kind reply.
> 
> The XFS partition can be mounted after xfs_repair with -L option. 

Ok.

> Actually, the debug option was turned off. So the assert is not called.
> Anyway the level was equal to root level of b-tree. So if I change the code
> like in the below then XFS mount display the message to repair partition
> with xfs_repair.
> 
>    /*
>      * If we went off the root then we are seriously confused.
>      */
>     if (lev >= cur->bc_nlevels)
>         return EFSCORRUPTED;
>     /* was: ASSERT(lev < cur->bc_nlevels); */
> 
> 
> Kernel oops can be replayed with metadump and restored image. But when I
> tested it with 2.6.33.6 (FC13) then mount failed with " mount: Structure
> needs cleaning" message.

Ok, as it should.  So this has been fixed upstream, as part of Christoph's
btree rework.  These 2 commits were part of a larger series, but they
put this particular error handling in place:

8df4da4a0a642d3a016028c0d922bcb4d5a4a6d7 [XFS] implement generic xfs_btree_decrement
637aa50f461b8ea6b1e8bf9877b0d13d00085043 [XFS] implement generic xfs_btree_increment

> And Would you please let me know how I can share the metadump file with
> others? It is too big to send through the e-mail. Can I use the FTP server
> to share it?

I think there is no need, since you have shown that the bug is fixed upstream;
I should have suggested that myself, but thanks for testing it.
 
It's always a good idea to test upstream before reporting bugs to the list for
old kernels; if bugs are fixed already there is no need to report them here or
in bugzilla.

-Eric

> And I got the hint about the patch for vmap cache aliasing issue from Dave
> Chinner and I trying to apply it. 
> "[GIT PATCH] Fix XFS to work with Virtually indexed architectures" :
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-02/msg10227.html
> 
> Thanks,
> Ryan.
> 
> 
> -----Original Message-----
> From: Eric Sandeen [mailto:sandeen@sandeen.net] 
> Sent: Tuesday, February 08, 2011 2:40 PM
> To: Ryan Roh
> Cc: xfs@oss.sgi.com
> Subject: Re: segmentation fault during mount
> 
> On 2/7/11 11:12 PM, Ryan Roh wrote:
>> Dear Eric,
>>
>> I don't know how I can make correct form to answer for this thread 
>> because I'm newbie here. Sorry.
>>
>> Anyway, this issue was happened from returned HDD from customer which 
>> was used our PVR STB. And our STB has toggle power switch so I think 
>> user turned off the power during recording something.
> 
> Ok, so you're not sure what happened to the hard drive before this, then.
> 
> Other Samsung folks have reported problems after intentionally testing the
> filesystem under harsh conditions such as poweroff or USB unplugs, so I just
> wondered...
> 
> It seems plausible to me that this could be corruption from lack of proper
> barrier support, and a poweroff or usb unplug (without barrier support)
> could cause that.
> 
> Mounting a corrupted filesystem should never oops the kernel though, so that
> is a bug.  If you can provide an xfs_metadump image of the filesystem,
> someone might be able to investigate further.
> 
> Does the mount failure persist after an xfs_repair (without using -n?)
> 
> If you wish to keep the original filesystem intact, you can make an
> xfs_metadump image of the filesystem, run xfs_mdrestore to create a new
> metadata image from that dump, run xfs_repair against that, and try to mount
> the result.
> 
> Does samsung run with CONFIG_XFS_DEBUG enabled?  Otherwise, this:
> 
>     /*
>      * If we went off the root then we are seriously confused.
>      */
> 
>     ASSERT(lev < cur->bc_nlevels);
> 
> would be a no-op:
> 
> #ifndef DEBUG
> #define ASSERT(expr)    ((void)0)
> ...
> 
> (As a side note, running with CONFIG_XFS_DEBUG in production is not
> recommended.)
> 
> However, I'm not quite sure that's what you are hitting, if you tripped an
> ASSERT you should have seen "Assertion failed" in the messages.  This
> appears to be a null pointer dereference in xfs_free_ag_extent().
> 
> -Eric
> 
> 
>> Thanks,
>> Ryan.
>>   
>>
>> -----Original Message-----
>> From: Eric Sandeen [mailto:sandeen@sandeen.net]
>> Sent: Tuesday, February 08, 2011 1:45 PM
>> To: Ryan Roh
>> Cc: xfs@oss.sgi.com
>> Subject: Re: segmentation fault during mount
>>
>> On 2/7/11 5:01 AM, Ryan Roh wrote:
>>> Dear Members,
>>>
>>> I'm using XFS based on STMicro SH4 based chip (STi7105).
>>>
>>> and I have some issue on xfs log mounting.
>>>
>>
>> Were the errors after any sort of harsh testing of the filesystem, 
>> such as usb disconnects or power off?
>>
>> Or was this after a clean unmount?
>>
>> -Eric
>>
>>>
>>> 1. chip : sh4 STi7105
>>>
>>> 2. HDD : 320GB USB HDD USB 2.0 port.
>>>
>>> 3. OS : Linux 2.6.23.17 + patch for fixing cache aliasing issue.
>>>
>>> 4. XFSProgs version : 3.1.1
>>>
>>>  
>>>
>>> mounting and repairing log in the below. This segmentation fault is 
>>> caused by
>>>
>>> the assert in xfs_alloc_increment function of xfs_alloc_btree.c file. 
>>> The btree
>>>
>>> level is equal to root level in the below code.
>>>
>>>  
>>>
>>>     /*
>>>
>>>      * If we went off the root then we are seriously confused.
>>>
>>>      */
>>>
>>>     ASSERT(lev < cur->bc_nlevels);
>>>
>>>  
>>
>> ...
>>
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-02-08 14:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-07 11:01 segmentation fault during mount Ryan Roh
2011-02-08  4:44 ` Eric Sandeen
2011-02-08  5:12   ` Ryan Roh
2011-02-08  5:40     ` Eric Sandeen
2011-02-08  6:06       ` Ryan Roh
2011-02-08 15:00         ` Eric Sandeen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox