public inbox for linux-kernel@vger.kernel.org
* kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
@ 2006-08-31  5:48 Yao Fei Zhu
  2006-08-31  7:47 ` David Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Yao Fei Zhu @ 2006-08-31  5:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: haveblue, xfs

Problem description:
Running fsstress on an XFS file system with -n 1000 and -p 1000 drops the
test box into xmon after about 3 hours, with:
kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!

Hardware Environment
    Machine type (p650, x235, SF2, etc.): B70+
    Cpu type (Power4, Power5, IA-64, etc.): POWER5+
Software Environment
    Base OS: SLES10 GM
    Kernel: 2.6.18-rc5

Additional information:
3:mon> e
cpu 0x3: Vector: 700 (Program Check) at [c0000001e16632d0]
    pc: d0000000006daa88: .__xfs_get_blocks+0x1a0/0x2a0 [xfs]
    lr: d0000000006da984: .__xfs_get_blocks+0x9c/0x2a0 [xfs]
    sp: c0000001e1663550
   msr: 8000000000029032
  current = 0xc0000001dde71310
  paca    = 0xc0000000004c4900
    pid   = 9217, comm = fsstress
kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
3:mon> t
[c0000001e1663640] c000000000108344 .__blockdev_direct_IO+0x560/0xcfc
[c0000001e1663760] d0000000006dc43c .xfs_vm_direct_IO+0xec/0x13c [xfs]
[c0000001e1663860] c0000000000a1474 .generic_file_direct_IO+0xe8/0x15c
[c0000001e1663910] c0000000000a1748 .__generic_file_aio_read+0xf4/0x22c
[c0000001e16639e0] d0000000006e4b94 .xfs_read+0x288/0x368 [xfs]
[c0000001e1663ae0] d0000000006e0750 .xfs_file_aio_read+0x88/0x9c [xfs]
[c0000001e1663b70] c0000000000d4df0 .do_sync_read+0xd4/0x130
[c0000001e1663cf0] c0000000000d5c44 .vfs_read+0x118/0x200
[c0000001e1663d90] c0000000000d6128 .sys_read+0x4c/0x8c
[c0000001e1663e30] c00000000000871c syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff4ddf8
SP (fc38ec90) is in userspace
3:mon> r
R00 = 0000000000000004   R16 = 0000000000000003
R01 = c0000001e1663550   R17 = 0000000000020000
R02 = d000000000727dc0   R18 = c0000001adf18e90
R03 = 0000000000000000   R19 = 000000000000000c
R04 = 0000000000000001   R20 = c0000001e1663b50
R05 = c0000001e1663610   R21 = 0000000000000000
R06 = c0000001e1663620   R22 = c0000001d99cfe88
R07 = ffffffffffffffff   R23 = 000000000000000c
R08 = 0000000000000000   R24 = 0000000000000000
R09 = 0000000000000001   R25 = 0000000000000001
R10 = c00000003173f630   R26 = 000000000010c000
R11 = 0000000000000000   R27 = c0000001adf18e90
R12 = d0000000006e8a78   R28 = 0000000000044000
R13 = c0000000004c4900   R29 = c0000001e16635c8
R14 = c0000001e1663be0   R30 = c000000000517378
R15 = 0000000000000000   R31 = c0000001d99cfbf0
pc  = d0000000006daa88 .__xfs_get_blocks+0x1a0/0x2a0 [xfs]
lr  = d0000000006da984 .__xfs_get_blocks+0x9c/0x2a0 [xfs]
msr = 8000000000029032   cr  = 42000422
ctr = c00000000007d12c   xer = 0000000020000001   trap =  700
3:mon> di d0000000006daa88
d0000000006daa88  0b190000      tdnei   r25,0
d0000000006daa8c  2fb80000      cmpdi   cr7,r24,0
d0000000006daa90  419e004c      beq     cr7,d0000000006daadc    # .__xfs_get_blocks+0x1f4/0x2a0 [xfs]
d0000000006daa94  38000001      li      r0,1
d0000000006daa98  7d20f8a8      ldarx   r9,r0,r31
d0000000006daa9c  7d290378      or      r9,r9,r0
d0000000006daaa0  7d20f9ad      stdcx.  r9,r0,r31
d0000000006daaa4  40a2fff4      bne     d0000000006daa98        # .__xfs_get_blocks+0x1b0/0x2a0 [xfs]
d0000000006daaa8  38000020      li      r0,32
d0000000006daaac  60000000      nop
d0000000006daab0  7d20f8a8      ldarx   r9,r0,r31
d0000000006daab4  7d290378      or      r9,r9,r0
d0000000006daab8  7d20f9ad      stdcx.  r9,r0,r31
d0000000006daabc  40a2fff4      bne     d0000000006daab0        # .__xfs_get_blocks+0x1c8/0x2a0 [xfs]
d0000000006daac0  38000200      li      r0,512
d0000000006daac4  60000000      nop
3:mon> mi
Mem-info:
Node 0 DMA per-cpu:
cpu 0 hot: high 6, batch 1 used:5
cpu 0 cold: high 2, batch 1 used:0
cpu 1 hot: high 6, batch 1 used:5
cpu 1 cold: high 2, batch 1 used:0
cpu 2 hot: high 6, batch 1 used:0
cpu 2 cold: high 2, batch 1 used:0
cpu 3 hot: high 6, batch 1 used:0
cpu 3 cold: high 2, batch 1 used:0
Node 0 DMA32 per-cpu: empty
Node 0 Normal per-cpu: empty
Node 0 HighMem per-cpu: empty
Free pages:       36288kB (0kB HighMem)
Active:25944 inactive:88655 dirty:13 writeback:11 unstable:0 free:567 slab:9664 mapped:468 pagetables:3094
Node 0 DMA free:36288kB min:11584kB low:14464kB high:17344kB active:1660416kB inactive:5673920kB present:8388608kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 HighMem free:0kB min:2048kB low:2048kB high:2048kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 73*64kB 109*128kB 15*256kB 3*512kB 0*1024kB 2*2048kB 0*4096kB 1*8192kB 0*16384kB = 36288kB
Node 0 DMA32: empty
Node 0 Normal: empty
Node 0 HighMem: empty
Swap cache: add 3054, delete 29, find 2/4, race 0+0
Free swap  = 3932928kB
Total swap = 4127296kB
Free swap:       3932928kB
131072 pages of RAM
601 reserved pages
107626 pages shared
3025 pages swap cached
3:mon>
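
The trapping instruction in the dump above can be checked by hand. What follows is only a rough sketch, not part of the original report: it decodes the word at pc (0b190000) as a PowerPC trap-doubleword-immediate and evaluates the trap condition against the R25 value from the register dump. The helper names decode_tdi and tdi_traps are invented for illustration; the bit layout is the standard D-form encoding with primary opcode 2.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit register value as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def decode_tdi(word):
    """Split a 32-bit tdi instruction word into (TO, RA, SI)."""
    if (word >> 26) & 0x3F != 2:       # primary opcode 2 is tdi
        raise ValueError("not a tdi instruction")
    to = (word >> 21) & 0x1F           # trap condition bits
    ra = (word >> 16) & 0x1F           # register being compared
    si = word & 0xFFFF                 # 16-bit immediate
    if si & 0x8000:                    # sign-extend the immediate
        si -= 0x10000
    return to, ra, si


def tdi_traps(to, reg_value, si):
    """Return True if the trap condition selected by TO holds."""
    a, b = to_signed64(reg_value), si
    ua, ub = reg_value & MASK64, si & MASK64
    # TO bits: 16 = signed <, 8 = signed >, 4 = equal,
    #          2 = unsigned <, 1 = unsigned >
    return bool(
        (to & 16 and a < b)
        or (to & 8 and a > b)
        or (to & 4 and a == b)
        or (to & 2 and ua < ub)
        or (to & 1 and ua > ub)
    )


to, ra, si = decode_tdi(0x0B190000)
print(to, ra, si)                      # TO=24 is "less than or greater than", i.e. tdnei
print(tdi_traps(to, 0x1, si))          # R25 = 1 in the register dump

# Start of the function, from pc and the +0x1a0 offset in the backtrace.
function_base = 0xD0000000006DAA88 - 0x1A0
print(hex(function_base))
```

With R25 = 1 the not-equal-to-zero condition holds, which is consistent with the Program Check (vector 700) reported by xmon.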




* Re: kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
  2006-08-31  5:48 kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293! Yao Fei Zhu
@ 2006-08-31  7:47 ` David Chinner
  2006-08-31  8:02   ` Yao Fei Zhu
  0 siblings, 1 reply; 7+ messages in thread
From: David Chinner @ 2006-08-31  7:47 UTC (permalink / raw)
  To: Yao Fei Zhu; +Cc: linux-kernel, haveblue, xfs

On Thu, Aug 31, 2006 at 01:48:55PM +0800, Yao Fei Zhu wrote:
> Problem description:
> Running fsstress on an XFS file system with -n 1000 and -p 1000 drops the
> test box into xmon after about 3 hours, with:
> kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
> 
> Hardware Environment
>    Machine type (p650, x235, SF2, etc.): B70+
>    Cpu type (Power4, Power5, IA-64, etc.): POWER5+
> Software Environment
>    Base OS: SLES10 GM
>    Kernel: 2.6.18-rc5
> 
> Additional information:
> 3:mon> e
> cpu 0x3: Vector: 700 (Program Check) at [c0000001e16632d0]
>    pc: d0000000006daa88: .__xfs_get_blocks+0x1a0/0x2a0 [xfs]
>    lr: d0000000006da984: .__xfs_get_blocks+0x9c/0x2a0 [xfs]
>    sp: c0000001e1663550
>   msr: 8000000000029032
>  current = 0xc0000001dde71310
>  paca    = 0xc0000000004c4900
>    pid   = 9217, comm = fsstress
> kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
> 3:mon> t
> [c0000001e1663640] c000000000108344 .__blockdev_direct_IO+0x560/0xcfc
> [c0000001e1663760] d0000000006dc43c .xfs_vm_direct_IO+0xec/0x13c [xfs]
> [c0000001e1663860] c0000000000a1474 .generic_file_direct_IO+0xe8/0x15c
> [c0000001e1663910] c0000000000a1748 .__generic_file_aio_read+0xf4/0x22c
> [c0000001e16639e0] d0000000006e4b94 .xfs_read+0x288/0x368 [xfs]
> [c0000001e1663ae0] d0000000006e0750 .xfs_file_aio_read+0x88/0x9c [xfs]
> [c0000001e1663b70] c0000000000d4df0 .do_sync_read+0xd4/0x130
> [c0000001e1663cf0] c0000000000d5c44 .vfs_read+0x118/0x200
> [c0000001e1663d90] c0000000000d6128 .sys_read+0x4c/0x8c
> [c0000001e1663e30] c00000000000871c syscall_exit+0x0/0x40

Hmmmm. We've mapped a range that has been reserved for a delayed
allocate extent during a direct I/O. That should not happen as XFS
flushes delalloc extents before executing a direct read and holds
the I/O lock which will prevent any new writes from mapping new
delalloc extents. Something went astray, though. :(
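
The invariant described above can be modelled in a few lines. This is a toy sketch only, not the XFS code: Mapping, get_blocks, direct_read and the flush_first switch are hypothetical names, and the "flush" is simulated by converting every delalloc mapping to a real one. It shows why a delalloc mapping should never reach the block-mapping step of a direct read, and what it looks like when it does.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    # True if the range is reserved for delayed allocation
    # (space promised, but no disk blocks assigned yet).
    delalloc: bool


class DelallocDuringDirectIO(AssertionError):
    """Stands in for the kernel BUG seen in the report."""


def get_blocks(mapping, direct):
    """Map one range; direct I/O needs real disk blocks to address."""
    if mapping.delalloc and direct:
        raise DelallocDuringDirectIO("delalloc extent mapped during direct I/O")
    return "delay" if mapping.delalloc else "mapped"


def direct_read(extents, flush_first=True):
    """Model a direct read: flush delalloc extents, then map each range."""
    if flush_first:
        # Flushing allocates real blocks for every delalloc extent.
        extents = [Mapping(delalloc=False) for _ in extents]
    return [get_blocks(m, direct=True) for m in extents]


print(direct_read([Mapping(delalloc=True), Mapping(delalloc=False)]))

try:
    direct_read([Mapping(delalloc=True)], flush_first=False)
except DelallocDuringDirectIO as exc:
    print("BUG:", exc)
```

In this model the failure only appears when the flush is skipped or a new delalloc mapping shows up after it, which is the situation the I/O lock is supposed to rule out.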

Can you give me some more detail on the machine you're running?
e.g. How many CPUs, RAM and what type of disk subsystem you are using?
That will make it easier for us to try to reproduce this problem.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
  2006-08-31  7:47 ` David Chinner
@ 2006-08-31  8:02   ` Yao Fei Zhu
  2006-08-31  8:17     ` David Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Yao Fei Zhu @ 2006-08-31  8:02 UTC (permalink / raw)
  To: David Chinner; +Cc: linux-kernel, haveblue, xfs

David Chinner wrote:

>
>Hmmmm. We've mapped a range that has been reserved for a delayed
>allocate extent during a direct I/O. That should not happen as XFS
>flushes delalloc extents before executing a direct read and holds
>the I/O lock which will prevent any new writes from mapping new
>delalloc extents. Something went astray, though. :(
>
>Can you give me some more detail on the machine you're running?
>e.g. How many CPUs, RAM and what type of disk subsystem you are using?
>That will make it easier for us to try to reproduce this problem.
>
>Cheers,
>
>Dave.
>  
>
The test box is an IBM System p5 Linux partition, allocated 0.8 physical
POWER5+ processing units, 2 virtual processors, and 8GB of memory.
The disk is exported by an AIX Virtual I/O Server.

BTW, I have CONFIG_PPC_64K_PAGES enabled.
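
The 64K page size is also visible in the xmon mem-info output from the original report. As a rough cross-check (a sketch using only numbers copied from that dump; the variable names are made up here), the page counts multiplied by 64 kB reproduce the kB figures exactly:

```python
PAGE_KB = 64  # assumption under test: CONFIG_PPC_64K_PAGES, i.e. 64 kB pages

# Page counts copied from the "Active:... free:..." and "pages of RAM" lines.
free_pages = 567
active_pages = 25944
inactive_pages = 88655
ram_pages = 131072

# kB figures copied from the "Node 0 DMA" line.
reported_kb = {
    "free": 36288,
    "active": 1660416,
    "inactive": 5673920,
    "present": 8388608,
}

computed_kb = {
    "free": free_pages * PAGE_KB,
    "active": active_pages * PAGE_KB,
    "inactive": inactive_pages * PAGE_KB,
    "present": ram_pages * PAGE_KB,
}

# Buddy free lists from the "Node 0 DMA: 73*64kB ..." line: block size -> count.
buddy = {64: 73, 128: 109, 256: 15, 512: 3, 1024: 0,
         2048: 2, 4096: 0, 8192: 1, 16384: 0}
buddy_free_kb = sum(size * count for size, count in buddy.items())

for key in reported_kb:
    print(key, computed_kb[key], computed_kb[key] == reported_kb[key])
print("buddy total", buddy_free_kb)
```

The "present" figure works out to 8 GiB, matching the partition's memory allocation, and the smallest buddy block is 64 kB, as expected when a single page is 64 kB.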
 



* Re: kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
  2006-08-31  8:02   ` Yao Fei Zhu
@ 2006-08-31  8:17     ` David Chinner
  2006-08-31  8:54       ` Andrew Morton
  2006-09-01  6:34       ` Yao Fei Zhu
  0 siblings, 2 replies; 7+ messages in thread
From: David Chinner @ 2006-08-31  8:17 UTC (permalink / raw)
  To: Yao Fei Zhu; +Cc: David Chinner, linux-kernel, haveblue, xfs

On Thu, Aug 31, 2006 at 04:02:36PM +0800, Yao Fei Zhu wrote:
> David Chinner wrote:
> >Hmmmm. We've mapped a range that has been reserved for a delayed
> >allocate extent during a direct I/O. That should not happen as XFS
> >flushes delalloc extents before executing a direct read and holds
> >the I/O lock which will prevent any new writes from mapping new
> >delalloc extents. Something went astray, though. :(
> >
> >Can you give me some more detail on the machine you're running?
> >e.g. How many CPUs, RAM and what type of disk subsystem you are using?
> >That will make it easier for us to try to reproduce this problem.
>
> The test box is an IBM System p5 Linux partition, allocated 0.8 physical
> POWER5+ processing units, 2 virtual processors, and 8GB of memory.
> The disk is exported by an AIX Virtual I/O Server.

Nothing too unusual there.

> BTW, I have CONFIG_PPC_64K_PAGES enabled.

But that might be a good place to start. Can you see if you can
reproduce the problem without this config option set?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
  2006-08-31  8:17     ` David Chinner
@ 2006-08-31  8:54       ` Andrew Morton
  2006-09-01  3:43         ` Yao Fei Zhu
  2006-09-01  6:34       ` Yao Fei Zhu
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2006-08-31  8:54 UTC (permalink / raw)
  To: David Chinner; +Cc: Yao Fei Zhu, linux-kernel, haveblue, xfs

On Thu, 31 Aug 2006 18:17:26 +1000
David Chinner <dgc@sgi.com> wrote:

> > BTW, I have CONFIG_PPC_64K_PAGES enabled.
> 
> But that might be a good place to start. Can you see if you can
> reproduce the problem without this config option set?

It would be useful to compare the compiler warning output for 64k pages
versus that for smaller pages.

Several quite worrisome-looking warnings are emitted from various parts of
the kernel with 64k pages. They relate to arithmetic on short types.
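
The message does not say which warnings are meant, so the following is only a generic illustration of the hazard, with made-up names: a 64k page size is 65536, which is one more than a 16-bit unsigned field can hold, so values that fit comfortably with 4k pages wrap to zero, and comparisons against the page size become constant.

```python
U16_MAX = 0xFFFF


def as_u16(value):
    """Model storing a value in a 16-bit unsigned type (wraps modulo 2**16)."""
    return value & U16_MAX


PAGE_SIZE_4K = 4096
PAGE_SIZE_64K = 65536

# With 4k pages the page size survives a round trip through a u16.
print(as_u16(PAGE_SIZE_4K))

# With 64k pages it wraps to zero.
print(as_u16(PAGE_SIZE_64K))

# An offset of "one page" computed in a u16 silently becomes no offset at all.
print(as_u16(0 + PAGE_SIZE_64K))

# No u16 value can exceed a 64k page size, so a check such as
# "field > PAGE_SIZE" can never fire; it still can with 4k pages.
print(U16_MAX > PAGE_SIZE_64K, U16_MAX > PAGE_SIZE_4K)
```

The last line is the shape of check that a compiler can report as always false once the page size grows past the range of the short type.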


* Re: kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
  2006-08-31  8:54       ` Andrew Morton
@ 2006-09-01  3:43         ` Yao Fei Zhu
  0 siblings, 0 replies; 7+ messages in thread
From: Yao Fei Zhu @ 2006-09-01  3:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David Chinner, linux-kernel, haveblue, xfs

Andrew Morton wrote:

>On Thu, 31 Aug 2006 18:17:26 +1000
>David Chinner <dgc@sgi.com> wrote:
>
>>>BTW, I have CONFIG_PPC_64K_PAGES enabled.
>>>
>>But that might be a good place to start. Can you see if you can
>>reproduce the problem without this config option set?
>>
>It would be useful to compare the compiler warning output for 64k pages
>versus that for smaller pages.
>
>Several quite worrisome-looking warnings are emitted from various parts of
>the kernel with 64k pages. They relate to arithmetic on short types.
>
1. the config diff
blade10:/boot # diff config-2.6.18-rc5-ppc64 config-2.6.18-rc5-ppc64.64kp
4c4
< # Thu Aug 31 18:25:42 2006
---
 > # Thu Aug 31 21:18:52 2006
51c51
< CONFIG_LOCALVERSION="-ppc64"
---
 > CONFIG_LOCALVERSION="-ppc64.64kp"
173c173
< CONFIG_FORCE_MAX_ZONEORDER=13
---
 > CONFIG_FORCE_MAX_ZONEORDER=9
204c204
< # CONFIG_PPC_64K_PAGES is not set
---
 > CONFIG_PPC_64K_PAGES=y

2. the compiler warning diff
ltctest:~ # diff 4k.warning 64k.warning
0a1,5
 > kernel/power/pm.c:205: warning: ‘pm_register’ is deprecated (declared at kernel/power/pm.c:64)
 > kernel/power/pm.c:205: warning: ‘pm_register’ is deprecated (declared at kernel/power/pm.c:64)
 > kernel/power/pm.c:206: warning: ‘pm_send_all’ is deprecated (declared at kernel/power/pm.c:180)
 > kernel/power/pm.c:206: warning: ‘pm_send_all’ is deprecated (declared at kernel/power/pm.c:180)
 > fs/bio.c:169: warning: ‘idx’ may be used uninitialized in this function
8,13d12
< fs/bio.c:169: warning: ‘idx’ may be used uninitialized in this function
< kernel/power/pm.c:205: warning: ‘pm_register’ is deprecated (declared at kernel/power/pm.c:64)
< kernel/power/pm.c:205: warning: ‘pm_register’ is deprecated (declared at kernel/power/pm.c:64)
< kernel/power/pm.c:206: warning: ‘pm_send_all’ is deprecated (declared at kernel/power/pm.c:180)
< kernel/power/pm.c:206: warning: ‘pm_send_all’ is deprecated (declared at kernel/power/pm.c:180)
< fs/eventpoll.c:500: warning: ‘fd’ may be used uninitialized in this function
17a17,27
 > fs/eventpoll.c:500: warning: ‘fd’ may be used uninitialized in this function
 > fs/fat/inode.c:1227: warning: comparison is always false due to limited range of data type
 > fs/hfs/btree.c:243: warning: comparison is always false due to limited range of data type
 > fs/hfsplus/btree.c:235: warning: comparison is always false due to limited range of data type
 > fs/ocfs2/vote.c:774: warning: ‘response’ may be used uninitialized in this function
 > fs/ocfs2/dlm/dlmdomain.c:70: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘int’
 > fs/ocfs2/dlm/dlmdomain.c:70: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘int’
 > fs/ocfs2/dlm/dlmdomain.c:70: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘int’
 > fs/ocfs2/dlm/dlmdomain.c:918: warning: ‘response’ may be used uninitialized in this function
 > fs/udf/balloc.c:751: warning: ‘goal_eloc.logicalBlockNum’ may be used uninitialized in this function
 > fs/udf/super.c:1364: warning: ‘ino.partitionReferenceNum’ may be used uninitialized in this function
56a67,68
 > drivers/usb/core/devio.c:620: warning: comparison is always false due to limited range of data type
 > drivers/net/r8169.c:2131: warning: ‘txd’ may be used uninitialized in this function
59d70
< drivers/net/r8169.c:2131: warning: ‘txd’ may be used uninitialized in this function
70,73c81
< fs/ocfs2/vote.c:774: warning: ‘response’ may be used uninitialized in this function
< fs/ocfs2/dlm/dlmdomain.c:918: warning: ‘response’ may be used uninitialized in this function
< fs/udf/balloc.c:751: warning: ‘goal_eloc.logicalBlockNum’ may be used uninitialized in this function
< fs/udf/super.c:1364: warning: ‘ino.partitionReferenceNum’ may be used uninitialized in this function
---
 > net/key/af_key.c:403: warning: comparison is always false due to limited range of data type
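
Most of the diff above is the same warnings appearing in a different order, since the two builds did not emit them in the same sequence. One way to separate real additions from reordering is to compare the logs as multisets. The sketch below is an illustration under stated assumptions, not a tool used in this thread: new_warnings is a made-up helper, and the two lists contain only the file:line locations that appear in the diff hunks above, not the full warning logs.

```python
from collections import Counter


def new_warnings(before, after):
    """Return (warning, extra_count) pairs that occur more often in `after`."""
    extra = Counter(after) - Counter(before)   # keeps positive counts only
    return sorted(extra.items())


# Locations from the "<" lines (4k build), abbreviated to file:line.
warnings_4k = [
    "fs/bio.c:169",
    "kernel/power/pm.c:205", "kernel/power/pm.c:205",
    "kernel/power/pm.c:206", "kernel/power/pm.c:206",
    "fs/eventpoll.c:500",
    "drivers/net/r8169.c:2131",
    "fs/ocfs2/vote.c:774",
    "fs/ocfs2/dlm/dlmdomain.c:918",
    "fs/udf/balloc.c:751",
    "fs/udf/super.c:1364",
]

# Locations from the ">" lines (64k build), abbreviated to file:line.
warnings_64k = [
    "kernel/power/pm.c:205", "kernel/power/pm.c:205",
    "kernel/power/pm.c:206", "kernel/power/pm.c:206",
    "fs/bio.c:169",
    "fs/eventpoll.c:500",
    "fs/fat/inode.c:1227",
    "fs/hfs/btree.c:243",
    "fs/hfsplus/btree.c:235",
    "fs/ocfs2/vote.c:774",
    "fs/ocfs2/dlm/dlmdomain.c:70", "fs/ocfs2/dlm/dlmdomain.c:70",
    "fs/ocfs2/dlm/dlmdomain.c:70",
    "fs/ocfs2/dlm/dlmdomain.c:918",
    "fs/udf/balloc.c:751",
    "fs/udf/super.c:1364",
    "drivers/usb/core/devio.c:620",
    "drivers/net/r8169.c:2131",
    "net/key/af_key.c:403",
]

for location, count in new_warnings(warnings_4k, warnings_64k):
    print(location, count)
```

On these inputs the only locations unique to the 64k build are the five "comparison is always false" warnings and the repeated format warning in fs/ocfs2/dlm/dlmdomain.c; none of them is in fs/xfs.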




* Re: kernel BUG in __xfs_get_blocks at fs/xfs/linux-2.6/xfs_aops.c:1293!
  2006-08-31  8:17     ` David Chinner
  2006-08-31  8:54       ` Andrew Morton
@ 2006-09-01  6:34       ` Yao Fei Zhu
  1 sibling, 0 replies; 7+ messages in thread
From: Yao Fei Zhu @ 2006-09-01  6:34 UTC (permalink / raw)
  To: David Chinner; +Cc: linux-kernel, xfs

David Chinner wrote:

>But that might be a good place to start. Can you see if you can
>reproduce the problem without this config option set?
>  
>
No, I can't reproduce this problem without the CONFIG_PPC_64K_PAGES
config option set; the fsstress testcase works fine.


