public inbox for linux-kernel@vger.kernel.org
* XFS Bug null pointer dereference in xfs_free_ag_extent
@ 2006-07-20  6:59 Jan Dittmer
  2006-07-29  7:19 ` kernel
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Dittmer @ 2006-07-20  6:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: xfs-masters, xfs

I got the following oops from XFS. Afterwards, lots of processes were
stuck in D state, probably trying to read the partition in question.
The kernel is 2.6.18-rc2.

[196027.687020] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000060
[196027.687216]  printing eip:
[196027.687273] c01acc00
[196027.687275] *pde = 00000000
[196027.687337] Oops: 0000 [#1]
[196027.687395] SMP
[196027.687458] Modules linked in: rfcomm l2cap bluetooth nfsd exportfs 
lockd nfs_acl sunrpc pppoe pppox ipv6 ppp_generic slhc twofish serpent 
aes blowfish sha256 crypto_null ipt_LOG ipt_recent ipt_TCPMSS xt_tcpmss 
xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat ip_tables 
x_tables dm_mod ip_nat_ftp ip_nat ip_conntrack_ftp ip_conntrack 
nfnetlink tun vfat fat loop lp eeprom i2c_dev i2c_isa usb_storage button 
processor ac e100 snd_seq_dummy snd_seq_oss snd_seq_midi 
snd_seq_midi_event snd_seq cx88_dvb cx88_vp3054_i2c mt352 dvb_pll 
or51132 video_buf_dvb dvb_core nxt200x isl6421 zl10353 cx24123 lgdt330x 
cx22702 cx8802 snd_via82xx firmware_class snd_ac97_codec cx2341x 
snd_ac97_bus cx88xx snd_pcm_oss ir_common snd_mixer_oss video_buf 
tveeprom compat_ioctl32 snd_pcm snd_timer snd_page_alloc snd_mpu401_uart 
via_agp btcx_risc snd_rawmidi snd_seq_device videodev agpgart 
v4l1_compat snd ehci_hcd via_rhine v4l2_common uhci_hcd soundcore 
usbcore parport_pc parport floppy rtc
[196027.690285] CPU:    0
[196027.690286] EIP:    0060:[<c01acc00>]    Not tainted VLI
[196027.690288] EFLAGS: 00210293   (2.6.18-rc2-ds666-via #9)
[196027.690545] EIP is at xfs_btree_init_cursor+0x2f/0x171
[196027.690645] eax: d42b3834   ebx: de835000   ecx: d42b3834   edx: 0000008c
[196027.690771] esi: 00000000   edi: cb701038   ebp: 00000000   esp: cfb20c68
[196027.690896] ds: 007b   es: 007b   ss: 0068
[196027.690978] Process imap (pid: 14978, ti=cfb20000 task=d4d5a570 task.ti=cfb20000)
[196027.691119] Stack: 00000000 00000017 cb701038 00000017 c0193c67 00000005 00000000 00000000
[196027.691389]        00000000 00000005 00000000 cb701038 cd848f04 de835000 0000007a 00000000
[196027.692097]        0004e1d8 df2e18e0 de835000 df2e18e0 c01c9645 00000000 00000017 cb701038
[196027.692805] Call Trace:
[196027.693104]  [<c0193c67>] xfs_free_ag_extent+0x32/0x5e2
[196027.693445]  [<c01c9645>] xlog_grant_push_ail+0x30/0xfe
[196027.693771]  [<c01954a7>] xfs_free_extent+0xbc/0xd9
[196027.694094]  [<c01c9773>] xfs_log_reserve+0x60/0x5a8
[196027.694436]  [<c01b9376>] xfs_efd_init+0x2f/0x5a
[196027.694741]  [<c01a35c8>] xfs_bmap_finish+0xe6/0x167
[196027.695070]  [<c01d19ab>] xfs_rename+0x866/0xa33
[196027.695412]  [<c01e3d2d>] xfs_vn_rename+0x24/0x64
[196027.695707]  [<c0162d39>] mntput_no_expire+0x11/0x5d
[196027.696029]  [<c01594d1>] link_path_walk+0xb3/0xbd
[196027.696356]  [<c013979b>] pagevec_lookup_tag+0x1b/0x22
[196027.696681]  [<c013bd2a>] kstrdup+0x26/0x60
[196027.696993]  [<c01580bb>] vfs_rename+0x1b6/0x2ef
[196027.697313]  [<c015834d>] __lookup_hash+0x4a/0xc5
[196027.697632]  [<c01599a0>] sys_renameat+0x155/0x1b9
[196027.697961]  [<c013979b>] pagevec_lookup_tag+0x1b/0x22
[196027.698281]  [<c013493b>] wait_on_page_writeback_range+0xa6/0xf1
[196027.698637]  [<c01e1ba5>] xfs_file_fsync+0x3f/0x48
[196027.698953]  [<c0159a15>] sys_rename+0x11/0x15
[196027.699265]  [<c0102795>] sysenter_past_esp+0x56/0x79
[196027.699600] Code: 89 d7 ba 01 00 00 00 56 53 89 c3 8b 74 24 18 a1 c8 86 4a c0 e8 2b 13 03 00 83 fe 02 89 c1 74 16 72 09 31 c0 83 fe 03 75 78 eb 51 <8b> 45 60 8b 44 b0 1c 0f c8 eb 6b 83 7c 24 20 00 75 09 8b 44 24
[196027.701445] EIP: [<c01acc00>] xfs_btree_init_cursor+0x2f/0x171 SS:ESP 0068:cfb20c68
[196027.705801]

Jan
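
The fault address in the oops can be cross-checked against the Code: line.
The byte marked <8b> starts the faulting instruction, and "8b 45 60" is the
x86 encoding of a 32-bit load from [ebp + 0x60]. With ebp shown as 00000000
in the register dump, that load touches address 0x60, which matches the
reported "virtual address 00000060" and suggests a NULL structure pointer
whose field at offset 0x60 was being read. The following is only a minimal
sketch of that decoding step; the names decode_mov_load and load_address
are made up for illustration, and the decoder handles just this one
instruction form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register indices follow the x86 ModRM order: 0 = eax, 5 = ebp, ... */
struct mov_load {
    int dst;        /* destination register index */
    int base;       /* base register index */
    int8_t disp;    /* signed 8-bit displacement */
};

/*
 * Decode "8b /r" (mov r32, r/m32) for the mod=01 form only, that is, a
 * load from [base + disp8] without a SIB byte.  Returns 0 on success and
 * -1 if the bytes use any other encoding.
 */
int decode_mov_load(const uint8_t *code, struct mov_load *out)
{
    uint8_t modrm;

    assert(code != NULL && out != NULL);
    if (code[0] != 0x8b)
        return -1;
    modrm = code[1];
    if ((modrm >> 6) != 1)      /* mod must be 01: a disp8 follows */
        return -1;
    if ((modrm & 7) == 4)       /* rm=100 selects a SIB byte; not handled */
        return -1;
    out->dst = (modrm >> 3) & 7;
    out->base = modrm & 7;
    out->disp = (int8_t)code[2];
    return 0;
}

/* Effective address of the load, given the value held in the base register. */
uint32_t load_address(const struct mov_load *m, uint32_t base_value)
{
    assert(m != NULL);
    return base_value + (uint32_t)(int32_t)m->disp;
}
```

Feeding it the three bytes 8b 45 60 gives destination eax, base ebp and
displacement 0x60; with a base value of 0 the computed address is 0x60.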


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-20  6:59 XFS Bug null pointer dereference in xfs_free_ag_extent Jan Dittmer
@ 2006-07-29  7:19 ` kernel
  2006-07-29  7:49   ` Jan Dittmer
  0 siblings, 1 reply; 17+ messages in thread
From: kernel @ 2006-07-29  7:19 UTC (permalink / raw)
  To: Jan Dittmer, linux-kernel

I have the same problem, but there does not seem to be a patch for it yet.


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-29  7:19 ` kernel
@ 2006-07-29  7:49   ` Jan Dittmer
  2006-07-29  8:10     ` kernel
  2006-07-30 23:44     ` Nathan Scott
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Dittmer @ 2006-07-29  7:49 UTC (permalink / raw)
  To: kernel; +Cc: linux-kernel, xfs

kernel wrote:
> I have the same problem, but there does not seem to be a patch for it yet.

No, I got zero feedback, but let's cc the correct
mailing list. I also filed bug 6877 at kernel.org.

Regards,

Jan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-29  7:49   ` Jan Dittmer
@ 2006-07-29  8:10     ` kernel
  2006-07-30 23:44     ` Nathan Scott
  1 sibling, 0 replies; 17+ messages in thread
From: kernel @ 2006-07-29  8:10 UTC (permalink / raw)
  To: Jan Dittmer, linux-kernel

I've seen some XFS issue in git7, which was released a moment ago. I'll
test it.


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-29  7:49   ` Jan Dittmer
  2006-07-29  8:10     ` kernel
@ 2006-07-30 23:44     ` Nathan Scott
  2006-07-31  1:30       ` kernel
                         ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Nathan Scott @ 2006-07-30 23:44 UTC (permalink / raw)
  To: Jan Dittmer, kernel; +Cc: linux-kernel, xfs

Hi there,

On Sat, Jul 29, 2006 at 09:49:23AM +0200, Jan Dittmer wrote:
> kernel wrote:
> > I have the same problem, but there does not seem to be a patch for it yet.
> 
> No, I got zero feedback, but let's cc the correct
> mailing list. I also filed bug 6877 at kernel.org
> 

Is this easily reproducible for you?  I've not seen it before, and
the only possibly related recent changes I can think of are these:

http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e63a3690013a475746ad2cea998ebb534d825704

http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d210a28cd851082cec9b282443f8cc0e6fc09830

Could you try reverting each of those to see if either is the cause?

thanks.

-- 
Nathan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-30 23:44     ` Nathan Scott
@ 2006-07-31  1:30       ` kernel
  2006-07-31  6:21       ` kernel
  2006-07-31  6:58       ` Jan Dittmer
  2 siblings, 0 replies; 17+ messages in thread
From: kernel @ 2006-07-31  1:30 UTC (permalink / raw)
  To: Nathan Scott, jdi; +Cc: linux-kernel

My hardware is a Dell 6650 with FLX380 SAN storage, connected through a
qlogic2422 FC card.
The other machine is a Dell 2850 with the same card.
I can easily reproduce this error with bonnie++, especially when bonnie++
deletes small files and small directories.



* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-30 23:44     ` Nathan Scott
  2006-07-31  1:30       ` kernel
@ 2006-07-31  6:21       ` kernel
  2006-07-31  6:55         ` Nathan Scott
  2006-07-31  6:58       ` Jan Dittmer
  2 siblings, 1 reply; 17+ messages in thread
From: kernel @ 2006-07-31  6:21 UTC (permalink / raw)
  To: Nathan Scott, jdi; +Cc: linux-kernel

I tested again; the result is very strange.
I can easily reproduce it on XFS with the SAN (FLX380) connected through a
qlogic 2400 FC card.

Everything works when I use ext3/reiser4/reiserfs with the
FLX380 and qlogic 2400.
Everything also works when I use XFS with ordinary SCSI storage or
RAID storage (not a SAN).



* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-31  6:21       ` kernel
@ 2006-07-31  6:55         ` Nathan Scott
  2006-07-31  7:28           ` kernel
  0 siblings, 1 reply; 17+ messages in thread
From: Nathan Scott @ 2006-07-31  6:55 UTC (permalink / raw)
  To: kernel; +Cc: jdi, linux-kernel

On Mon, Jul 31, 2006 at 02:21:10PM +0800, kernel wrote:
> I tested again; the result is very strange.
> I can easily reproduce it on XFS with the SAN (FLX380) connected through a
> qlogic 2400 FC card.

Eggshellent... can you reproduce it with each of those changes
(below) backed out of your tree please?  Else, git bisect is our
next best bet.  Thanks!

> > Is this easily reproducible for you?  I've not seen it before, and
> > the only possibly related recent changes I can think of are these:
> >
> > http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e63a3690013a475746ad2cea998ebb534d825704
> >
> > http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d210a28cd851082cec9b282443f8cc0e6fc09830
> >
> > Could you try reverting each of those to see if either is the cause?

cheers.

-- 
Nathan
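
The idea behind the git bisect suggestion can be sketched in a few lines.
This is not git's implementation, only a binary search over a linear
history, assuming one known-good revision, one known-bad revision, and
that every revision after the first bad one also shows the bug. The names
first_bad_revision and bad_from_threshold are hypothetical; in practice
the test callback would stand for "build that kernel and run bonnie++".

```c
#include <assert.h>
#include <stddef.h>

/* Caller-supplied test: returns nonzero if revision `rev` shows the bug. */
typedef int (*is_bad_fn)(size_t rev, void *ctx);

/*
 * Return the first bad revision in the range (good, bad].  Each loop
 * iteration halves the range, so about log2(bad - good) tests are needed.
 */
size_t first_bad_revision(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
    assert(is_bad != NULL && good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;      /* bug already present: culprit is at or before mid */
        else
            good = mid;     /* still fine: culprit is after mid */
    }
    return bad;
}

/* Example test for demonstration: every revision >= *ctx is "bad". */
int bad_from_threshold(size_t rev, void *ctx)
{
    return rev >= *(const size_t *)ctx;
}
```

With 100 candidate revisions this needs about seven test runs instead of
up to a hundred, which is why bisecting is practical when the crash
reproduces reliably.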


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-30 23:44     ` Nathan Scott
  2006-07-31  1:30       ` kernel
  2006-07-31  6:21       ` kernel
@ 2006-07-31  6:58       ` Jan Dittmer
  2 siblings, 0 replies; 17+ messages in thread
From: Jan Dittmer @ 2006-07-31  6:58 UTC (permalink / raw)
  To: Nathan Scott; +Cc: kernel, linux-kernel, xfs

Nathan Scott wrote:
> Hi there,
> 
> On Sat, Jul 29, 2006 at 09:49:23AM +0200, Jan Dittmer wrote:
> 
>>kernel wrote:
>>
>>>I have the same problem, but there does not seem to be a patch for it yet.
>>
>>No, I got zero feedback, but let's cc the correct
>>mailing list. I also filed bug 6877 at kernel.org
>>
> 
> 
> Is this easily reproducible for you?  I've not seen it before, and
> the only possibly related recent changes I can think of are these:
> 
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e63a3690013a475746ad2cea998ebb534d825704
> 
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d210a28cd851082cec9b282443f8cc0e6fc09830
> 
> Could you try reverting each of those to see if either is the cause?

No, the XFS partition in question is gone due to the infamous
endian bug in 2.6.17. I only saw the error once. Hardware is
4 sata drives on a sil-something controller with software
raid5. No lvm. But the error reads like a rename race of some
kind?

Jan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-31  6:55         ` Nathan Scott
@ 2006-07-31  7:28           ` kernel
  2006-07-31  8:41             ` Joe Jin
  2006-07-31  9:43             ` Nathan Scott
  0 siblings, 2 replies; 17+ messages in thread
From: kernel @ 2006-07-31  7:28 UTC (permalink / raw)
  To: Nathan Scott, jdi; +Cc: linux-kernel

I format the same partition and restart the test server before each
test run.
I've tested each format at least twenty times.
With XFS on the SAN, this crash happens on every bonnie++ run.

And I have tested this on another machine; the results are the same.



* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-31  7:28           ` kernel
@ 2006-07-31  8:41             ` Joe Jin
  2006-07-31  9:53               ` Nathan Scott
  2006-07-31  9:43             ` Nathan Scott
  1 sibling, 1 reply; 17+ messages in thread
From: Joe Jin @ 2006-07-31  8:41 UTC (permalink / raw)
  To: kernel; +Cc: Nathan Scott, jdi, linux-kernel, rdunlap, wim.coekaerts

[-- Attachment #1: Type: text/plain, Size: 1837 bytes --]

XFS has a debug option, but I wonder why it is not exposed. You can
enable it with the following patch; after rebuilding the kernel you may
get more information about the problem.
While tracing the code, I also found that some functions should check
a return value; the patch includes those checks as well.
I hope this helps.



On 7/31/06, kernel <linux@idccenter.cn> wrote:
> I format the same partition and restart the test server before each
> test run.
> I've tested each format at least twenty times.
> With XFS on the SAN, this crash happens on every bonnie++ run.
>
> And I have tested this on another machine; the results are the same.


-- 
Regards,
Joe.Jin

[-- Attachment #2: xfs.patch --]
[-- Type: text/x-patch, Size: 1473 bytes --]

--- linux-2.6.18-rc3/fs/xfs/Kconfig	2006-07-30 14:15:36.000000000 +0800
+++ linux.new/fs/xfs/Kconfig	2006-07-31 16:22:32.000000000 +0800
@@ -17,6 +17,15 @@
 	  system of your root partition is compiled as a module, you'll need
 	  to use an initial ramdisk (initrd) to boot.
 
+config XFS_DEBUG
+	bool "XFS debugging"
+	depends on XFS_FS
+	help
+	  If you are experiencing any problems with the XFS filesystem, say
+	  Y here.  This will result in additional debugging messages to be
+	  written to the system log.  Under normal circumstances, this
+	  results in very little overhead.
+
 config XFS_QUOTA
 	bool "XFS Quota support"
 	depends on XFS_FS
--- linux-2.6.18-rc3/fs/xfs/xfs_btree.c	2006-07-30 14:15:36.000000000 +0800
+++ linux.new/fs/xfs/xfs_btree.c	2006-07-31 15:39:43.000000000 +0800
@@ -586,6 +586,9 @@
 	 * Allocate a new cursor.
 	 */
 	cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP);
+	if(!cur)
+		return NULL;
+
 	/*
 	 * Deduce the number of btree levels from the arguments.
 	 */
--- linux-2.6.18-rc3/fs/xfs/xfs_alloc.c	2006-07-30 14:15:36.000000000 +0800
+++ linux.new/fs/xfs/xfs_alloc.c	2006-07-31 16:09:04.000000000 +0800
@@ -649,6 +649,9 @@
 	 */
 	bno_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp,
 		args->agno, XFS_BTNUM_BNO, NULL, 0);
+	if(!bno_cur)
+		return XFS_ERROR(ENOMEM);
+	
 	/*
 	 * Lookup bno and minlen in the btree (minlen is irrelevant, really).
 	 * Look for the closest free block <= bno, it must contain bno


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-31  7:28           ` kernel
  2006-07-31  8:41             ` Joe Jin
@ 2006-07-31  9:43             ` Nathan Scott
  2006-07-31 10:04               ` kernel
  1 sibling, 1 reply; 17+ messages in thread
From: Nathan Scott @ 2006-07-31  9:43 UTC (permalink / raw)
  To: kernel; +Cc: jdi, linux-kernel

On Mon, Jul 31, 2006 at 03:28:53PM +0800, kernel wrote:
> I format the same partition and restart the test server before each
> test run.
> I've tested each format at least twenty times.
> With XFS on the SAN, this crash happens on every bonnie++ run.

It's not clear to me - are you testing with the patches I mentioned
earlier reverted or not?

> And I have tested this on another machine; the results are the same.

Can you send me the xfs_info output from one of these filesystems,
and the exact bonnie++ command line used so I can reproduce it here?

thanks.

-- 
Nathan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-31  8:41             ` Joe Jin
@ 2006-07-31  9:53               ` Nathan Scott
  0 siblings, 0 replies; 17+ messages in thread
From: Nathan Scott @ 2006-07-31  9:53 UTC (permalink / raw)
  To: Joe Jin; +Cc: kernel, jdi, linux-kernel, rdunlap, wim.coekaerts

On Mon, Jul 31, 2006 at 04:41:28PM +0800, Joe Jin wrote:
> XFS has a debug option, but I wonder why it is not enabled.
> You can enable the option with the following patch and rebuild the
> kernel; you may then get more information about the problem.
> When I traced the code, I also found some functions that should check
> a return value; the patch includes those checks as well.

Hmmm.  If you want to enable debug, there's an "official" patch in
the XFS CVS tree for doing so ... this patch has one or two minor
issues:

+config XFS_DEBUG
+	bool "XFS debugging"
+	depends on XFS_FS
+	help
+	  If you are experiencing any problems with the XFS filesystem, say
+	  Y here.  This will result in additional debugging messages to be
+	  written to the system log.

That's not really what it does - it actually mostly enables asserts
all over the code, which cause the system to panic if something bad
is detected.
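
A minimal userspace sketch of the assert behaviour described above. The names `SKETCH_DEBUG`, `ASSERT_SKETCH`, `assfail_sketch` and `check_buffer` are hypothetical stand-ins, not the kernel's own; the stand-in counts failures where the kernel would stop the machine, and the macro compiles to nothing when the debug symbol is not defined.

```c
#include <assert.h>
#include <stdio.h>

/* Plays the role of the debug config option in this sketch (assumption). */
#define SKETCH_DEBUG 1

/* Counted here; a debug kernel would panic at this point instead. */
static int assert_failures;

/* Hypothetical stand-in for the kernel's assertion-failure handler. */
static void assfail_sketch(const char *expr, const char *file, int line)
{
	fprintf(stderr, "Assertion failed: %s, file: %s, line: %d\n",
		expr, file, line);
	assert_failures++;
}

#ifdef SKETCH_DEBUG
#define ASSERT_SKETCH(expr) \
	((expr) ? (void)0 : assfail_sketch(#expr, __FILE__, __LINE__))
#else
#define ASSERT_SKETCH(expr) ((void)0)	/* no check, no cost */
#endif

/* Returns the running failure count after checking one buffer pointer. */
static int check_buffer(const void *agbp)
{
	ASSERT_SKETCH(agbp != NULL);
	return assert_failures;
}
```

With the debug symbol defined, `check_buffer(NULL)` reports the failed expression together with the file and line; without it, the check disappears entirely, which is why a debug build changes where (and whether) a problem becomes visible.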

I don't expect the XFS debug code to help in this particular situation;
in fact it may well hide the problem.

+  Under normal circumstances, this
+	  results in very little overhead.

It's actually quite expensive.  This is one of the reasons the enabling
patch isn't in mainline, as people switch it on and wonder what happened
to their performance numbers (also it is more useful with kdb).

 	cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP);
+	if(!cur)
+		return NULL;

This can never happen due to the alloc flags argument.
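
A rough userspace model of why a sleeping allocation does not return NULL, assuming the allocator wrapper retries until it succeeds unless the caller said it cannot sleep. The flag names, `try_alloc` and `zone_zalloc_sketch` are hypothetical; the real wrapper lives in the XFS kmem code and waits for memory pressure to ease between attempts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define KM_SLEEP_SKETCH   0x1u	/* caller may block: keep retrying */
#define KM_NOSLEEP_SKETCH 0x2u	/* caller cannot block: NULL is possible */

/* Hypothetical backing allocator that fails a fixed number of times. */
static int failures_left;
static int attempts;

static void *try_alloc(size_t size)
{
	attempts++;
	if (failures_left > 0) {
		failures_left--;
		return NULL;
	}
	return calloc(1, size);	/* zeroed, like a zalloc variant */
}

/* Retry loop: only a non-sleeping caller can ever see NULL. */
static void *zone_zalloc_sketch(size_t size, unsigned int flags)
{
	for (;;) {
		void *ptr = try_alloc(size);

		if (ptr || (flags & KM_NOSLEEP_SKETCH))
			return ptr;
		/* A kernel would wait here before the next attempt. */
	}
}
```

Under this model a NULL check after a sleeping allocation is dead code: the call either returns memory or does not return.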

--- linux-2.6.18-rc3/fs/xfs/xfs_alloc.c	2006-07-30 14:15:36.000000000 +0800
+++ linux.new/fs/xfs/xfs_alloc.c	2006-07-31 16:09:04.000000000 +0800
@@ -649,6 +649,9 @@
 	 */
 	bno_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp,
 		args->agno, XFS_BTNUM_BNO, NULL, 0);
+	if(!bno_cur)
+		return XFS_ERROR(ENOMEM);

This also should never happen.  From my reading of the oops report,
the problem is inside xfs_btree_init_cursor, I think from one of the
buffer pointers being null when it shouldn't be, but I'd like to get
it reproduced locally to be sure.
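
A small sketch of how a null buffer pointer turns into a fault at a small nonzero address: reading a member through a NULL struct pointer touches address 0 plus the member's offset. The struct below is hypothetical and is NOT the real XFS buffer layout; the 0x60 padding is picked only to echo the fault address in the original oops and has not been checked against the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
struct buf_sketch {
	char  header[0x60];	/* padding so that b_addr lands at 0x60 */
	void *b_addr;
};

/* Address touched when reading a member through a given base pointer. */
static uintptr_t fault_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}
```

With a NULL base, `fault_address(0, offsetof(struct buf_sketch, b_addr))` gives the member offset itself, so the faulting address in an oops can hint at which field of which structure was being read.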

cheers.

-- 
Nathan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-07-31  9:43             ` Nathan Scott
@ 2006-07-31 10:04               ` kernel
       [not found]                 ` <215036450607311849o43b1555br13ea2f3f20fb3b82@mail.gmail.com>
  0 siblings, 1 reply; 17+ messages in thread
From: kernel @ 2006-07-31 10:04 UTC (permalink / raw)
  To: Nathan Scott, jdi, lkmaillist; +Cc: linux-kernel

I've reverted one of the patches; I will test with the other one today.

Command line: bonnie++ -u user     (user is an ordinary user, not
root; the bonnie++ version is 1.03a)

[root@test hv]# xfs_info /dev/sdc1
meta-data=/dev/sdc1              isize=256    agcount=32, agsize=13926368 blks
         =                       sectsz=4096  attr=0
data     =                       bsize=4096   blocks=445643072, imaxpct=25
         =                       sunit=32     swidth=448 blks, unwritten=1
naming   =version 2              bsize=4096 
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=4096  sunit=1 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

I've formatted the partition with several different mkfs.xfs options:
mkfs.xfs /dev/sdc1
mkfs.xfs -l version=2,size=128m(64m,32m) /dev/sdc1
mkfs.xfs -l version=2,size=128m(64m,32m) -d su=128k,sw=14 -s size=4k /dev/sdc1
mkfs.xfs -s size=4k /dev/sdc1
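
The xfs_info output above (sunit=32, swidth=448 blks, bsize=4096) is consistent with the `-d su=128k,sw=14` variant. A short worked check, using a hypothetical helper `stripe_in_blocks` that converts a stripe unit in bytes and a stripe width in units into filesystem blocks:

```c
#include <assert.h>
#include <stdint.h>

struct stripe_blocks {
	uint32_t sunit;		/* stripe unit, in filesystem blocks */
	uint32_t swidth;	/* stripe width, in filesystem blocks */
};

/* su_bytes: stripe unit in bytes; sw_units: data disks per stripe. */
static struct stripe_blocks stripe_in_blocks(uint32_t su_bytes,
					     uint32_t sw_units,
					     uint32_t block_bytes)
{
	struct stripe_blocks g;

	/* The stripe unit must be a whole number of blocks. */
	assert(block_bytes != 0 && su_bytes % block_bytes == 0);
	g.sunit = su_bytes / block_bytes;
	g.swidth = g.sunit * sw_units;
	return g;
}
```

For su=128k, sw=14 and 4096-byte blocks this gives 131072 / 4096 = 32 blocks per stripe unit and 32 * 14 = 448 blocks per stripe.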

With the DEBUG patch, I got this:

Assertion failed: args.agbp != ((void *)0), file: fs/xfs/xfs_alloc.c, line: 2467
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/xfs/support/debug.c:83
invalid opcode: 0000 [1] SMP
CPU 3
Modules linked in: qla2xxx scsi_transport_fc
Pid: 4931, comm: bonnie++ Not tainted 2.6.18-rc3 #1
RIP: 0010:[<ffffffff8032cc1c>]  [<ffffffff8032cc1c>] assfail+0x1a/0x29
RSP: 0018:ffff81007e99fd38  EFLAGS: 00010296
RAX: 0000000000000054 RBX: 0000000000000000 RCX: ffffffff804df348
RDX: ffffffff804df348 RSI: 0000000000040096 RDI: ffffffff804df340
RBP: ffff81007c413568 R08: ffff81007ef9f000 R09: 0000000000000080
R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000004
R13: 0000000000000000 R14: ffff81007e99fe38 R15: ffff810068340e70
FS:  00002b271d6359e0(0000) GS:ffff810002f68940(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000050f298 CR3: 000000007e877000 CR4: 00000000000006e0
Process bonnie++ (pid: 4931, threadinfo ffff81007e99e000, task ffff81007c6f0000)
Stack:  ffff810068340e70 ffffffff802c0e49 ffff81007c413568 ffff81007cc77800
 0000000000000000 ffff81007e92d028 0000000000000000 0000002400000001
 0000000000000000 0000000000000001 ffff810000000000 0000000000000001
Call Trace:
 [<ffffffff802c0e49>] xfs_free_extent+0x105/0x173
 [<ffffffff80315156>] xfs_trans_get_efd+0x7e/0x86
 [<ffffffff802d3599>] xfs_bmap_finish+0x119/0x1b4
 [<ffffffff8031bcad>] xfs_inactive+0x510/0x549
 [<ffffffff8032b382>] xfs_fs_clear_inode+0xa7/0x118
 [<ffffffff8028342a>] clear_inode+0x97/0xca
 [<ffffffff802844d1>] generic_delete_inode+0x8e/0xf6
 [<ffffffff8027bc4f>] do_unlinkat+0xde/0x131
 [<ffffffff8027de50>] sys_getdents64+0xaa/0xba
 [<ffffffff8027d2a5>] sys_fcntl+0x2fd/0x309
 [<ffffffff802099c6>] system_call+0x7e/0x83


Code: 0f 0b 68 91 0e 47 80 c2 53 00 48 83 c4 08 c3 48 8b 35 26 5e
RIP  [<ffffffff8032cc1c>] assfail+0x1a/0x29
 RSP <ffff81007e99fd38>



Nathan Scott wrote:
> On Mon, Jul 31, 2006 at 03:28:53PM +0800, kernel wrote:
>> I reformat the same partition and restart the test server before each
>> test run.
>> I've tested each format at least twenty times.
>> With XFS on the SAN, this crash happens on every bonnie++ run.
>
> It's not clear to me - are you testing with the patches I mentioned
> earlier reverted or not?
>
>> And I have tested this on another machine; the results are the same.
>
> Can you send me the xfs_info output from one of these filesystems,
> and the exact bonnie++ command line used so I can reproduce it here?
>
> thanks.


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
       [not found]                 ` <215036450607311849o43b1555br13ea2f3f20fb3b82@mail.gmail.com>
@ 2006-08-01  8:12                   ` Jan Dittmer
  2006-08-02  1:26                     ` Nathan Scott
  2006-08-08  3:30                   ` Nathan Scott
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Dittmer @ 2006-08-01  8:12 UTC (permalink / raw)
  To: Joe Jin; +Cc: kernel, Nathan Scott, linux-kernel

Joe Jin schrieb:
>  From the information, I think it is caused by (args.agbp == NULL).
> Setting the assertion aside, the call path that should panic is:
> xfs_free_extent
> |_   xfs_free_ag_extent  => here args.agbp = NULL;
>         |_ xfs_btree_init_cursor()
>               |_ agf = XFS_BUF_TO_AGF(agbp);  => (xfs_agf_t *)XFS_BUF_PTR(agbp)
>                              |_ (xfs_caddr_t)((agbp)->b_addr) : but here, agbp is NULL
> so it caused the oops.
> Without the debug option, the oops occurs in xfs_btree_init_cursor().
> 

Probably caused by this part of the diff from Nathan's earlier mail:

--- 8558226281c45a61d7a0bc056505246e705a372b
+++ 22af489d3f346c7bb4488cdcf1ee91e59e48ddf3
--- fs/xfs/xfs_alloc.c
+++ fs/xfs/xfs_alloc.c

@@ -1951,8 +1951,14 @@ xfs_alloc_fix_freelist(
  		 * the restrictions correctly.  Can happen for free calls
  		 * on a completely full ag.
  		 */
-		if (targs.agbno == NULLAGBLOCK)
+		if (targs.agbno == NULLAGBLOCK) {
+			if (!(flags & XFS_ALLOC_FLAG_FREEING)) {
+				xfs_trans_brelse(tp, agflbp);
+				args->agbp = NULL;
+				return 0;
+			}
  			break;
+		}
  		/*
  		 * Put each allocated block on the list.
  		 */

Jan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
  2006-08-01  8:12                   ` Jan Dittmer
@ 2006-08-02  1:26                     ` Nathan Scott
  0 siblings, 0 replies; 17+ messages in thread
From: Nathan Scott @ 2006-08-02  1:26 UTC (permalink / raw)
  To: Jan Dittmer, Joe Jin, kernel; +Cc: linux-kernel, xfs

On Tue, Aug 01, 2006 at 10:12:14AM +0200, Jan Dittmer wrote:
> Joe Jin schrieb:
> >  From the information, I think it is caused by (args.agbp == NULL).
> > Setting the assertion aside, the call path that should panic is:
> > xfs_free_extent
> > |_   xfs_free_ag_extent  => here args.agbp = NULL;
> >         |_ xfs_btree_init_cursor()
> >               |_ agf = XFS_BUF_TO_AGF(agbp);  => (xfs_agf_t *)XFS_BUF_PTR(agbp)
> >                              |_ (xfs_caddr_t)((agbp)->b_addr) : but here, agbp is NULL
> > so it caused the oops.
> > Without the debug option, the oops occurs in xfs_btree_init_cursor().
> > 
> 
> Probably caused by this part of the diff from Nathan's earlier mail:

*nod* - that is my suspicion; it would be great if you guys with the
reproducible case could confirm or deny.  (Assuming this is the
case we're hitting, you can also try changing the "args->agbp = NULL"
assignment there to assign "agbp" instead, and see if that corrects
things for you once more.)

> --- fs/xfs/xfs_alloc.c
> +++ fs/xfs/xfs_alloc.c
> 
> @@ -1951,8 +1951,14 @@ xfs_alloc_fix_freelist(
>   		 * the restrictions correctly.  Can happen for free calls
>   		 * on a completely full ag.
>   		 */
> -		if (targs.agbno == NULLAGBLOCK)
> +		if (targs.agbno == NULLAGBLOCK) {
> +			if (!(flags & XFS_ALLOC_FLAG_FREEING)) {
> +				xfs_trans_brelse(tp, agflbp);
> +				args->agbp = NULL;
> +				return 0;
> +			}
>   			break;
> +		}

cheers.

-- 
Nathan


* Re: XFS Bug null pointer dereference in xfs_free_ag_extent
       [not found]                 ` <215036450607311849o43b1555br13ea2f3f20fb3b82@mail.gmail.com>
  2006-08-01  8:12                   ` Jan Dittmer
@ 2006-08-08  3:30                   ` Nathan Scott
  1 sibling, 0 replies; 17+ messages in thread
From: Nathan Scott @ 2006-08-08  3:30 UTC (permalink / raw)
  To: Joe Jin, Tony.Ho, jdi, Chris Seufert; +Cc: xfs, linux-kernel

On Tue, Aug 01, 2006 at 09:49:12AM +0800, Joe Jin wrote:
> From the information, I think it is caused by (args.agbp == NULL).
> Setting the assertion aside, the call path that should panic is:
> xfs_free_extent
> |_   xfs_free_ag_extent  => here args.agbp = NULL;
>         |_ xfs_btree_init_cursor()
>               |_ agf = XFS_BUF_TO_AGF(agbp);  => (xfs_agf_t *)XFS_BUF_PTR(agbp)
>                              |_ (xfs_caddr_t)((agbp)->b_addr) : but here, agbp is NULL
> so it caused the oops.

You've all reported this same issue - could any/all of you
try the patch here...
http://oss.sgi.com/archives/xfs/2006-08/msg00054.html

Let me know if that fixes it.  In particular, if you were able
to easily reproduce this before, I'd like to hear whether this
resolves things, as I've still not hit the bug myself.

cheers.

-- 
Nathan


end of thread, other threads:[~2006-08-08  3:30 UTC | newest]

Thread overview: 17+ messages
2006-07-20  6:59 XFS Bug null pointer dereference in xfs_free_ag_extent Jan Dittmer
2006-07-29  7:19 ` kernel
2006-07-29  7:49   ` Jan Dittmer
2006-07-29  8:10     ` kernel
2006-07-30 23:44     ` Nathan Scott
2006-07-31  1:30       ` kernel
2006-07-31  6:21       ` kernel
2006-07-31  6:55         ` Nathan Scott
2006-07-31  7:28           ` kernel
2006-07-31  8:41             ` Joe Jin
2006-07-31  9:53               ` Nathan Scott
2006-07-31  9:43             ` Nathan Scott
2006-07-31 10:04               ` kernel
     [not found]                 ` <215036450607311849o43b1555br13ea2f3f20fb3b82@mail.gmail.com>
2006-08-01  8:12                   ` Jan Dittmer
2006-08-02  1:26                     ` Nathan Scott
2006-08-08  3:30                   ` Nathan Scott
2006-07-31  6:58       ` Jan Dittmer
