public inbox for linux-kernel@vger.kernel.org
* kernel BUG at fs/ext3/balloc.c:942!
@ 2004-05-05 21:31 Zwane Mwaikambo
  2004-05-05 22:24 ` Rob Shakir
  2004-05-07 17:23 ` Mingming Cao
  0 siblings, 2 replies; 6+ messages in thread
From: Zwane Mwaikambo @ 2004-05-05 21:31 UTC
  To: Linux Kernel; +Cc: Andrew Morton

Shout at me if you need something else; right now I need to reboot my
workstation.

------------[ cut here ]------------
kernel BUG at fs/ext3/balloc.c:942!
invalid operand: 0000 [#1]
SMP
CPU:    0
EIP:    0060:[<c01a4b75>]    Not tainted VLI
EFLAGS: 00010287   (2.6.6-rc3-mm2)
EIP is at ext3_try_to_allocate_with_rsv+0x102/0x1e7
eax: 00010000   ebx: 00002246   ecx: 00018000   edx: 00012244
esi: f72eca00   edi: 00000000   ebp: f71c1cdc   esp: f73b19b0
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 755, threadinfo=f73b0000 task=f73ad0e0)
Stack: f72eca00 00000000 c014bf30 00000000 00010000 f7259900 00000002 c253e4ec
       00000000 00000002 f72eca00 00002246 f7257040 c01a4e23 e0557e40 00002246
       f71c1cdc f73b1a38 00000000 f71c1c80 c01aa235 c32c1600 c01b22b7 00000002
Call Trace:
 [<c014bf30>] __bread+0x5/0x1b
 [<c01a4e23>] ext3_new_block+0x1c9/0x57a
 [<c01aa235>] ext3_do_update_inode+0x162/0x357
 [<c01b22b7>] journal_get_write_access+0x33/0x45
 [<c01a76d2>] ext3_alloc_branch+0x48/0x259
 [<c01a7bc4>] ext3_get_block_handle+0x137/0x2fc
 [<c03c4e97>] ipv4_sabotage_out+0xd8/0xf2
 [<c039184c>] ip_finish_output2+0x0/0x18f
 [<c01b1565>] start_this_handle+0xd2/0x34a
 [<c01a7ddd>] ext3_get_block+0x54/0x9e
 [<c014c67b>] __block_prepare_write+0x1bf/0x3a0
 [<c014cfb0>] block_prepare_write+0x28/0x3d
 [<c01a7d89>] ext3_get_block+0x0/0x9e
 [<c01a829f>] ext3_prepare_write+0x51/0x110
 [<c01a7d89>] ext3_get_block+0x0/0x9e
 [<c01301f9>] generic_file_aio_write_nolock+0x39a/0xa9b
 [<c039f0e2>] tcp_transmit_skb+0x3f3/0x5fd
 [<c02ad418>] copy_to_user+0x32/0x45
 [<c0377ffb>] memcpy_toiovec+0x27/0x49
 [<c0130959>] generic_file_write_nolock+0x5f/0x7d
 [<c03b3cf9>] inet_recvmsg+0x4a/0x69
 [<c0130b14>] generic_file_writev+0x3a/0x5a
 [<c0130ada>] generic_file_writev+0x0/0x5a
 [<c0149ec5>] do_readv_writev+0x1db/0x21f
 [<c0149966>] do_sync_write+0x0/0xa4
 [<c0149fa1>] vfs_writev+0x49/0x52
 [<c01f07b0>] nfsd_write+0xf2/0x305
 [<c03cc8a7>] svc_sock_enqueue+0x143/0x2b0
 [<c01f6ef8>] nfsd3_proc_write+0xb8/0x11b
 [<c01f8951>] nfs3svc_decode_writeargs+0x0/0x159
 [<c01ecf0b>] nfsd_dispatch+0xb9/0x190
 [<c03cee15>] svc_authenticate+0x4d/0x7c
 [<c03cc5f2>] svc_process+0x46b/0x5c2
 [<c01148cc>] default_wake_function+0x0/0xc
 [<c01ecc8d>] nfsd+0x1c1/0x386
 [<c0103952>] ret_from_fork+0x6/0x14
 [<c01ecacc>] nfsd+0x0/0x386
 [<c0101f6d>] kernel_thread_helper+0x5/0xb

Code: e8 e8 2f f2 ff ff 85 c0 b8 ff ff ff ff 0f 44 d8 8b 86 6c 01 00 00 8b
4c 24 10 03 48 10 39 4d 08 73 09 8b 44 24 10 39 45 0c 73 08 <0f> 0b ae 03
f5 4f 41 c0 8b 54 24 38 8b 4c 24 18 89 f0 89 14 24
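
The bytes at the faulting EIP (marked <0f> in the Code: line) can be checked against the reported file and line. The sketch below assumes the i386 BUG() layout of this kernel generation, in which the ud2 opcode (0f 0b) is followed by a 16-bit line number and a 32-bit pointer to the file name string. The struct and function names are hypothetical helpers for illustration, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not a kernel structure. */
struct bug_record {
    uint16_t line;      /* __LINE__ of the BUG() call */
    uint32_t file_ptr;  /* kernel address of the __FILE__ string */
};

/*
 * Decode the bytes at the faulting EIP, assuming the layout
 * "ud2; .word line; .long file" (little-endian).
 * Returns 0 on success, -1 if the buffer is too short or does not
 * start with the ud2 opcode (0f 0b).
 */
static int decode_bug_record(const uint8_t *code, size_t len,
                             struct bug_record *out)
{
    assert(code != NULL && out != NULL);

    if (len < 8 || code[0] != 0x0f || code[1] != 0x0b)
        return -1;

    out->line = (uint16_t)(code[2] | (code[3] << 8));
    out->file_ptr = (uint32_t)code[4] |
                    ((uint32_t)code[5] << 8) |
                    ((uint32_t)code[6] << 16) |
                    ((uint32_t)code[7] << 24);
    return 0;
}

/* Bytes copied from the Code: line above, starting at the marked <0f>. */
static const uint8_t ext3_oops_code[] = {
    0x0f, 0x0b, 0xae, 0x03, 0xf5, 0x4f, 0x41, 0xc0
};
```

Passing ext3_oops_code to decode_bug_record() gives line 0x03ae, which is 942 in decimal and agrees with "kernel BUG at fs/ext3/balloc.c:942!".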



* Re: kernel BUG at fs/ext3/balloc.c:942!
  2004-05-05 21:31 Zwane Mwaikambo
@ 2004-05-05 22:24 ` Rob Shakir
  2004-05-06 12:32   ` Chris Mason
  2004-05-07 17:23 ` Mingming Cao
  1 sibling, 1 reply; 6+ messages in thread
From: Rob Shakir @ 2004-05-05 22:24 UTC
  To: linux-kernel

I've seen something vaguely similar to this problem, but rather than running 
ext3 I'm running ReiserFS. 

This problem has occurred twice, but I've just got the machine back running to
report the bug properly.


kernel BUG at fs/reiserfs/prints.c:339!
invalid operand: 0000 [#1]
PREEMPT SMP DEBUG_PAGEALLOC
CPU:    1
EIP:    0060:[<c01c9f88>]    Not tainted
EFLAGS: 00010286   (2.6.5)
EIP is at reiserfs_panic+0x38/0x70
eax: 0000003c   ebx: f518edf8   ecx: f556e000   edx: c045c5a0
esi: 00000001   edi: 00000003   ebp: f3923a30   esp: f3923a20
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 122, threadinfo=f3922000 task=f394ba30)
Stack: c03f7212 c052abe0 c03fbfd0 f396b7c0 f3923b34 c01d44fd f518edf8 c040dc00
       00000000 00001000 eb1d8e88 eb1d8e88 00000001 f3923a68 00011f70 00000000
       c0460040 d4294f58 d3356638 00000062 00000063 00000062 f3923b8c 00000010
Call Trace:
 [<c01d44fd>] search_by_key+0x211d/0x2120
 [<c01bad09>] reiserfs_read_locked_inode+0x69/0x100
 [<c01bae6d>] reiserfs_iget+0x9d/0xb0
 [<c01bada0>] reiserfs_find_actor+0x0/0x30
 [<c01bac80>] reiserfs_init_locked_inode+0x0/0x20
 [<c01baeac>] reiserfs_get_dentry+0x2c/0xa0
 [<c0226cab>] find_exported_dentry+0x3b/0xb20
 [<c034f857>] dev_queue_xmit+0x3b7/0x490
 [<c035d6e0>] pfifo_fast_enqueue+0x0/0x90
 [<c035850c>] neigh_resolve_output+0x13c/0x260
 [<c036c1c4>] ip_finish_output2+0xe4/0x1e6
 [<c035c42d>] nf_hook_slow+0xed/0x140
 [<c036c0e0>] ip_finish_output2+0x0/0x1e6
 [<c0369afb>] ip_finish_output+0x23b/0x240
 [<c036c0e0>] ip_finish_output2+0x0/0x1e6
 [<c036c0c5>] dst_output+0x15/0x30
 [<c035c42d>] nf_hook_slow+0xed/0x140
 [<c036c0b0>] dst_output+0x0/0x30
 [<c036bb6b>] ip_push_pending_frames+0x3db/0x440
 [<c036c0b0>] dst_output+0x0/0x30
 [<c011cab4>] __change_page_attr+0x24/0x1e0
 [<c013a47b>] kernel_text_address+0x3b/0x50
 [<c011cef1>] kernel_map_pages+0x31/0x6c
 [<c01bafd3>] reiserfs_decode_fh+0xb3/0xe0
 [<c0229ee0>] nfsd_acceptable+0x0/0x1a0
 [<c022a274>] fh_verify+0x1f4/0x5c0
 [<c0229ee0>] nfsd_acceptable+0x0/0x1a0
 [<c03c592a>] svc_udp_recvfrom+0xda/0x290
 [<c0228eb3>] nfsd_proc_getattr+0x73/0xa0
 [<c02283f7>] nfsd_dispatch+0xd7/0x1e0
 [<c0228320>] nfsd_dispatch+0x0/0x1e0
 [<c03c4abb>] svc_process+0x4bb/0x61d
 [<c011f660>] default_wake_function+0x0/0x20
 [<c0228096>] nfsd+0x276/0x500
 [<c0227e20>] nfsd+0x0/0x500
 [<c01052e5>] kernel_thread_helper+0x5/0x10

Code: 0f 0b 53 01 40 be 3f c0 c7 04 24 c0 83 40 c0 b8 e1 b9 3f c0

I'm not sure if this is directly related to the behaviour that Zwane's reported
above, but the similarities seemed to be enough to warrant this being posted as
a reply to his post.

Thanks,
Rob


* Re: kernel BUG at fs/ext3/balloc.c:942!
  2004-05-05 22:24 ` Rob Shakir
@ 2004-05-06 12:32   ` Chris Mason
  2004-05-06 17:00     ` Rob Shakir
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Mason @ 2004-05-06 12:32 UTC
  To: Rob Shakir; +Cc: linux-kernel

On Wed, 2004-05-05 at 18:24, Rob Shakir wrote:
> I've seen something vaguely similar to this problem, but rather than running 
> ext3 I'm running ReiserFS. 
> 
> This problem has occurred twice, but I've just got the machine back running to
> report the bug properly.
> 
There was an additional error message from reiserfs before the bug,
could you please look for it in your logs?

Looks like you're on 2.6.5, the reiserfs logging fixes in 2.6.6-rc2 and
higher fix a few different ways you can oops in search_by_key.

-chris




* Re: kernel BUG at fs/ext3/balloc.c:942!
  2004-05-06 12:32   ` Chris Mason
@ 2004-05-06 17:00     ` Rob Shakir
  0 siblings, 0 replies; 6+ messages in thread
From: Rob Shakir @ 2004-05-06 17:00 UTC
  To: Chris Mason; +Cc: linux-kernel

On Thu, May 06, 2004 at 08:32:04AM -0400, Chris Mason wrote:
> There was an additional error message from reiserfs before the bug,
> could you please look for it in your logs?

The messages preceding the bug in the logs were:

5140
0: file_release
1: reiserfs_vfs_truncate_file
PAP-5140: search_by_key: schedule occurred in do_balance!------------[ cut here ]------------

I apologise for not adding those in the initial bug report. 

> 
> Looks like you're on 2.6.5, the reiserfs logging fixes in 2.6.6-rc2 and
> higher fix a few different ways you can oops in search_by_key.
> 

Thanks, I'll try that and see if it fixes the problem.

Rob


* Re: kernel BUG at fs/ext3/balloc.c:942!
  2004-05-05 21:31 Zwane Mwaikambo
  2004-05-05 22:24 ` Rob Shakir
@ 2004-05-07 17:23 ` Mingming Cao
  1 sibling, 0 replies; 6+ messages in thread
From: Mingming Cao @ 2004-05-07 17:23 UTC
  To: Zwane Mwaikambo; +Cc: Linux Kernel, Andrew Morton

On Wed, 2004-05-05 at 14:31, Zwane Mwaikambo wrote:
> Shout at me if you need something else; right now I need to reboot my
> workstation.

Thanks for working this problem with us.  

The problem is that the reservation window is being discarded too early
in ext3_release_file(). ext3_release_file() calls
ext3_discard_reservation() on the last close of a filp. When two
processes have the same file open for writing, one process can discard
the reservation window when it closes its own filp while the other
process is in the middle of using that reservation window for block
allocation on that inode.

We should really discard the reservation window when the last writer on
the inode goes away, instead of the last writer on the filp.

Here is the patch to fix this issue.

This should fix a reservation window race between multiple processes,
where one process closes the file while another one is in the middle of
block allocation using the inode's reservation window. The reservation
window should be discarded by the last writer on the inode, not the
last writer on the filp.


---

 src-ming/fs/ext3/file.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN fs/ext3/file.c~ext3_reservation_discard_race_fix fs/ext3/file.c
--- src/fs/ext3/file.c~ext3_reservation_discard_race_fix	2004-05-06 22:46:44.338842592 -0700
+++ src-ming/fs/ext3/file.c	2004-05-07 00:48:58.059947520 -0700
@@ -33,7 +33,8 @@
  */
 static int ext3_release_file (struct inode * inode, struct file * filp)
 {
-	if (filp->f_mode & FMODE_WRITE)
+	/* if we are the last writer on the inode, drop the block reservation */
+	if (filp->f_mode & FMODE_WRITE && (atomic_read(&inode->i_writecount) == 1))
 		ext3_discard_reservation(inode);
 	if (is_dx(inode) && filp->private_data)
 		ext3_htree_free_dir_info(filp->private_data);

_
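
The effect of the i_writecount check in the patch above can be illustrated with a small user-space model. This is only a sequential sketch under stated assumptions: the toy_* names are hypothetical stand-ins rather than kernel APIs, the two writers are simulated in one thread, and the writer count is assumed to still include the closing file when the release hook runs, as the "== 1" test in the patch implies. It does not claim to cover every race in the reservation code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the inode state; field names are illustrative only. */
struct toy_inode {
    int  writecount;       /* models inode->i_writecount */
    bool has_reservation;  /* models the inode's reservation window */
};

/* A process opens the file for writing. */
static void toy_open_write(struct toy_inode *inode)
{
    inode->writecount++;
}

/* Block allocation sets up (or reuses) the inode's reservation window. */
static void toy_alloc(struct toy_inode *inode)
{
    inode->has_reservation = true;
}

/* Pre-patch behaviour: any writable filp release drops the reservation. */
static void toy_release_old(struct toy_inode *inode)
{
    inode->has_reservation = false;
    inode->writecount--;   /* the count drops after the release hook */
}

/* Patched behaviour: only the last writer on the inode drops it. */
static void toy_release_patched(struct toy_inode *inode)
{
    if (inode->writecount == 1)
        inode->has_reservation = false;
    inode->writecount--;
}
```

With two writers on the same toy_inode, toy_release_old() clears the reservation as soon as the first writer closes, even though the second writer may still be allocating from it. toy_release_patched() keeps the reservation until the second writer closes as well.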



* kernel BUG at fs/ext3/balloc.c:942!
@ 2004-08-03 23:15 Philip Molter
  0 siblings, 0 replies; 6+ messages in thread
From: Philip Molter @ 2004-08-03 23:15 UTC
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 444 bytes --]

This is a FC2 stock errata kernel 2.6.6-1.435.2.3, roughly equivalent to 
2.6.7-rc2.  This kernel does contain the bugfix in fs/ext3/file.c 
(http://marc.theaimsgroup.com/?l=linux-kernel&m=108395053322261&w=2). 
This only occurs under heavy load, but I can't pinpoint where or how. 
It must be a race because we cannot reproduce the problem reliably.

Log output attached

If anyone has any insight, please let me know,

Philip Molter
Giganews

[-- Attachment #2: balloc-oops.txt --]
[-- Type: text/plain, Size: 3222 bytes --]

Aug  3 17:01:54 h01 kernel: kernel BUG at fs/ext3/balloc.c:942!
Aug  3 17:01:54 h01 kernel: invalid operand: 0000 [#1]
Aug  3 17:01:54 h01 kernel: SMP
Aug  3 17:01:54 h01 kernel: Modules linked in: e1000 bonding ipv6 mii 8021q floppy sg microcode dm_mod uhci_hcd ext3 jbd raid5 xor raid1 3w_xxxx sd_mod scsi_mod
Aug  3 17:01:54 h01 kernel: CPU:    1
Aug  3 17:01:54 h01 kernel: EIP:    0060:[<82897d5b>]    Not tainted
Aug  3 17:01:54 h01 kernel: EFLAGS: 00010287   (2.6.6-1.435.2.3smp)
Aug  3 17:01:54 h01 kernel: EIP is at ext3_try_to_allocate_with_rsv+0x119/0x1bc [ext3]
Aug  3 17:01:54 h01 kernel: eax: 00520000   ebx: 00518000   ecx: 00518000   edx: 6ec144dc
Aug  3 17:01:54 h01 kernel: esi: 00000000   edi: ffffffff   ebp: 81f42200   esp: 0415ebf8
Aug  3 17:01:54 h01 kernel: ds: 007b   es: 007b   ss: 0068
Aug  3 17:01:54 h01 kernel: Process pdflush (pid: 68, threadinfo=0415e000 task=81d5cc70)
Aug  3 17:01:54 h01 kernel: Stack: 7b24c100 000000a3 3cdc72cc 00000000 000000a3 000000a3 00002000 81f42200
Aug  3 17:01:54 h01 kernel:        82898049 6f98a14c 00002000 6ec144dc 0415ec58 00000220 6ec144dc 7b248000
Aug  3 17:01:54 h01 kernel:        7b284400 7b26c460 00000000 00000001 6f98a14c 0051a000 6ec14534 3cdc72cc
Aug  3 17:01:54 h01 kernel: Call Trace:
Aug  3 17:01:54 h01 kernel:  [<82898049>] ext3_new_block+0x1ba/0x4aa [ext3]
Aug  3 17:01:54 h01 kernel:  [<8289a1ab>] ext3_alloc_block+0x9/0xb [ext3]
Aug  3 17:01:54 h01 kernel:  [<8289a4e1>] ext3_alloc_branch+0x4a/0x232 [ext3]
Aug  3 17:01:54 h01 kernel:  [<021fdd17>] as_add_request+0xbd/0x179
Aug  3 17:01:54 h01 kernel:  [<8289a9fe>] ext3_get_block_handle+0x1c7/0x28e [ext3]
Aug  3 17:01:54 h01 kernel:  [<8289ab29>] ext3_get_block+0x64/0x6c [ext3]
Aug  3 17:01:54 h01 kernel:  [<021544c0>] __block_write_full_page+0x107/0x2d0
Aug  3 17:01:54 h01 kernel:  [<8289aac5>] ext3_get_block+0x0/0x6c [ext3]
Aug  3 17:01:54 h01 kernel:  [<0215585f>] block_write_full_page+0xc7/0xd0
Aug  3 17:01:54 h01 kernel:  [<8289aac5>] ext3_get_block+0x0/0x6c [ext3]
Aug  3 17:01:54 h01 kernel:  [<8289b3b3>] ext3_ordered_writepage+0xca/0x136 [ext3]
Aug  3 17:01:54 h01 kernel:  [<8289b2c9>] bget_one+0x0/0x7 [ext3]
Aug  3 17:01:54 h01 kernel:  [<0216dda0>] mpage_writepages+0x143/0x273
Aug  3 17:01:54 h01 kernel:  [<8289b2e9>] ext3_ordered_writepage+0x0/0x136 [ext3]
Aug  3 17:01:54 h01 kernel:  [<0216c747>] __sync_single_inode+0x5d/0x19a
Aug  3 17:01:54 h01 kernel:  [<0216ca9c>] sync_sb_inodes+0x18e/0x242
Aug  3 17:01:54 h01 kernel:  [<0213ca16>] pdflush+0x0/0x1e
Aug  3 17:01:54 h01 kernel:  [<0216cbc9>] writeback_inodes+0x79/0xcd
Aug  3 17:01:54 h01 kernel:  [<0213c00a>] background_writeout+0x73/0xb5
Aug  3 17:01:54 h01 kernel:  [<0213c97a>] __pdflush+0xf6/0x192
Aug  3 17:01:54 h01 kernel:  [<0213ca30>] pdflush+0x1a/0x1e
Aug  3 17:01:54 h01 kernel:  [<0213bf97>] background_writeout+0x0/0xb5
Aug  3 17:01:54 h01 kernel:  [<0213ca16>] pdflush+0x0/0x1e
Aug  3 17:01:54 h01 kernel:  [<0212ff79>] kthread+0x73/0x9b
Aug  3 17:01:54 h01 kernel:  [<0212ff06>] kthread+0x0/0x9b
Aug  3 17:01:54 h01 kernel:  [<021051f1>] kernel_thread_helper+0x5/0xb
Aug  3 17:01:54 h01 kernel:
Aug  3 17:01:54 h01 kernel: Code: 0f 0b ae 03 31 89 8a 82 ff 74 24 2c 89 e8 57 ff 74 24 2c 8b

