oops in close when exiting fsx

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* oops in close when exiting fsx
@ 2006-08-12 19:40 Steve French
  2006-08-13  1:09 ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Steve French @ 2006-08-12 19:40 UTC (permalink / raw)
  To: linux-kernel

ctl-c exiting fsx after a few hours with 2.6.18-rc4 got the following 
oops - anyone recognize it?
Although I didn't see cifs symbols on the call stack it is running on a 
cifs mount, but it is not
one I have seen before.

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0000000 0
 printing eip:
c133064f
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: cifs nfsd exportfs lockd nfs_acl ipv6 sunrpc 
snd_pcm_oss snd_ mixer_oss snd_seq snd_seq_device af_packet edd ibm_acpi 
snd_intel8x0 snd_ac97_co dec snd_ac97_bus snd_pcm snd_timer snd 
soundcore snd_page_alloc
CPU:    0
EIP:    0060:[<c133064f>]    Not tainted VLI
EFLAGS: 00210002   (2.6.18-rc4-default #1)
EIP is at __down+0x56/0xc5
eax: f7708b6c   ebx: f7708b64   ecx: 00000000   edx: f5f77f04
esi: f5f5e030   edi: 00200246   ebp: 00000000   esp: f5f77ed8
ds: 007b   es: 007b   ss: 0068
Process fsx-linux (pid: 4038, ti=f5f76000 task=f5f5e030 task.ti=f5f76000)
Stack: 00000000 00000000 00000000 00000000 00000000 f76e24a0 00000000 
c1038908
       00000001 f5f5e030 c1013678 f7708b6c 00000000 f7708b40 f609f600 
f7708b64
       c132ed37 dff49a40 f5f76000 fa2da685 f611f8c0 f5a5406c 03240e72 
00000008
Call Trace:
 [<c1038908>] mempool_free+0x43/0x46
 [<c1013678>] default_wake_function+0x0/0xc
 [<c132ed37>] __down_failed+0x7/0xc
 [<fa2da685>] .text.lock.file+0x87/0x9a [cifs]
 [<c104e807>] __fput+0xab/0x148
 [<c104c453>] filp_close+0x4e/0x54
 [<c101773a>] put_files_struct+0x64/0xa6
 [<c1018581>] do_exit+0x1c7/0x675
 [<c10052b0>] do_syscall_trace+0x12b/0x172
 [<c1018a8b>] sys_exit_group+0x0/0xd
 [<c1002abf>] syscall_call+0x7/0xb
Code: 89 74 24 24 c7 44 24 28 78 36 01 c1 c7 06 02 00 00 00 9c 5f fa 83 
4c 24 20  01 8d 43 08 8b 48 04 8d 54 24 2c 89 50 04 89 44 24 2c <89> 11 
ff 43 04 89 4c 24  30 8b 43 04 48 01 03 0f 98 c0 84 c0 75
EIP: [<c133064f>] __down+0x56/0xc5 SS:ESP 0068:f5f77ed8
 <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
 [<c1026618>] down_read+0x12/0x1f
 [<c101fcc8>] blocking_notifier_call_chain+0xe/0x29
 [<c10183d6>] do_exit+0x1c/0x675
 [<c101695e>] printk+0x14/0x18
 [<c1003dcd>] die+0x206/0x20e
 [<c13315bf>] do_page_fault+0x3ea/0x4b8
 [<c13311d5>] do_page_fault+0x0/0x4b8
 [<c10035d1>] error_code+0x39/0x40
 [<c133064f>] __down+0x56/0xc5
 [<c1038908>] mempool_free+0x43/0x46
 [<c1013678>] default_wake_function+0x0/0xc
 [<c132ed37>] __down_failed+0x7/0xc
 [<fa2da685>] .text.lock.file+0x87/0x9a [cifs]
 [<c104e807>] __fput+0xab/0x148
 [<c104c453>] filp_close+0x4e/0x54
 [<c101773a>] put_files_struct+0x64/0xa6
 [<c1018581>] do_exit+0x1c7/0x675
 [<c10052b0>] do_syscall_trace+0x12b/0x172
 [<c1018a8b>] sys_exit_group+0x0/0xd
 [<c1002abf>] syscall_call+0x7/0xb
Fixing recursive fault but reboot is needed!


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: oops in close when exiting fsx
  2006-08-12 19:40 oops in close when exiting fsx Steve French
@ 2006-08-13  1:09 ` Andrew Morton
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2006-08-13  1:09 UTC (permalink / raw)
  To: Steve French; +Cc: linux-kernel

On Sat, 12 Aug 2006 14:40:22 -0500
Steve French <smfrench@austin.rr.com> wrote:

> ctl-c exiting fsx after a few hours with 2.6.18-rc4 got the following 
> oops - anyone recognize it?
> Although I didn't see cifs symbols on the call stack it is running on a 
> cifs mount, but it is not
> one I have seen before.

CIFS is in the trace.

> BUG: unable to handle kernel NULL pointer dereference at virtual address 
> 0000000 0
>  printing eip:
> c133064f
> *pde = 00000000
> Oops: 0002 [#1]
> Modules linked in: cifs nfsd exportfs lockd nfs_acl ipv6 sunrpc 
> snd_pcm_oss snd_ mixer_oss snd_seq snd_seq_device af_packet edd ibm_acpi 
> snd_intel8x0 snd_ac97_co dec snd_ac97_bus snd_pcm snd_timer snd 
> soundcore snd_page_alloc
> CPU:    0
> EIP:    0060:[<c133064f>]    Not tainted VLI
> EFLAGS: 00210002   (2.6.18-rc4-default #1)
> EIP is at __down+0x56/0xc5
> eax: f7708b6c   ebx: f7708b64   ecx: 00000000   edx: f5f77f04
> esi: f5f5e030   edi: 00200246   ebp: 00000000   esp: f5f77ed8
> ds: 007b   es: 007b   ss: 0068
> Process fsx-linux (pid: 4038, ti=f5f76000 task=f5f5e030 task.ti=f5f76000)
> Stack: 00000000 00000000 00000000 00000000 00000000 f76e24a0 00000000 
> c1038908
>        00000001 f5f5e030 c1013678 f7708b6c 00000000 f7708b40 f609f600 
> f7708b64
>        c132ed37 dff49a40 f5f76000 fa2da685 f611f8c0 f5a5406c 03240e72 
> 00000008
> Call Trace:
>  [<c1038908>] mempool_free+0x43/0x46
>  [<c1013678>] default_wake_function+0x0/0xc
>  [<c132ed37>] __down_failed+0x7/0xc
>  [<fa2da685>] .text.lock.file+0x87/0x9a [cifs]

That's the out-of-line lock section for fs/cifs/file.c

>  [<c104e807>] __fput+0xab/0x148
>  [<c104c453>] filp_close+0x4e/0x54
>  [<c101773a>] put_files_struct+0x64/0xa6
>  [<c1018581>] do_exit+0x1c7/0x675
>  [<c10052b0>] do_syscall_trace+0x12b/0x172
>  [<c1018a8b>] sys_exit_group+0x0/0xd
>  [<c1002abf>] syscall_call+0x7/0xb

I'd say that cifs_close() is doing down() on a corrupted semaphore structure.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: oops in close when exiting fsx
@ 2006-08-13  8:53 Chuck Ebbert
  2006-08-13 18:43 ` Steve French
  0 siblings, 1 reply; 4+ messages in thread
From: Chuck Ebbert @ 2006-08-13  8:53 UTC (permalink / raw)
  To: Steve French; +Cc: linux-kernel

In-Reply-To: <44DE2EA6.4060809@austin.rr.com>

On Sat, 12 Aug 2006 14:40:22 -0500, Steve French wrote:

> ctl-c exiting fsx after a few hours with 2.6.18-rc4 got the following 
> oops - anyone recognize it?
> Although I didn't see cifs symbols on the call stack it is running on a 
> cifs mount, but it is not
> one I have seen before.

> EIP is at __down+0x56/0xc5

  1a:   8d 43 08                  lea    0x8(%ebx),%eax  <= addr of sema wait queue list_head
  1d:   8b 48 04                  mov    0x4(%eax),%ecx  <= list->prev
  20:   8d 54 24 2c               lea    0x2c(%esp),%edx
  24:   89 50 04                  mov    %edx,0x4(%eax)
  27:   89 44 24 2c               mov    %eax,0x2c(%esp)
   0:   89 11                     mov    %edx,(%ecx)   <===== list->prev->next = new

The semaphore's wait queue head is corrupted: 'prev' is 0.

>  [<c1038908>] mempool_free+0x43/0x46
>  [<c1013678>] default_wake_function+0x0/0xc
>  [<c132ed37>] __down_failed+0x7/0xc
>  [<fa2da685>] .text.lock.file+0x87/0x9a [cifs]      <=====
>  [<c104e807>] __fput+0xab/0x148
>  [<c104c453>] filp_close+0x4e/0x54
>  [<c101773a>] put_files_struct+0x64/0xa6
>  [<c1018581>] do_exit+0x1c7/0x675
>  [<c10052b0>] do_syscall_trace+0x12b/0x172
>  [<c1018a8b>] sys_exit_group+0x0/0xd
>  [<c1002abf>] syscall_call+0x7/0xb

It came from a lock section in the cifs code.  If you disassemble
.text.lock.file in cifs.o, at offset 0x87 (or shortly after) you
will see a jump back to the code that's trying to get the semaphore.

-- 
Chuck


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: oops in close when exiting fsx
  2006-08-13  8:53 Chuck Ebbert
@ 2006-08-13 18:43 ` Steve French
  0 siblings, 0 replies; 4+ messages in thread
From: Steve French @ 2006-08-13 18:43 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: linux-kernel

Chuck Ebbert wrote:
> In-Reply-To: <44DE2EA6.4060809@austin.rr.com>
>
> On Sat, 12 Aug 2006 14:40:22 -0500, Steve French wrote:
>
>   
>> ctl-c exiting fsx after a few hours with 2.6.18-rc4 got the following 
>> oops - anyone recognize it?
>> Although I didn't see cifs symbols on the call stack it is running on a 
>> cifs mount, but it is not
>> one I have seen before.
>>     
>
>   
>> EIP is at __down+0x56/0xc5
>>     
>
>   1a:   8d 43 08                  lea    0x8(%ebx),%eax  <= addr of sema wait queue list_head
>   1d:   8b 48 04                  mov    0x4(%eax),%ecx  <= list->prev
>   20:   8d 54 24 2c               lea    0x2c(%esp),%edx
>   24:   89 50 04                  mov    %edx,0x4(%eax)
>   27:   89 44 24 2c               mov    %eax,0x2c(%esp)
>    0:   89 11                     mov    %edx,(%ecx)   <===== list->prev->next = new
>
> The semaphore's wait queue head is corrupted: 'prev' is 0.
>
>   
>>  [<c1038908>] mempool_free+0x43/0x46
>>  [<c1013678>] default_wake_function+0x0/0xc
>>  [<c132ed37>] __down_failed+0x7/0xc
>>  [<fa2da685>] .text.lock.file+0x87/0x9a [cifs]      <=====
>>  [<c104e807>] __fput+0xab/0x148
>>  [<c104c453>] filp_close+0x4e/0x54
>>  [<c101773a>] put_files_struct+0x64/0xa6
>>  [<c1018581>] do_exit+0x1c7/0x675
>>  [<c10052b0>] do_syscall_trace+0x12b/0x172
>>  [<c1018a8b>] sys_exit_group+0x0/0xd
>>  [<c1002abf>] syscall_call+0x7/0xb
>>     
>
> It came from a lock section in the cifs code.  If you disassemble
> .text.lock.file in cifs.o, at offset 0x87 (or shortly after) you
> will see a jump back to the code that's trying to get the semaphore.
>
>   


Thanks - This is a part of new cifs code recently added to handle posix 
locks (it has not pushed to mainline yet) better

                down(&pSMBFile->lock_sem);
                list_for_each_entry_safe(li, tmp, &pSMBFile->llist, llist) {
                        list_del(&li->llist);
                        kfree(li);
                }
                up(&pSMBFile->lock_sem);


My guess is that there is a path in which the lock_sem is not 
initialized - will trace that.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2006-08-13 18:43 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-08-12 19:40 oops in close when exiting fsx Steve French
2006-08-13  1:09 ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2006-08-13  8:53 Chuck Ebbert
2006-08-13 18:43 ` Steve French

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox