public inbox for linux-kernel@vger.kernel.org
* ext3-related crash in 2.6.20-rc1
@ 2006-12-23 23:43 Pavel Machek
  2006-12-23 23:55 ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Pavel Machek
  2006-12-24  1:12 ` ext3-related crash in 2.6.20-rc1 Andrew Morton
  0 siblings, 2 replies; 19+ messages in thread
From: Pavel Machek @ 2006-12-23 23:43 UTC
  To: Andrew Morton, kernel list; +Cc: marcel, maxk, bluez-devel

Hi!

I got this nasty oops while playing with a debugger. I'm not sure whether
that is related; it might also be something with bluetooth. I already know
bluetooth corrupts memory during suspend; perhaps it also corrupts memory
in some error path?

 

								Pavel


l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
PM: Removing info for bluetooth:acl00803715A329
e1000: eth0: e1000_watchdog: NIC Link is Down
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
e1000: eth0: e1000_watchdog: NIC Link is Down
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1235!
invalid opcode: 0000 [#1]
SMP 
Modules linked in:
CPU:    0
EIP:    0060:[<c01933c2>]    Not tainted VLI
EFLAGS: 00010046   (2.6.20-rc1 #379)
EIP is at __find_get_block+0x1b2/0x1c0
eax: 00000086   ebx: 00001000   ecx: 00000000   edx: 006780b2
esi: 0033d60d   edi: 00001000   ebp: 000000cf   esp: c9135c90
ds: 007b   es: 007b   ss: 0068
Process phone (pid: 1161, ti=c9134000 task=f7949030 task.ti=c9134000)
Stack: 006780b2 00000000 f7ec2820 00000003 ad40ad40 f7d8f5ba c0652a48 00000000 
       f88da000 00000012 0000000f f65c9000 f65c9230 c016c9df c016c3c4 00001000 
       0033d60d 00001000 000000cf c01933ef 00001000 5a0ff380 0000007f f65c9234 
Call Trace:
 [<c01933ef>] __getblk+0x1f/0x290
 [<c01db680>] __ext3_get_inode_loc+0x120/0x3a0
 [<c01db9d7>] ext3_reserve_inode_write+0x27/0x80
 [<c01dbe1a>] ext3_mark_inode_dirty+0x1a/0x40
 [<c01dc2c9>] ext3_dirty_inode+0x79/0xb0
 [<c018c854>] __mark_inode_dirty+0x34/0x1c0
 [<c0154934>] __generic_file_aio_write_nolock+0x244/0x590
 [<c0154cd9>] generic_file_aio_write+0x59/0xd0
 [<c01da050>] ext3_file_write+0x30/0xc0
 [<c0170ad7>] do_sync_write+0xc7/0x130
 [<c0171266>] vfs_write+0xa6/0x160
 [<c0171b21>] sys_write+0x41/0x70
 [<c010304c>] syscall_call+0x7/0xb
 [<b7f18d2e>] 0xb7f18d2e
 =======================
Code: 00 8b 7c 24 18 f3 a5 fb 8b 44 24 10 85 c0 0f 84 2c ff ff ff 8b 44 24 10 e8 5c ca ff ff e9 1e ff ff ff 89 d8 e8 50 ca ff ff eb 8d <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 89 f6 55 57 56 53 83 ec 48 
EIP: [<c01933c2>] __find_get_block+0x1b2/0x1c0 SS:ESP 0068:c9135c90
 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-23 23:43 ext3-related crash in 2.6.20-rc1 Pavel Machek
@ 2006-12-23 23:55 ` Pavel Machek
  2006-12-24  0:01   ` Pavel Machek
                     ` (2 more replies)
  2006-12-24  1:12 ` ext3-related crash in 2.6.20-rc1 Andrew Morton
  1 sibling, 3 replies; 19+ messages in thread
From: Pavel Machek @ 2006-12-23 23:55 UTC
  To: Andrew Morton, kernel list; +Cc: marcel, maxk, bluez-devel

Hi!

> I got this nasty oops while playing with a debugger. I'm not sure whether
> that is related; it might also be something with bluetooth. I already know
> bluetooth corrupts memory during suspend; perhaps it also corrupts memory
> in some error path?

Okay, I spoke too soon. The bluetooth & suspend memory corruption was
_way_ harder to reproduce than expected. It took me 5 or so suspend
cycles... so it is probably unrelated to the previous crash.

I was getting pretty regular crashes with bluetooth & gdb, but I was
not using bluetooth at the time of the ext3-related crash.

								Pavel

acpi acpi: resuming
__tx_submit: hci0 tx submit failed urb c20efb08 type 2 err -113
agpgart-intel 0000:00:00.0: resuming
pci 0000:00:02.0: resuming
pci 0000:00:02.1: resuming
PM: Writing back config space on device 0000:00:02.1 at offset 1 (was 900000, writing 900003)
HDA Intel 0000:00:1b.0: resuming
PM: Writing back config space on device 0000:00:1b.0 at offset 1 (was 100106, writing 100102)
PCI: Setting latency timer of device 0000:00:1b.0 to 64
pci 0000:00:1c.0: resuming
PCI: Setting latency timer of device 0000:00:1c.0 to 64
pci 0000:00:1c.1: resuming
PCI: Setting latency timer of device 0000:00:1c.1 to 64
pci 0000:00:1c.2: resuming
PCI: Setting latency timer of device 0000:00:1c.2 to 64
pci 0000:00:1c.3: resuming
PM: Writing back config space on device 0000:00:1c.3 at offset f (was 40400, writing 4040b)
PM: Writing back config space on device 0000:00:1c.3 at offset 9 (was 10001, writing e421e421)
PM: Writing back config space on device 0000:00:1c.3 at offset 8 (was 0, writing ebf0ea00)
PM: Writing back config space on device 0000:00:1c.3 at offset 7 (was 20000000, writing 8070)
PM: Writing back config space on device 0000:00:1c.3 at offset 3 (was 810000, writing 810010)
PM: Writing back config space on device 0000:00:1c.3 at offset 1 (was 100000, writing 100107)
PCI: Setting latency timer of device 0000:00:1c.3 to 64
uhci_hcd 0000:00:1d.0: resuming
PCI: Setting latency timer of device 0000:00:1d.0 to 64
usb usb4: root hub lost power or was reset
uhci_hcd 0000:00:1d.1: resuming
PCI: Setting latency timer of device 0000:00:1d.1 to 64
usb usb2: root hub lost power or was reset
uhci_hcd 0000:00:1d.2: resuming
PCI: Setting latency timer of device 0000:00:1d.2 to 64
usb usb5: root hub lost power or was reset
uhci_hcd 0000:00:1d.3: resuming
PCI: Setting latency timer of device 0000:00:1d.3 to 64
usb usb3: root hub lost power or was reset
ehci_hcd 0000:00:1d.7: resuming
PCI: Setting latency timer of device 0000:00:1d.7 to 64
pci 0000:00:1e.0: resuming
PM: Writing back config space on device 0000:00:1e.0 at offset 1 (was 100005, writing 100007)
PCI: Setting latency timer of device 0000:00:1e.0 to 64
pci 0000:00:1f.0: resuming
PIIX_IDE 0000:00:1f.1: resuming
ahci 0000:00:1f.2: resuming
PCI: Setting latency timer of device 0000:00:1f.2 to 64
pci 0000:00:1f.3: resuming
pci 0000:02:00.0: resuming
PM: Writing back config space on device 0000:02:00.0 at offset 1 (was 100107, writing 100103)
pci 0000:03:00.0: resuming
yenta_cardbus 0000:15:00.0: resuming
ohci1394 0000:15:00.1: resuming
PM: Writing back config space on device 0000:15:00.1 at offset 4 (was 0, writing e4301000)
PM: Writing back config space on device 0000:15:00.1 at offset 3 (was 800000, writing 804000)
PM: Writing back config space on device 0000:15:00.1 at offset 1 (was 2100000, writing 2100006)
ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[21]  MMIO=[e4301000-e43017ff]  Max Packet=[2048]  IR/IT contexts=[4/4]
sdhci 0000:15:00.2: resuming
PM: Writing back config space on device 0000:15:00.2 at offset 4 (was 0, writing e4301800)
PM: Writing back config space on device 0000:15:00.2 at offset 3 (was 800000, writing 804000)
PM: Writing back config space on device 0000:15:00.2 at offset 1 (was 2100000, writing 2100006)
system 00:00: resuming
pnp 00:01: resuming
system 00:02: resuming
pnp 00:03: resuming
pnp 00:04: resuming
pnp 00:05: resuming
pnp 00:06: resuming
pnp 00:07: resuming
i8042 kbd 00:08: resuming
pnp: Device 00:08 does not support activation.
i8042 aux 00:09: resuming
pnp: Device 00:09 does not support activation.
pnp 00:0a: resuming
pnp 00:0b: resuming
platform bluetooth: resuming
pcspkr pcspkr: resuming
vesafb vesafb.0: resuming
serial8250 serial8250: resuming
usb usb1: resuming
hub 1-0:1.0: resuming
usb usb2: resuming
hub 2-0:1.0: resuming
usb usb4: resuming
ata2: SATA link down (SStatus 0 SControl 0)
ata3: SATA link down (SStatus 0 SControl 0)
ata4: SATA link down (SStatus 0 SControl 0)
usb usb5: resuming
hub 4-0:1.0: resuming
hub 5-0:1.0: resuming
usb usb3: resuming
hub 3-0:1.0: resuming
i8042 i8042: resuming
atkbd serio0: resuming
psmouse serio1: resuming
mmcblk mmc0:cc53: resuming
sd 0:0:0:0: resuming
usb 3-2: resuming
 usbdev3.14_ep00: PM: resume from 0, parent 3-2 still 2
usb 3-2:1.0: PM: resume from 2, parent 3-2 still 2
usb 3-2:1.0: resuming
 usbdev3.14_ep81: PM: resume from 0, parent 3-2:1.0 still 2
 usbdev3.14_ep02: PM: resume from 0, parent 3-2:1.0 still 2
 usbdev3.14_ep83: PM: resume from 0, parent 3-2:1.0 still 2
usb 3-1: resuming
 usbdev3.15_ep00: PM: resume from 0, parent 3-1 still 2
hci_usb 3-1:1.0: PM: resume from 2, parent 3-1 still 2
hci_usb 3-1:1.0: resuming
 hci0: PM: resume from 0, parent 3-1:1.0 still 2
 usbdev3.15_ep81: PM: resume from 0, parent 3-1:1.0 still 2
 usbdev3.15_ep82: PM: resume from 0, parent 3-1:1.0 still 2
 usbdev3.15_ep02: PM: resume from 0, parent 3-1:1.0 still 2
hci_usb 3-1:1.1: PM: resume from 2, parent 3-1 still 2
hci_usb 3-1:1.1: resuming
 usbdev3.15_ep83: PM: resume from 0, parent 3-1:1.1 still 2
 usbdev3.15_ep03: PM: resume from 0, parent 3-1:1.1 still 2
usb 3-1:1.2: PM: resume from 2, parent 3-1 still 2
usb 3-1:1.2: resuming
 usbdev3.15_ep84: PM: resume from 0, parent 3-1:1.2 still 2
 usbdev3.15_ep04: PM: resume from 0, parent 3-1:1.2 still 2
usb 3-1:1.3: PM: resume from 2, parent 3-1 still 2
usb 3-1:1.3: resuming
Restarting tasks ... <6>usb 3-1: USB disconnect, address 15
PM: Removing info for No Bus:usbdev3.15_ep81
PM: Removing info for No Bus:usbdev3.15_ep82
PM: Removing info for No Bus:usbdev3.15_ep02
slab error in verify_redzone_free(): cache `size-512': memory outside object was overwritten
 [<c016a1b8>] cache_free_debugcheck+0x128/0x1d0
 [<c04b58e3>] hci_usb_close+0xf3/0x160
 [<c016b530>] kfree+0x50/0xa0
 [<c04b58e3>] hci_usb_close+0xf3/0x160
 [<c04b59be>] hci_usb_disconnect+0x2e/0x90
 [<c0454f23>] usb_disable_interface+0x53/0x70
 [<c04576f8>] usb_unbind_interface+0x38/0x80
 [<c032f908>] __device_release_driver+0x68/0xb0
 [<c032fc3e>] device_release_driver+0x1e/0x40
 [<c032f1db>] bus_remove_device+0x8b/0xa0
 [<c032dbc9>] device_del+0x159/0x1c0
 [<c04559ad>] usb_disable_device+0x4d/0x100
 [<c044fe8a>] usb_disconnect+0x9a/0x110
 [<c0452405>] hub_thread+0x355/0xbd0
 [<c061426e>] schedule+0x2de/0x8f0
 [<c013c640>] autoremove_wake_function+0x0/0x50
 [<c04520b0>] hub_thread+0x0/0xbd0
 [<c013c58c>] kthread+0xec/0xf0
 [<c013c4a0>] kthread+0x0/0xf0
 [<c0103be7>] kernel_thread_helper+0x7/0x10
 =======================
e91f6288: redzone 1:0x5a5a5a5a, redzone 2:0xc054aeae.
------------[ cut here ]------------
kernel BUG at mm/slab.c:2878!
invalid opcode: 0000 [#1]
SMP 
Modules linked in:
CPU:    0
EIP:    0060:[<c016a242>]    Not tainted VLI
EFLAGS: 00010002   (2.6.20-rc1 #383)
EIP is at cache_free_debugcheck+0x1b2/0x1d0
eax: e91f6284   ebx: e91f6078   ecx: 00052c00   edx: 0000020c
esi: c20df680   edi: e91f6288   ebp: 5a5a5a5a   esp: c2227e30
ds: 007b   es: 007b   ss: 0068
Process khubd (pid: 303, ti=c2226000 task=c21f6a70 task.ti=c2226000)
Stack: c06b3fe0 e91f6288 5a5a5a5a c054aeae c04b58e3 e91f6040 c20df680 c20d9164 
       e91f628c 00000282 c016b530 c20efb08 c20efaf4 e977a274 0000000c c04b58e3 
       e977a230 e977a260 f7b3f904 e977a1a4 00000001 e977a1a4 f7b3f904 c07e2060 
Call Trace:
 [<c054aeae>] sock_alloc_send_skb+0x16e/0x1c0
 [<c04b58e3>] hci_usb_close+0xf3/0x160
 [<c016b530>] kfree+0x50/0xa0
 [<c04b58e3>] hci_usb_close+0xf3/0x160
 [<c04b59be>] hci_usb_disconnect+0x2e/0x90
 [<c0454f23>] usb_disable_interface+0x53/0x70
 [<c04576f8>] usb_unbind_interface+0x38/0x80
 [<c032f908>] __device_release_driver+0x68/0xb0
 [<c032fc3e>] device_release_driver+0x1e/0x40
 [<c032f1db>] bus_remove_device+0x8b/0xa0
 [<c032dbc9>] device_del+0x159/0x1c0
 [<c04559ad>] usb_disable_device+0x4d/0x100
 [<c044fe8a>] usb_disconnect+0x9a/0x110
 [<c0452405>] hub_thread+0x355/0xbd0
 [<c061426e>] schedule+0x2de/0x8f0
 [<c013c640>] autoremove_wake_function+0x0/0x50
 [<c04520b0>] hub_thread+0x0/0xbd0
 [<c013c58c>] kthread+0xec/0xf0
 [<c013c4a0>] kthread+0x0/0xf0
 [<c0103be7>] kernel_thread_helper+0x7/0x10
 =======================
Code: f0 2c 5a 75 8b b9 39 31 6b c0 89 f2 b8 88 e8 61 c0 e8 73 f4 ff ff eb 89 81 fb a5 c2 0f 17 0f 85 6c ff ff ff 90 8d 74 26 00 eb 8e <0f> 0b eb fe 0f 0b eb fe 8d b6 00 00 00 00 0f 0b eb fe 8b 52 0c 
EIP: [<c016a242>] cache_free_debugcheck+0x1b2/0x1d0 SS:ESP 0068:c2227e30
 <7>PM: Adding info for No Bus:vcs63
PM: Adding info for No Bus:vcsa63
PM: Removing info for No Bus:vcs63
PM: Removing info for No Bus:vcsa63
done.
Enabling non-boot CPUs ...
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3657.63 BogoMIPS (lpj=18288162)
CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000 0000c1a9 00000000 00000000
monitor/mwait feature present.
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: After all inits, caps: bfe9fbff 00100000 00000000 00002940 0000c1a9 00000000 00000000
CPU1: Intel Genuine Intel(R) CPU           T2400  @ 1.83GHz stepping 08
PM: Adding info for No Bus:msr1
CPU1 is up
ata1: waiting for device to spin up (7 secs)
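
The slab error in the log above prints "redzone 1:0x5a5a5a5a, redzone 2:0xc054aeae". The following is a rough sketch of how those two words could be classified. The constants are the 32-bit slab debug values as recalled from include/linux/poison.h of this era, so treat them as assumptions (the bytes a5 c2 0f 17 in the Code: line are at least consistent with a compare against 0x170fc2a5). classify() and its "kernel address" heuristic are hypothetical. Neither word is a valid redzone: the first looks like the in-use poison pattern, and the second equals the address printed for sock_alloc_send_skb+0x16e in the second call trace, so that trace entry may simply be this word sitting on the stack.

```python
# Sketch: classify the redzone words reported by verify_redzone_free().
# Constant values are assumptions recalled from 2.6-era poison.h.

RED_INACTIVE = 0x5A2CF071   # expected redzone while the object is free
RED_ACTIVE = 0x170FC2A5     # expected redzone while the object is allocated
POISON_INUSE = 0x5A         # fill byte for allocated, uninitialised objects

# Assumed i386 3G/1G split: kernel virtual addresses start here.
KERNEL_BASE = 0xC0000000


def classify(word):
    """Give a rough label for a 32-bit word found where a redzone should be."""
    if word == RED_ACTIVE:
        return "intact redzone (object allocated)"
    if word == RED_INACTIVE:
        return "intact redzone (object free)"
    if word == int.from_bytes(bytes([POISON_INUSE]) * 4, "little"):
        return "in-use poison pattern"
    if word >= KERNEL_BASE:
        return "looks like a kernel address"
    return "unrecognised"


# Values copied from the log line "e91f6288: redzone 1:..., redzone 2:...".
for name, word in (("redzone 1", 0x5A5A5A5A), ("redzone 2", 0xC054AEAE)):
    print(f"{name}: {word:#010x} -> {classify(word)}")
```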


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-23 23:55 ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Pavel Machek
@ 2006-12-24  0:01   ` Pavel Machek
  2006-12-24  0:06     ` not-only-bluetooth memory corruption Pavel Machek
  2006-12-24  1:18   ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Andrew Morton
  2006-12-24 14:39   ` Marcel Holtmann
  2 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2006-12-24  0:01 UTC
  To: Andrew Morton, kernel list; +Cc: marcel, maxk, bluez-devel

Hi!

> > I got this nasty oops while playing with a debugger. I'm not sure whether
> > that is related; it might also be something with bluetooth. I already know
> > bluetooth corrupts memory during suspend; perhaps it also corrupts memory
> > in some error path?
> 
> Okay, I spoke too soon. The bluetooth & suspend memory corruption was
> _way_ harder to reproduce than expected. It took me 5 or so suspend
> cycles... so it is probably unrelated to the previous crash.
> 
> I was getting pretty regular crashes with bluetooth & gdb, but I was
> not using bluetooth at the time of the ext3-related crash.

And for completeness, here's the bluetooth + gdb oops. Ok, I'm not _sure_
it is bluetooth related. I'll try it without bluetooth in a while.

								Pavel

PM: Adding info for No Bus:vcsa8
coda_read_super: Bad mount data
coda_read_super: device index: 0
coda_read_super: rootfid is (01234567.ffffffff.080519b0.00000000)
PM: Removing info for No Bus:vcs10
PM: Removing info for No Bus:vcsa10
coda_upcall: Venus dead on (op,un) (7.2) flags 10
Failure of coda_cnode_make for root: error -19
hci_cmd_task: hci0 command tx timeout
PM: Adding info for No Bus:rfcomm1
PM: Adding info for bluetooth:acl00803715A329
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
hci_acldata_packet: hci0 ACL packet for unknown connection handle 12
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
l2cap_recv_acldata: Unexpected continuation frame (len 0)
PM: Removing info for bluetooth:acl00803715A329
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1235!
invalid opcode: 0000 [#1]
SMP 
Modules linked in:
CPU:    1
EIP:    0060:[<c01912b2>]    Not tainted VLI
EFLAGS: 00010046   (2.6.20-rc1 #383)
EIP is at __find_get_block+0x1b2/0x1c0
eax: 00000086   ebx: 00001000   ecx: 00000000   edx: 006780b2
esi: 0033d60d   edi: 00001000   ebp: 000000cf   esp: f75a3c90
ds: 007b   es: 007b   ss: 0068
Process phone (pid: 1795, ti=f75a2000 task=c2287030 task.ti=f75a2000)
Stack: 006780b2 00000000 c21e9a08 00000003 ad40ad40 f7d8d1dc c0629908 00000000 
       f89fa000 00000012 00000002 00000003 ad55ad55 f7d8d182 c0629908 00001000 
       0033d60d 00001000 000000cf c01912df 00001000 f7dbf74c 00000000 00000008 
Call Trace:
 [<c01912df>] __getblk+0x1f/0x290
 [<c016a284>] check_poison_obj+0x24/0x1a0
 [<c0280115>] soft_cursor+0x175/0x1e0
 [<c01b1ad0>] __ext3_get_inode_loc+0x120/0x3a0
 [<c016954e>] dbg_redzone1+0xe/0x20
 [<c016a43e>] cache_alloc_debugcheck_after+0x3e/0x150
 [<c01c1703>] journal_start+0x83/0xe0
 [<c01b1e27>] ext3_reserve_inode_write+0x27/0x80
 [<c01b226a>] ext3_mark_inode_dirty+0x1a/0x40
 [<c01b2719>] ext3_dirty_inode+0x79/0xb0
 [<c018a744>] __mark_inode_dirty+0x34/0x1c0
 [<c0181a59>] file_update_time+0x39/0xa0
 [<c0152984>] __generic_file_aio_write_nolock+0x244/0x590
 [<c0120fad>] __wake_up_sync+0x3d/0x60
 [<c06154df>] __mutex_lock_slowpath+0xef/0x230
 [<c0152d29>] generic_file_aio_write+0x59/0xd0
 [<c01b04a0>] ext3_file_write+0x30/0xc0
 [<c016e997>] do_sync_write+0xc7/0x130
 [<c013c640>] autoremove_wake_function+0x0/0x50
 [<c015f2e9>] remove_vma+0x39/0x50
 [<c016f126>] vfs_write+0xa6/0x160
 [<c016e8d0>] do_sync_write+0x0/0x130
 [<c016f9e1>] sys_write+0x41/0x70
 [<c010304c>] syscall_call+0x7/0xb
 =======================
Code: 00 8b 7c 24 18 f3 a5 fb 8b 44 24 10 85 c0 0f 84 2c ff ff ff 8b 44 24 10 e8 5c ca ff ff e9 1e ff ff ff 89 d8 e8 50 ca ff ff eb 8d <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 89 f6 55 57 56 53 83 ec 48 
EIP: [<c01912b2>] __find_get_block+0x1b2/0x1c0 SS:ESP 0068:f75a3c90
 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* not-only-bluetooth memory corruption
  2006-12-24  0:01   ` Pavel Machek
@ 2006-12-24  0:06     ` Pavel Machek
  2006-12-24  0:07       ` ptrace() memory corruption? Pavel Machek
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2006-12-24  0:06 UTC
  To: Andrew Morton, kernel list; +Cc: marcel, maxk, bluez-devel

[-- Attachment #1: Type: text/plain, Size: 5021 bytes --]

On Sun 2006-12-24 01:01:50, Pavel Machek wrote:
> Hi!
> 
> > > I got this nasty oops while playing with a debugger. I'm not sure whether
> > > that is related; it might also be something with bluetooth. I already know
> > > bluetooth corrupts memory during suspend; perhaps it also corrupts memory
> > > in some error path?
> > 
> > Okay, I spoke too soon. The bluetooth & suspend memory corruption was
> > _way_ harder to reproduce than expected. It took me 5 or so suspend
> > cycles... so it is probably unrelated to the previous crash.
> > 
> > I was getting pretty regular crashes with bluetooth & gdb, but I was
> > not using bluetooth at the time of the ext3-related crash.
> 
> And for completeness, here's the bluetooth + gdb oops. Ok, I'm not _sure_
> it is bluetooth related. I'll try it without bluetooth in a while.

Ok, so this one is not bluetooth related. My little "phone"
application provokes a nasty oops, even when talking to
/dev/null. Strange; that code does _nothing_
unusual (www.sf.net/projects/tui).

Is there something wrong with gdb?

									Pavel

pcmcia: Detected deprecated PCMCIA ioctl usage from process: cardmgr.
pcmcia: This interface will soon be removed from the kernel; please expect breakage unless you upgrade to new tools.
pcmcia: see http://www.kernel.org/pub/linux/utils/kernel/pcmcia/pcmcia.html for details.
cs: IO port probe 0x310-0x380: clean.
cs: IO port probe 0xa00-0xaff: clean.
PM: Adding info for No Bus:vcs10
PM: Adding info for No Bus:vcsa10
PM: Removing info for No Bus:vcs10
PM: Removing info for No Bus:vcsa10
PM: Adding info for No Bus:vcs10
PM: Adding info for No Bus:vcsa10
PM: Removing info for No Bus:vcs10
PM: Removing info for No Bus:vcsa10
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
PM: Adding info for No Bus:vcs11
PM: Adding info for No Bus:vcsa11
PM: Adding info for No Bus:vcs2
PM: Adding info for No Bus:vcs3
PM: Adding info for No Bus:vcs4
PM: Adding info for No Bus:vcsa2
PM: Adding info for No Bus:vcs5
PM: Adding info for No Bus:vcs6
PM: Adding info for No Bus:vcs7
PM: Adding info for No Bus:vcs8
PM: Adding info for No Bus:vcsa3
PM: Adding info for No Bus:vcsa4
PM: Adding info for No Bus:vcsa5
PM: Adding info for No Bus:vcsa6
PM: Adding info for No Bus:vcsa7
PM: Adding info for No Bus:vcsa8
coda_read_super: Bad mount data
coda_read_super: device index: 0
coda_read_super: No pseudo device
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1235!
invalid opcode: 0000 [#1]
SMP 
Modules linked in:
CPU:    1
EIP:    0060:[<c01912b2>]    Not tainted VLI
EFLAGS: 00010046   (2.6.20-rc1 #383)
EIP is at __find_get_block+0x1b2/0x1c0
eax: 00000086   ebx: 00001000   ecx: 00000000   edx: 006780b2
esi: 0033d60d   edi: 00001000   ebp: 000000cf   esp: f6de1c90
ds: 007b   es: 007b   ss: 0068
Process phone (pid: 1847, ti=f6de0000 task=c2329550 task.ti=f6de0000)
Stack: 006780b2 00000000 c20eb5f8 00000003 05950595 c236d8d2 c0629908 00000000 
       f88da000 00000012 00000002 00000003 05800580 c236d878 c016a284 00001000 
       0033d60d 00001000 000000cf c01912df 00001000 5a0df380 0000007f f77b9c0c 
Call Trace:
 [<c016a284>] check_poison_obj+0x24/0x1a0
 [<c01912df>] __getblk+0x1f/0x290
 [<c01312e0>] lock_timer_base+0x20/0x50
 [<c01317f0>] __mod_timer+0x90/0xa0
 [<c01b1ad0>] __ext3_get_inode_loc+0x120/0x3a0
 [<c016954e>] dbg_redzone1+0xe/0x20
 [<c016a43e>] cache_alloc_debugcheck_after+0x3e/0x150
 [<c01c1703>] journal_start+0x83/0xe0
 [<c01b1e27>] ext3_reserve_inode_write+0x27/0x80
 [<c01b226a>] ext3_mark_inode_dirty+0x1a/0x40
 [<c01b2719>] ext3_dirty_inode+0x79/0xb0
 [<c018a744>] __mark_inode_dirty+0x34/0x1c0
 [<c0181a59>] file_update_time+0x39/0xa0
 [<c0152984>] __generic_file_aio_write_nolock+0x244/0x590
 [<c06154df>] __mutex_lock_slowpath+0xef/0x230
 [<c0152d29>] generic_file_aio_write+0x59/0xd0
 [<c0103aa4>] apic_timer_interrupt+0x28/0x30
 [<c01b04a0>] ext3_file_write+0x30/0xc0
 [<c016e997>] do_sync_write+0xc7/0x130
 [<c013c640>] autoremove_wake_function+0x0/0x50
 [<c015f2e9>] remove_vma+0x39/0x50
 [<c016f126>] vfs_write+0xa6/0x160
 [<c016e8d0>] do_sync_write+0x0/0x130
 [<c016f9e1>] sys_write+0x41/0x70
 [<c010304c>] syscall_call+0x7/0xb
 =======================
Code: 00 8b 7c 24 18 f3 a5 fb 8b 44 24 10 85 c0 0f 84 2c ff ff ff 8b 44 24 10 e8 5c ca ff ff e9 1e ff ff ff 89 d8 e8 50 ca ff ff eb 8d <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 89 f6 55 57 56 53 83 ec 48 
EIP: [<c01912b2>] __find_get_block+0x1b2/0x1c0 SS:ESP 0068:f6de1c90
 


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: oops4.bz2 --]
[-- Type: application/octet-stream, Size: 12140 bytes --]


* ptrace() memory corruption?
  2006-12-24  0:06     ` not-only-bluetooth memory corruption Pavel Machek
@ 2006-12-24  0:07       ` Pavel Machek
  2006-12-24  1:14         ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2006-12-24  0:07 UTC (permalink / raw)
  To: Andrew Morton, kernel list; +Cc: marcel, maxk, bluez-devel

On Sun 2006-12-24 01:06:05, Pavel Machek wrote:
> On Sun 2006-12-24 01:01:50, Pavel Machek wrote:
> > Hi!
> > 
> > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > related; it also might be something with bluetooth; I already know it
> > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > error path?
> > > 
> > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > cycles... so it is probably unrelated to the previous crash.
> > > 
> > > I was getting pretty regular crashes with bluetooth & gdb, but I was
> > > not using bluetooth at the time of ext3-related crash.
> > 
> > And for completeness, here's bluetooth + gdb oops. Ok, I'm not _sure_
> > it is bluetooth related. I'll try it without bluetooth in a while.
> 
> Ok, so this one is not bluetooth related. My little "phone"
> application provokes nasty oops, even when talking to
> /dev/null. Strange, that code does _nothing_
> strange. (www.sf.net/projects/tui).
> 
> Is there something wrong with gdb?

Yep. If I do gdb /bin/bash, run; I'll get similar oops. Am I alone
seeing this?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: ext3-related crash in 2.6.20-rc1
  2006-12-23 23:43 ext3-related crash in 2.6.20-rc1 Pavel Machek
  2006-12-23 23:55 ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Pavel Machek
@ 2006-12-24  1:12 ` Andrew Morton
  1 sibling, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2006-12-24  1:12 UTC (permalink / raw)
  To: Pavel Machek; +Cc: kernel list, marcel, maxk, bluez-devel

On Sun, 24 Dec 2006 00:43:05 +0100
Pavel Machek <pavel@ucw.cz> wrote:

> Hi!
> 
> I got this nasty oops while playing with debugger. Not sure if that is
> related; it also might be something with bluetooth; I already know it
> corrupts memory during suspend, perhaps it corrupts memory in some
> error path?
> 
>  
> 
> 								Pavel
> 
> 
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> l2cap_recv_acldata: Unexpected continuation frame (len 0)
> PM: Removing info for bluetooth:acl00803715A329
> e1000: eth0: e1000_watchdog: NIC Link is Down
> e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
> e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
> e1000: eth0: e1000_watchdog: NIC Link is Down
> e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
> e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
> ------------[ cut here ]------------
> kernel BUG at fs/buffer.c:1235!

get thee to fs/buffer.c:1235.  You'll see that someone somewhere forgot to
reenable local interrupts.

Were you using gdb at the time?  A fix for something like that was merged
into mainline yesterday.

The slab errors which you're reporting in later emails will almost surely
be unrelated to this.
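
The remark about local interrupts can be cross-checked against the register dump: the oops reports `EFLAGS: 00010046`, and on x86 bit 9 of EFLAGS (0x200) is the interrupt-enable flag. A minimal user-space sketch of that check follows. It assumes the assertion at fs/buffer.c:1235 has the form `BUG_ON(irqs_disabled())`; the helper names `eflags_irqs_enabled` and `check_irqs_on_sketch` are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 EFLAGS bit 9 is the interrupt-enable flag (IF). */
#define X86_EFLAGS_IF 0x00000200u

/* Returns true when a saved EFLAGS value shows local interrupts enabled. */
static inline bool eflags_irqs_enabled(uint32_t eflags)
{
    return (eflags & X86_EFLAGS_IF) != 0;
}

/*
 * Hypothetical user-space analogue of an assertion such as
 * BUG_ON(irqs_disabled()): abort if interrupts are off.
 */
static inline void check_irqs_on_sketch(uint32_t eflags)
{
    assert(eflags_irqs_enabled(eflags));
}
```

For the value in the oops, `eflags_irqs_enabled(0x00010046)` is false because 0x200 is not set, which is consistent with a code path that was entered with interrupts still disabled.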



* Re: ptrace() memory corruption?
  2006-12-24  0:07       ` ptrace() memory corruption? Pavel Machek
@ 2006-12-24  1:14         ` Jiri Slaby
  2006-12-24 11:52           ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2006-12-24  1:14 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, kernel list, marcel, maxk, bluez-devel

Pavel Machek wrote:
>> Is there something wrong with gdb?
> 
> Yep. If I do gdb /bin/bash, run; I'll get similar oops. Am I alone
> seeing this?

Nope, I have this nasty thing here too and will post oopses in the afternoon,
just before Jezisek comes :).

regards,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-23 23:55 ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Pavel Machek
  2006-12-24  0:01   ` Pavel Machek
@ 2006-12-24  1:18   ` Andrew Morton
  2006-12-24 23:24     ` Pavel Machek
  2006-12-24 14:39   ` Marcel Holtmann
  2 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2006-12-24  1:18 UTC (permalink / raw)
  To: Pavel Machek; +Cc: kernel list, marcel, maxk, bluez-devel

On Sun, 24 Dec 2006 00:55:01 +0100
Pavel Machek <pavel@ucw.cz> wrote:

> PM: Removing info for No Bus:usbdev3.15_ep81
> PM: Removing info for No Bus:usbdev3.15_ep82
> PM: Removing info for No Bus:usbdev3.15_ep02
> slab error in verify_redzone_free(): cache `size-512': memory outside object was overwritten
>  [<c016a1b8>] cache_free_debugcheck+0x128/0x1d0
>  [<c04b58e3>] hci_usb_close+0xf3/0x160
>  [<c016b530>] kfree+0x50/0xa0
>  [<c04b58e3>] hci_usb_close+0xf3/0x160
>  [<c04b59be>] hci_usb_disconnect+0x2e/0x90
>  [<c0454f23>] usb_disable_interface+0x53/0x70
>  [<c04576f8>] usb_unbind_interface+0x38/0x80
>  [<c032f908>] __device_release_driver+0x68/0xb0
>  [<c032fc3e>] device_release_driver+0x1e/0x40
>  [<c032f1db>] bus_remove_device+0x8b/0xa0
>  [<c032dbc9>] device_del+0x159/0x1c0
>  [<c04559ad>] usb_disable_device+0x4d/0x100
>  [<c044fe8a>] usb_disconnect+0x9a/0x110
>  [<c0452405>] hub_thread+0x355/0xbd0
>  [<c061426e>] schedule+0x2de/0x8f0
>  [<c013c640>] autoremove_wake_function+0x0/0x50
>  [<c04520b0>] hub_thread+0x0/0xbd0
>  [<c013c58c>] kthread+0xec/0xf0
>  [<c013c4a0>] kthread+0x0/0xf0
>  [<c0103be7>] kernel_thread_helper+0x7/0x10
>  =======================

yes, this one looks like memory scribblage in bluetooth.  The
buffer.c assertion failure should now be fixed, please verify.


* Re: ptrace() memory corruption?
  2006-12-24  1:14         ` Jiri Slaby
@ 2006-12-24 11:52           ` Jiri Slaby
  2006-12-24 12:22             ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2006-12-24 11:52 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Pavel Machek, Andrew Morton, kernel list, marcel, maxk,
	bluez-devel

Jiri Slaby wrote:
> Pavel Machek wrote:
>>> Is there something wrong with gdb?
>> Yep. If I do gdb /bin/bash, run; I'll get similar oops. Am I alone
>> seeing this?
> 
> Nope, I have this nasty thing here too and will post oopses in the afternoon,
> just before Jezisek comes :).

Ok, I captured this through netconsole:
[    8.499155] usb 3-2: new low speed USB device using uhci_hcd and address 2
[    8.721946] usb 3-2: new device found, idVendor=045e, idProduct=00f0
[    8.722016] usb 3-2: new device strings: Mfr=1, Product=2, SerialNumber=0
[    8.722081] usb 3-2: Product: Microsoft � Laser Mouse 6000
[    8.722145] usb 3-2: Manufacturer: Microsoft Corporation
[    8.722344] usb 3-2: configuration #1 chosen from 1 choice
[    8.753100] input: Microsoft Corporation Microsoft � Laser Mouse 6000 as /class/input/input4
[    8.753310] input: USB HID v1.11 Mouse [Microsoft Corporation Microsoft � Laser Mouse 6000] on usb-0000:00:1d.1-2
[   58.672510] WARNING (!__warned) at /home/l/latest/xxx/kernel/softirq.c:137 local_bh_enable()
[   58.672562]  [<c0103f1b>] show_trace_log_lvl+0x1a/0x30
[   58.672682]  [<c01045d5>] show_trace+0x12/0x14
[   58.672787]  [<c010465c>] dump_stack+0x16/0x18
[   58.672893]  [<c0126ccc>] local_bh_enable+0x8c/0x9b
[   58.672998]  [<c030a499>] lock_sock_nested+0xa3/0xab
[   58.673107]  [<c03080e1>] sock_fasync+0x3e/0x145
[   58.673216]  [<c0309056>] sock_close+0x19/0x3d
[   58.673322]  [<c0165baf>] __fput+0xa6/0x161
[   58.673432]  [<c0165e25>] fput+0x22/0x3b
[   58.673538]  [<c016358a>] filp_close+0x41/0x67
[   58.673646]  [<c01645f3>] sys_close+0x67/0xaf
[   58.673753]  [<c0102fe4>] syscall_call+0x7/0xb
[   58.673855]  =======================
[   58.674091] ------------[ cut here ]------------
[   58.674158] kernel BUG at /home/l/latest/xxx/fs/buffer.c:1244!
[   58.674224] invalid opcode: 0000 [#1]
[   58.674286] SMP
[   58.674414] last sysfs file: /devices/platform/i2c-9191/9191-0290/fan3_min
[   58.674478] Modules linked in: eth1394 floppy ohci1394 ieee1394 ide_cd cdrom
[   58.674778] CPU:    1
[   58.674779] EIP:    0060:[<c0181fa0>]    Not tainted VLI
[   58.674780] EFLAGS: 00010046   (2.6.20-rc1-mm1 #207)
[   58.674971] EIP is at __find_get_block+0x165/0x171
[   58.675035] eax: 00000092   ebx: f78e6ec0   ecx: 00001000   edx: 00008025
[   58.675101] esi: 00000001   edi: 00001000   ebp: f76efc6c   esp: f76efc34
[   58.675166] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[   58.675232] Process bash (pid: 1595, ti=f76ee000 task=c1a40560 task.ti=f76ee000)
[   58.675297] Stack: c1b31ac0 f7df7e3c 0000003a c1b31bd0 f76efc74 c0181bb4 00000001 00000000
[   58.675693]        f7df7e3c f76efc74 c0181544 f78e6ec0 00000001 00001000 f76efc9c c0181fc2
[   58.676083]        f76efcb4 c0181eab 00008025 c1b31ac0 f74e5d40 f76efcb8 c01aa46a f78e6ec0
[   58.676476] Call Trace:
[   58.676594]  [<c0103f1b>] show_trace_log_lvl+0x1a/0x30
[   58.676692]  [<c0103fd6>] show_stack_log_lvl+0xa5/0xca
[   58.676789]  [<c01041ce>] show_registers+0x1d3/0x2b8
[   58.676887]  [<c01043d4>] die+0x121/0x243
[   58.676984]  [<c010456c>] do_trap+0x76/0x9c
[   58.677083]  [<c0104dcf>] do_invalid_op+0x97/0xa1
[   58.677181]  [<c038a7e4>] error_code+0x7c/0x84
[   58.677278]  [<c0181fc2>] __getblk+0x16/0x20a
[   58.677375]  [<c019ec64>] __ext3_get_inode_loc+0x139/0x332
[   58.677476]  [<c019ee71>] ext3_get_inode_loc+0x14/0x16
[   58.677575]  [<c019ee93>] ext3_reserve_inode_write+0x20/0x6c
[   58.677674]  [<c019eeff>] ext3_mark_inode_dirty+0x20/0x37
[   58.677772]  [<c01a1cd0>] ext3_dirty_inode+0x6b/0x6d
[   58.677871]  [<c017e7c4>] __mark_inode_dirty+0x2a/0x170
[   58.677969]  [<c0176d3c>] touch_atime+0xb4/0xe8
[   58.678067]  [<c016ce4d>] __link_path_walk+0x91e/0xcb6
[   58.678164]  [<c016d22b>] link_path_walk+0x46/0xc3
[   58.678262]  [<c016d46f>] do_path_lookup+0x86/0x1b0
[   58.678359]  [<c016df00>] __path_lookup_intent_open+0x44/0x7f
[   58.678457]  [<c016dfb3>] path_lookup_open+0x21/0x27
[   58.678555]  [<c016e088>] open_namei+0x62/0x5cb
[   58.678653]  [<c01638d2>] do_filp_open+0x26/0x43
[   58.678750]  [<c0163930>] do_sys_open+0x41/0xca
[   58.678847]  [<c01639f1>] sys_open+0x1c/0x1e
[   58.678943]  [<c0102fe4>] syscall_call+0x7/0xb
[   58.679040]  =======================
[   58.679101] Code: 45 d0 e8 b6 f5 ff ff e9 22 ff ff ff 89 d8 e8 aa f5 ff ff eb 8c 89 ce 8d 4e ff 8b 04 8f 89 04 b7 85 c9 75 f1 89 1f e9 f6 fe ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 55 89 e5 57 56 53 83 ec 1c
[   58.681386] EIP: [<c0181fa0>] __find_get_block+0x165/0x171 SS:ESP 0068:f76efc34
[   58.681545]

after gdb /bin/bash
(gdb) run

regards,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E


* Re: ptrace() memory corruption?
  2006-12-24 11:52           ` Jiri Slaby
@ 2006-12-24 12:22             ` Andrew Morton
  2006-12-24 13:44               ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2006-12-24 12:22 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Pavel Machek, kernel list, marcel, maxk, bluez-devel

On Sun, 24 Dec 2006 12:51:16 +0059
Jiri Slaby <jirislaby@gmail.com> wrote:

> [   58.674780] EFLAGS: 00010046   (2.6.20-rc1-mm1 #207)

please, test 2.6.20-rc2.  We applied a fix for this.


* Re: ptrace() memory corruption?
  2006-12-24 12:22             ` Andrew Morton
@ 2006-12-24 13:44               ` Jiri Slaby
  0 siblings, 0 replies; 19+ messages in thread
From: Jiri Slaby @ 2006-12-24 13:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Pavel Machek, kernel list, marcel, maxk, bluez-devel

Andrew Morton wrote:
> On Sun, 24 Dec 2006 12:51:16 +0059
> Jiri Slaby <jirislaby@gmail.com> wrote:
> 
>> [   58.674780] EFLAGS: 00010046   (2.6.20-rc1-mm1 #207)
> 
> please, test 2.6.20-rc2.  We applied a fix for this.

It's working now.

thanks,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-23 23:55 ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Pavel Machek
  2006-12-24  0:01   ` Pavel Machek
  2006-12-24  1:18   ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Andrew Morton
@ 2006-12-24 14:39   ` Marcel Holtmann
  2006-12-24 23:36     ` Pavel Machek
  2006-12-24 23:43     ` Pavel Machek
  2 siblings, 2 replies; 19+ messages in thread
From: Marcel Holtmann @ 2006-12-24 14:39 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, kernel list, maxk, bluez-devel

Hi Pavel,

> > I got this nasty oops while playing with debugger. Not sure if that is
> > related; it also might be something with bluetooth; I already know it
> > corrupts memory during suspend, perhaps it corrupts memory in some
> > error path?
> 
> Okay, I spoke too soon. bluetooth & suspend memory corruption was
> _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> cycles... so it is probably unrelated to the previous crash.

can you try to reproduce this with 2.6.20-rc2 as well.

Regards

Marcel




* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-24  1:18   ` bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1) Andrew Morton
@ 2006-12-24 23:24     ` Pavel Machek
  0 siblings, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2006-12-24 23:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: kernel list, marcel, maxk, bluez-devel

Hi!

> > PM: Removing info for No Bus:usbdev3.15_ep81
> > PM: Removing info for No Bus:usbdev3.15_ep82
> > PM: Removing info for No Bus:usbdev3.15_ep02
> > slab error in verify_redzone_free(): cache `size-512': memory outside object was overwritten
> >  [<c016a1b8>] cache_free_debugcheck+0x128/0x1d0
> >  [<c04b58e3>] hci_usb_close+0xf3/0x160
> >  [<c016b530>] kfree+0x50/0xa0
> >  [<c04b58e3>] hci_usb_close+0xf3/0x160
> >  [<c04b59be>] hci_usb_disconnect+0x2e/0x90
> >  [<c0454f23>] usb_disable_interface+0x53/0x70
> >  [<c04576f8>] usb_unbind_interface+0x38/0x80
> >  [<c032f908>] __device_release_driver+0x68/0xb0
> >  [<c032fc3e>] device_release_driver+0x1e/0x40
> >  [<c032f1db>] bus_remove_device+0x8b/0xa0
> >  [<c032dbc9>] device_del+0x159/0x1c0
> >  [<c04559ad>] usb_disable_device+0x4d/0x100
> >  [<c044fe8a>] usb_disconnect+0x9a/0x110
> >  [<c0452405>] hub_thread+0x355/0xbd0
> >  [<c061426e>] schedule+0x2de/0x8f0
> >  [<c013c640>] autoremove_wake_function+0x0/0x50
> >  [<c04520b0>] hub_thread+0x0/0xbd0
> >  [<c013c58c>] kthread+0xec/0xf0
> >  [<c013c4a0>] kthread+0x0/0xf0
> >  [<c0103be7>] kernel_thread_helper+0x7/0x10
> >  =======================
> 
> yes, this one looks like memory scribblage in bluetooth.  The
> buffer.c assertion failure should now be fixed, please verify.

I can confirm buffer.c assertion to be fixed (yes, I was using gdb at
that time).
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-24 14:39   ` Marcel Holtmann
@ 2006-12-24 23:36     ` Pavel Machek
  2006-12-30 21:52       ` Adrian Bunk
  2006-12-24 23:43     ` Pavel Machek
  1 sibling, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2006-12-24 23:36 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: kernel list, Andrew Morton, maxk, bluez-devel

[-- Attachment #1: Type: text/plain, Size: 7401 bytes --]

On Sun 2006-12-24 15:39:23, Marcel Holtmann wrote:
> Hi Pavel,
> 
> > > I got this nasty oops while playing with debugger. Not sure if that is
> > > related; it also might be something with bluetooth; I already know it
> > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > error path?
> > 
> > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > cycles... so it is probably unrelated to the previous crash.
> 
> can you try to reproduce this with 2.6.20-rc2 as well.

Yep, here it is, reproduced on 6-th-or-so suspend.

bluetooth may need to be actively used in order for this to trigger;
connecting to the net over my cellphone seems to work okay.

(Full logs in attachment).

								Pavel

Linux version 2.6.20-rc2 (pavel@amd) (gcc version 4.0.4 20060507
(prerelease) (Debian 4.0.3-3)) #383 SMP Fri Dec 22 11:30:05 CET 2006
...
system 00:00: resuming
pnp 00:01: resuming
system 00:02: resuming
pnp 00:03: resuming
pnp 00:04: resuming
pnp 00:05: resuming
pnp 00:06: resuming
pnp 00:07: resuming
i8042 kbd 00:08: resuming
pnp: Device 00:08 does not support activation.
i8042 aux 00:09: resuming
pnp: Device 00:09 does not support activation.
pnp 00:0a: resuming
pnp 00:0b: resuming
platform bluetooth: resuming
pcspkr pcspkr: resuming
vesafb vesafb.0: resuming
serial8250 serial8250: resuming
usb usb1: resuming
usb usb3: resuming
ata2: SATA link down (SStatus 0 SControl 0)
ata3: SATA link down (SStatus 0 SControl 0)
ata4: SATA link down (SStatus 0 SControl 0)
hub 1-0:1.0: resuming
hub 3-0:1.0: resuming
i8042 i8042: resuming
atkbd serio0: resuming
psmouse serio1: resuming
usb usb4: resuming
usb usb5: resuming
hub 4-0:1.0: resuming
hub 5-0:1.0: resuming
usb usb2: resuming
hub 2-0:1.0: resuming
mmcblk mmc0:cc53: resuming
sd 0:0:0:0: resuming
usb 3-2: resuming
 usbdev3.8_ep00: PM: resume from 0, parent 3-2 still 2
usb 3-2:1.0: PM: resume from 2, parent 3-2 still 2
usb 3-2:1.0: resuming
 usbdev3.8_ep81: PM: resume from 0, parent 3-2:1.0 still 2
 usbdev3.8_ep02: PM: resume from 0, parent 3-2:1.0 still 2
 usbdev3.8_ep83: PM: resume from 0, parent 3-2:1.0 still 2
usb 3-1: resuming
 usbdev3.9_ep00: PM: resume from 0, parent 3-1 still 2
hci_usb 3-1:1.0: PM: resume from 2, parent 3-1 still 2
hci_usb 3-1:1.0: resuming
 hci0: PM: resume from 0, parent 3-1:1.0 still 2
 usbdev3.9_ep81: PM: resume from 0, parent 3-1:1.0 still 2
 usbdev3.9_ep82: PM: resume from 0, parent 3-1:1.0 still 2
 usbdev3.9_ep02: PM: resume from 0, parent 3-1:1.0 still 2
hci_usb 3-1:1.1: PM: resume from 2, parent 3-1 still 2
hci_usb 3-1:1.1: resuming
 usbdev3.9_ep83: PM: resume from 0, parent 3-1:1.1 still 2
 usbdev3.9_ep03: PM: resume from 0, parent 3-1:1.1 still 2
usb 3-1:1.2: PM: resume from 2, parent 3-1 still 2
usb 3-1:1.2: resuming
 usbdev3.9_ep84: PM: resume from 0, parent 3-1:1.2 still 2
 usbdev3.9_ep04: PM: resume from 0, parent 3-1:1.2 still 2
usb 3-1:1.3: PM: resume from 2, parent 3-1 still 2
usb 3-1:1.3: resuming
Restarting tasks ... <3>__tx_submit: hci0 tx submit failed urb f765d1bc type 2 err -19
usb 3-1: USB disconnect, address 9
PM: Removing info for No Bus:usbdev3.9_ep81
PM: Removing info for No Bus:usbdev3.9_ep82
PM: Removing info for No Bus:usbdev3.9_ep02
slab error in verify_redzone_free(): cache `size-512': memory outside object was overwritten
 [<c016a298>] cache_free_debugcheck+0x128/0x1d0
 [<c04b08d3>] hci_usb_close+0xf3/0x160
 [<c016b610>] kfree+0x50/0xa0
 [<c04b08d3>] hci_usb_close+0xf3/0x160
 [<c04b09ae>] hci_usb_disconnect+0x2e/0x90
 [<c044fed3>] usb_disable_interface+0x53/0x70
 [<c04526a8>] usb_unbind_interface+0x38/0x80
 [<c032a8b8>] __device_release_driver+0x68/0xb0
 [<c032abee>] device_release_driver+0x1e/0x40
 [<c032a18b>] bus_remove_device+0x8b/0xa0
 [<c0328b79>] device_del+0x159/0x1c0
 [<c045095d>] usb_disable_device+0x4d/0x100
 [<c044ae3a>] usb_disconnect+0x9a/0x110
 [<c044d3b5>] hub_thread+0x355/0xbd0
 [<c060f53e>] schedule+0x2de/0x8f0
 [<c013c680>] autoremove_wake_function+0x0/0x50
 [<c044d060>] hub_thread+0x0/0xbd0
 [<c013c5cc>] kthread+0xec/0xf0
 [<c013c4e0>] kthread+0x0/0xf0
 [<c0103be7>] kernel_thread_helper+0x7/0x10
 =======================
f70a2720: redzone 1:0x5a5a5a5a, redzone 2:0xc0545e9e.
------------[ cut here ]------------
kernel BUG at mm/slab.c:2878!
invalid opcode: 0000 [#1]
SMP 
Modules linked in:
CPU:    0
EIP:    0060:[<c016a322>]    Not tainted VLI
EFLAGS: 00010012   (2.6.20-rc2 #383)
EIP is at cache_free_debugcheck+0x1b2/0x1d0
eax: f70a271c   ebx: f70a20f8   ecx: 00052c00   edx: 0000020c
esi: c20df680   edi: f70a2720   ebp: 5a5a5a5a   esp: c2313e30
ds: 007b   es: 007b   ss: 0068
Process khubd (pid: 304, ti=c2312000 task=c2257030 task.ti=c2312000)
Stack: c06aedf0 f70a2720 5a5a5a5a c0545e9e c04b08d3 f70a20c0 c20df680 c20d9164 
       f70a2724 00000286 c016b610 f653e8d8 f653e8c4 c2134ba0 0000000c c04b08d3 
       c2134b5c c2134b8c f62e0a54 c2134ad0 00000001 c2134ad0 f62e0a54 c07dbee0 
Call Trace:
 [<c0545e9e>] sock_alloc_send_skb+0x16e/0x1c0
 [<c04b08d3>] hci_usb_close+0xf3/0x160
 [<c016b610>] kfree+0x50/0xa0
 [<c04b08d3>] hci_usb_close+0xf3/0x160
 [<c04b09ae>] hci_usb_disconnect+0x2e/0x90
 [<c044fed3>] usb_disable_interface+0x53/0x70
 [<c04526a8>] usb_unbind_interface+0x38/0x80
 [<c032a8b8>] __device_release_driver+0x68/0xb0
 [<c032abee>] device_release_driver+0x1e/0x40
 [<c032a18b>] bus_remove_device+0x8b/0xa0
 [<c0328b79>] device_del+0x159/0x1c0
 [<c045095d>] usb_disable_device+0x4d/0x100
 [<c044ae3a>] usb_disconnect+0x9a/0x110
 [<c044d3b5>] hub_thread+0x355/0xbd0
 [<c060f53e>] schedule+0x2de/0x8f0
 [<c013c680>] autoremove_wake_function+0x0/0x50
 [<c044d060>] hub_thread+0x0/0xbd0
 [<c013c5cc>] kthread+0xec/0xf0
 [<c013c4e0>] kthread+0x0/0xf0
 [<c0103be7>] kernel_thread_helper+0x7/0x10
 =======================
Code: f0 2c 5a 75 8b b9 05 df 6a c0 89 f2 b8 88 98 61 c0 e8 73 f4 ff ff eb 89 81 fb a5 c2 0f 17 0f 85 6c ff ff ff 90 8d 74 26 00 eb 8e <0f> 0b eb fe 0f 0b eb fe 8d b6 00 00 00 00 0f 0b eb fe 8b 52 0c 
EIP: [<c016a322>] cache_free_debugcheck+0x1b2/0x1d0 SS:ESP 0068:c2313e30
 <7>PM: Adding info for No Bus:vcs63
PM: Adding info for No Bus:vcsa63
PM: Removing info for No Bus:vcs63
PM: Removing info for No Bus:vcsa63
done.
Enabling non-boot CPUs ...
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3657.64 BogoMIPS (lpj=18288234)
CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000 0000c1a9 00000000 00000000
monitor/mwait feature present.
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: After all inits, caps: bfe9fbff 00100000 00000000 00002940 0000c1a9 00000000 00000000
CPU1: Intel Genuine Intel(R) CPU           T2400  @ 1.83GHz stepping 08
PM: Adding info for No Bus:msr1
CPU1 is up
ata1: waiting for device to spin up (8 secs)
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
SCSI device sda: 117210240 512-byte hdwr sectors (60012 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
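
The "memory outside object was overwritten" report in the log above comes from guard words placed around each slab object. A minimal user-space sketch of the idea follows. It is a simplified illustration, not the mm/slab.c implementation: the `guarded_*` helpers are hypothetical, and the guard constant is taken from the `81 fb a5 c2 0f 17` bytes in the Code: line (a compare against 0x170fc2a5). Neither value printed in the log (redzone 1: 0x5a5a5a5a, redzone 2: 0xc0545e9e) equals that constant.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Guard value written before and after the payload. */
#define RED_ACTIVE 0x170FC2A5u

/* Layout of one guarded allocation: [red1][payload ...][red2] */
static void *guarded_alloc(size_t size)
{
    uint32_t red = RED_ACTIVE;
    unsigned char *raw = malloc(size + 2 * sizeof red);

    if (!raw)
        return NULL;
    memcpy(raw, &red, sizeof red);                     /* leading guard  */
    memcpy(raw + sizeof red + size, &red, sizeof red); /* trailing guard */
    return raw + sizeof red;
}

/* Returns true if both guard words are still intact. */
static bool guarded_check(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    uint32_t red1, red2;

    memcpy(&red1, p - sizeof red1, sizeof red1);
    memcpy(&red2, p + size, sizeof red2);
    return red1 == RED_ACTIVE && red2 == RED_ACTIVE;
}

/* Releases the whole allocation, including the guard words. */
static void guarded_free(void *obj)
{
    if (obj)
        free((unsigned char *)obj - sizeof(uint32_t));
}
```

Writing a single byte past the end of a 512-byte payload changes the trailing guard, so `guarded_check()` returns false at free time. The check only shows that something wrote outside the object; it does not identify the writer.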


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: oops4.bz2 --]
[-- Type: application/octet-stream, Size: 15482 bytes --]


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-24 14:39   ` Marcel Holtmann
  2006-12-24 23:36     ` Pavel Machek
@ 2006-12-24 23:43     ` Pavel Machek
  2006-12-28  8:42       ` Pavel Machek
  1 sibling, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2006-12-24 23:43 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: Andrew Morton, kernel list, maxk, bluez-devel

Hi!

> > > I got this nasty oops while playing with debugger. Not sure if that is
> > > related; it also might be something with bluetooth; I already know it
> > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > error path?
> > 
> > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > cycles... so it is probably unrelated to the previous crash.
> 
> can you try to reproduce this with 2.6.20-rc2 as well.

(reproduced in another mail).

        _urb_queue_tail(__pending_q(husb, _urb->type), _urb);
        err = usb_submit_urb(urb, GFP_ATOMIC);
        if (err) {
                BT_ERR("%s tx submit failed urb %p type %d err %d",
                                husb->hdev->name, urb, _urb->type, err);
                _urb_unlink(_urb);

                ~~~~~~~~~~~~~~~~~~
	 	 Do we need to remove urb from pending_q here?

                _urb_queue_tail(__completed_q(husb, _urb->type), _urb);
        } else
                atomic_inc(__pending_tx(husb, _urb->type));
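
The question above comes down to one invariant: an entry must sit on at most one queue at a time, so it has to leave the pending queue before it is appended to the completed queue. A minimal sketch of that invariant follows, using hypothetical names (`struct node`, `queue_tail`, `unlink_node`, `submit_or_complete`). It is not the hci_usb code, and the thread does not confirm whether `_urb_unlink()` already detaches the urb from its current queue.

```c
#include <stddef.h>

struct queue;

/* List entry that remembers which queue currently owns it. */
struct node {
    struct node *prev, *next;
    struct queue *owner;
};

struct queue {
    struct node *head, *tail;
    size_t len;
};

/* Append n to q; n must not be on any queue when this is called. */
static void queue_tail(struct queue *q, struct node *n)
{
    n->prev = q->tail;
    n->next = NULL;
    if (q->tail)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    n->owner = q;
    q->len++;
}

/* Detach n from whichever queue owns it; no-op if it is unqueued. */
static void unlink_node(struct node *n)
{
    struct queue *q = n->owner;

    if (!q)
        return;
    if (n->prev)
        n->prev->next = n->next;
    else
        q->head = n->next;
    if (n->next)
        n->next->prev = n->prev;
    else
        q->tail = n->prev;
    n->prev = n->next = NULL;
    n->owner = NULL;
    q->len--;
}

/* Mirrors the shape of the quoted error path: queue, then move on failure. */
static int submit_or_complete(struct queue *pending, struct queue *completed,
                              struct node *n, int submit_err)
{
    queue_tail(pending, n);
    if (submit_err) {
        unlink_node(n);           /* must leave pending first */
        queue_tail(completed, n);
    }
    return submit_err;
}
```

If `unlink_node()` were skipped, the entry's `prev`/`next` pointers would be rewritten while the pending queue still referenced it, which is the kind of list corruption that can later surface as a slab error.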

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-24 23:43     ` Pavel Machek
@ 2006-12-28  8:42       ` Pavel Machek
  2006-12-28 10:40         ` Marcel Holtmann
  0 siblings, 1 reply; 19+ messages in thread
From: Pavel Machek @ 2006-12-28  8:42 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: Andrew Morton, kernel list, maxk, bluez-devel

Hi!

> > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > related; it also might be something with bluetooth; I already know it
> > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > error path?
> > > 
> > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > cycles... so it is probably unrelated to the previous crash.
> > 
> > can you try to reproduce this with 2.6.20-rc2 as well.
> 
> (reproduced in another mail).
> 
>         _urb_queue_tail(__pending_q(husb, _urb->type), _urb);
>         err = usb_submit_urb(urb, GFP_ATOMIC);
>         if (err) {
>                 BT_ERR("%s tx submit failed urb %p type %d err %d",
>                                 husb->hdev->name, urb, _urb->type, err);
>                 _urb_unlink(_urb);
> 
>                 ~~~~~~~~~~~~~~~~~~
> 	 	 Do we need to remove urb from pending_q here?
> 
>                 _urb_queue_tail(__completed_q(husb, _urb->type), _urb);
>         } else
>                 atomic_inc(__pending_tx(husb, _urb->type));
> 

Any news? Should I convert above idea to a patch? Or should I make
bluetooth suspend() routine return error so corruption is impossible
to hit?
						Pavel
-- 
Thanks for all the (sleeping) penguins.


* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-28  8:42       ` Pavel Machek
@ 2006-12-28 10:40         ` Marcel Holtmann
  0 siblings, 0 replies; 19+ messages in thread
From: Marcel Holtmann @ 2006-12-28 10:40 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, kernel list, maxk, bluez-devel

Hi Pavel,

> > > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > > related; it also might be something with bluetooth; I already know it
> > > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > > error path?
> > > > 
> > > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > > cycles... so it is probably unrelated to the previous crash.
> > > 
> > > can you try to reproduce this with 2.6.20-rc2 as well.
> > 
> > (reproduced in another mail).
> > 
> >         _urb_queue_tail(__pending_q(husb, _urb->type), _urb);
> >         err = usb_submit_urb(urb, GFP_ATOMIC);
> >         if (err) {
> >                 BT_ERR("%s tx submit failed urb %p type %d err %d",
> >                                 husb->hdev->name, urb, _urb->type, err);
> >                 _urb_unlink(_urb);
> > 
> >                 ~~~~~~~~~~~~~~~~~~
> > 	 	 Do we need to remove urb from pending_q here?
> > 
> >                 _urb_queue_tail(__completed_q(husb, _urb->type), _urb);
> >         } else
> >                 atomic_inc(__pending_tx(husb, _urb->type));
> > 
> 
> Any news? Should I convert above idea to a patch? Or should I make
> bluetooth suspend() routine return error so corruption is impossible
> to hit?

to be honest, I have no idea. This code is way too ugly anyway.

Regards

Marcel




* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-24 23:36     ` Pavel Machek
@ 2006-12-30 21:52       ` Adrian Bunk
  2007-01-01 19:01         ` Pavel Machek
  0 siblings, 1 reply; 19+ messages in thread
From: Adrian Bunk @ 2006-12-30 21:52 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Marcel Holtmann, kernel list, Andrew Morton, maxk, bluez-devel

On Mon, Dec 25, 2006 at 12:36:47AM +0100, Pavel Machek wrote:
> On Sun 2006-12-24 15:39:23, Marcel Holtmann wrote:
> > Hi Pavel,
> > 
> > > > I got this nasty oops while playing with debugger. Not sure if that is
> > > > related; it also might be something with bluetooth; I already know it
> > > > corrupts memory during suspend, perhaps it corrupts memory in some
> > > > error path?
> > > 
> > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > cycles... so it is probably unrelated to the previous crash.
> > 
> > can you try to reproduce this with 2.6.20-rc2 as well.
> 
> Yep, here it is, reproduced on 6-th-or-so suspend.
> 
> bluetooth may need to be actively used in order for this to trigger;
> connecting to the net over my cellphone seems to work okay.
> 
> (Full logs in attachment).

Is this issue also present in 2.6.19 or is it a regression?

> 								Pavel

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: bluetooth memory corruption (was Re: ext3-related crash in 2.6.20-rc1)
  2006-12-30 21:52       ` Adrian Bunk
@ 2007-01-01 19:01         ` Pavel Machek
  0 siblings, 0 replies; 19+ messages in thread
From: Pavel Machek @ 2007-01-01 19:01 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Marcel Holtmann, kernel list, Andrew Morton, maxk, bluez-devel

Hi!

> > > > Okay, I spoke too soon. bluetooth & suspend memory corruption was
> > > > _way_ harder to reproduce than expected. Took me 5-or-so-suspend
> > > > cycles... so it is probably unrelated to the previous crash.
> > > 
> > > can you try to reproduce this with 2.6.20-rc2 as well.
> > 
> > Yep, here it is, reproduced on 6-th-or-so suspend.
> > 
> > bluetooth may need to be actively used in order for this to trigger;
> > connecting to the net over my cellphone seems to work okay.
> > 
> > (Full logs in attachment).
> 
> Is this issue also present in 2.6.19 or is it a regression?

Not sure... but I know there were some bluetooth & suspend problems
before.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
