* Oops in usb-serial with keyspan adapter on current upstream @ 2009-05-18 0:07 Benjamin Herrenschmidt 2009-05-18 15:06 ` Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-18 0:07 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux Kernel list, linux-usb, Alan Stern Hi folks ! Current kernels give the oops below at boot with my Keyspan plugged, used to work fine on 2.6.28 at least. It looks to me like some kind of race between the disconnection after the FW load and the re-connect but I'm not sure. Cheers, Ben. usbcore: registered new interface driver usbserial USB Serial support registered for generic usbcore: registered new interface driver usbserial_generic usbserial: USB Serial Driver core USB Serial support registered for Keyspan - (without firmware) USB Serial support registered for Keyspan 1 port adapter USB Serial support registered for Keyspan 2 port adapter USB Serial support registered for Keyspan 4 port adapter keyspan 2-1:1.0: Keyspan - (without firmware) converter detected usb 2-1: firmware: using built-in firmware keyspan/usa49w.fw usb 2-1: USB disconnect, address 2 usbcore: registered new interface driver keyspan keyspan: v1.1.5:Keyspan USB to Serial Converter Driver Unable to handle kernel paging request for data at address 0x00000010 Faulting instruction address: 0xc0000000002da15c Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=4 PowerMac Modules linked in: snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd keyspan uninorth_agp usbserial agpgart soundcore NIP: c0000000002da15c LR: c0000000002d3760 CTR: c0000000002d35ec REGS: c00000017a4cb4c0 TRAP: 0300 Not tainted (2.6.30-rc6-00037-g8646010-dirty) MSR: 9000000000009032 <EE,ME,IR,DR> CR: 24000024 XER: 200fffff DAR: 0000000000000010, DSISR: 0000000040000000 TASK = c00000017a4890e0[191] 'khubd' THREAD: c00000017a4c8000 CPU: 2 GPR00: c000000000755f30 c00000017a4cb740 c000000000783d48 0000000000000001 GPR04: 0000000000000000 c0000001783fba90 0000000000000001 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 c0000000002d35ec GPR12: 0000000024000082 c0000000007b5680 0000000000000002 c0000001781c0a30 GPR16: 00000000000003e8 0000000000000000 c0000001781fd000 c0000001781c0a30 GPR20: 0000000000000000 0000000000000000 c0000001781fd800 0000000000000001 GPR24: c00000017a41a600 c0000001789ab480 c0000001783fb998 0000000000000000 GPR28: 0000000000000000 c00000017a4cb7b0 c00000000070f8d0 0000000000000000 NIP [c0000000002da15c] .release_nodes+0x54/0x254 LR [c0000000002d3760] .device_del+0x174/0x1e8 Call Trace: [c00000017a4cb740] [c00000017a4cb880] 0xc00000017a4cb880 (unreliable) [c00000017a4cb7f0] [c0000000002d3760] .device_del+0x174/0x1e8 [c00000017a4cb880] [d0000000000b14dc] .usb_serial_disconnect+0xf8/0x1d0 [usbserial] [c00000017a4cb930] [c0000000003a9280] .usb_unbind_interface+0x7c/0x138 [c00000017a4cb9d0] [c0000000002d6acc] .__device_release_driver+0xb8/0x100 [c00000017a4cba60] [c0000000002d6c80] .device_release_driver+0x30/0x54 [c00000017a4cbaf0] [c0000000002d5d18] .bus_remove_device+0xe4/0x124 [c00000017a4cbb80] [c0000000002d3754] .device_del+0x168/0x1e8 [c00000017a4cbc10] [c0000000003a643c] .usb_disable_device+0xa4/0x15c [c00000017a4cbcb0] [c0000000003a0a5c] .usb_disconnect+0xc4/0x180 [c00000017a4cbd60] [c0000000003a1b44] .hub_thread+0x594/0x101c [c00000017a4cbf00] [c000000000065b98] .kthread+0x80/0xcc [c00000017a4cbf90] [c00000000001fef0] .kernel_thread+0x54/0x70 Instruction dump: ebc2a308 7c7a1b78 7c872378 39000000 3b800000 3be00000 38010070 f8010070 f8010078 7c1d0378 48000070 e81e8000 <e96a0010> e8840000 7fab0000 419e0014 ---[ end trace 3ce00127d362ccb2 ]--- ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-18 0:07 Oops in usb-serial with keyspan adapter on current upstream Benjamin Herrenschmidt @ 2009-05-18 15:06 ` Alan Stern 2009-05-19 1:05 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-05-18 15:06 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Mon, 18 May 2009, Benjamin Herrenschmidt wrote: > Hi folks ! > > Current kernels give the oops below at boot with my Keyspan plugged, > used to work fine on 2.6.28 at least. Does your test kernel include commit 2d93148ab6988cad872e65d694c95e8944e1b626? > It looks to me like some kind of race between the disconnection after > the FW load and the re-connect but I'm not sure. I don't think that's quite right. Your trace shows the oops occurred during the disconnect processing, which means the re-connect had not yet started. > Cheers, > Ben. > > usbcore: registered new interface driver usbserial > USB Serial support registered for generic > usbcore: registered new interface driver usbserial_generic > usbserial: USB Serial Driver core > USB Serial support registered for Keyspan - (without firmware) > USB Serial support registered for Keyspan 1 port adapter > USB Serial support registered for Keyspan 2 port adapter > USB Serial support registered for Keyspan 4 port adapter > keyspan 2-1:1.0: Keyspan - (without firmware) converter detected > usb 2-1: firmware: using built-in firmware keyspan/usa49w.fw > usb 2-1: USB disconnect, address 2 > usbcore: registered new interface driver keyspan > keyspan: v1.1.5:Keyspan USB to Serial Converter Driver > Unable to handle kernel paging request for data at address 0x00000010 > Faulting instruction address: 0xc0000000002da15c > Oops: Kernel access of bad area, sig: 11 [#1] > SMP NR_CPUS=4 PowerMac > Modules linked in: snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd keyspan uninorth_agp usbserial agpgart soundcore > NIP: c0000000002da15c LR: c0000000002d3760 CTR: c0000000002d35ec > REGS: c00000017a4cb4c0 TRAP: 0300 Not tainted (2.6.30-rc6-00037-g8646010-dirty) > MSR: 9000000000009032 <EE,ME,IR,DR> CR: 24000024 XER: 200fffff > DAR: 0000000000000010, DSISR: 0000000040000000 > TASK = c00000017a4890e0[191] 'khubd' THREAD: c00000017a4c8000 CPU: 2 > GPR00: c000000000755f30 c00000017a4cb740 c000000000783d48 0000000000000001 > GPR04: 0000000000000000 c0000001783fba90 0000000000000001 0000000000000000 > GPR08: 0000000000000000 0000000000000000 0000000000000000 c0000000002d35ec > GPR12: 0000000024000082 c0000000007b5680 0000000000000002 c0000001781c0a30 > GPR16: 00000000000003e8 0000000000000000 c0000001781fd000 c0000001781c0a30 > GPR20: 0000000000000000 0000000000000000 c0000001781fd800 0000000000000001 > GPR24: c00000017a41a600 c0000001789ab480 c0000001783fb998 0000000000000000 > GPR28: 0000000000000000 c00000017a4cb7b0 c00000000070f8d0 0000000000000000 > NIP [c0000000002da15c] .release_nodes+0x54/0x254 > LR [c0000000002d3760] .device_del+0x174/0x1e8 > Call Trace: > [c00000017a4cb740] [c00000017a4cb880] 0xc00000017a4cb880 (unreliable) > [c00000017a4cb7f0] [c0000000002d3760] .device_del+0x174/0x1e8 Is it possible to pin this down to a single bad data access or instruction? Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-18 15:06 ` Alan Stern @ 2009-05-19 1:05 ` Benjamin Herrenschmidt 2009-05-19 1:49 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-19 1:05 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Mon, 2009-05-18 at 11:06 -0400, Alan Stern wrote: > On Mon, 18 May 2009, Benjamin Herrenschmidt wrote: > > > Hi folks ! > > > > Current kernels give the oops below at boot with my Keyspan plugged, > > used to work fine on 2.6.28 at least. > > Does your test kernel include commit > 2d93148ab6988cad872e65d694c95e8944e1b626? Yes, it appears so. > > It looks to me like some kind of race between the disconnection after > > the FW load and the re-connect but I'm not sure. > > I don't think that's quite right. Your trace shows the oops occurred > during the disconnect processing, which means the re-connect had not > yet started. Right. It shows that we re-enter device_del in fact which sounds fishy unless it's for a different device. I'll get more data for you. Cheers, Ben. > > Cheers, > > Ben. > > > > usbcore: registered new interface driver usbserial > > USB Serial support registered for generic > > usbcore: registered new interface driver usbserial_generic > > usbserial: USB Serial Driver core > > USB Serial support registered for Keyspan - (without firmware) > > USB Serial support registered for Keyspan 1 port adapter > > USB Serial support registered for Keyspan 2 port adapter > > USB Serial support registered for Keyspan 4 port adapter > > keyspan 2-1:1.0: Keyspan - (without firmware) converter detected > > usb 2-1: firmware: using built-in firmware keyspan/usa49w.fw > > usb 2-1: USB disconnect, address 2 > > usbcore: registered new interface driver keyspan > > keyspan: v1.1.5:Keyspan USB to Serial Converter Driver > > Unable to handle kernel paging request for data at address 0x00000010 > > Faulting instruction address: 0xc0000000002da15c > > Oops: Kernel access of bad area, sig: 11 [#1] > > SMP NR_CPUS=4 PowerMac > > Modules linked in: snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd keyspan uninorth_agp usbserial agpgart soundcore > > NIP: c0000000002da15c LR: c0000000002d3760 CTR: c0000000002d35ec > > REGS: c00000017a4cb4c0 TRAP: 0300 Not tainted (2.6.30-rc6-00037-g8646010-dirty) > > MSR: 9000000000009032 <EE,ME,IR,DR> CR: 24000024 XER: 200fffff > > DAR: 0000000000000010, DSISR: 0000000040000000 > > TASK = c00000017a4890e0[191] 'khubd' THREAD: c00000017a4c8000 CPU: 2 > > GPR00: c000000000755f30 c00000017a4cb740 c000000000783d48 0000000000000001 > > GPR04: 0000000000000000 c0000001783fba90 0000000000000001 0000000000000000 > > GPR08: 0000000000000000 0000000000000000 0000000000000000 c0000000002d35ec > > GPR12: 0000000024000082 c0000000007b5680 0000000000000002 c0000001781c0a30 > > GPR16: 00000000000003e8 0000000000000000 c0000001781fd000 c0000001781c0a30 > > GPR20: 0000000000000000 0000000000000000 c0000001781fd800 0000000000000001 > > GPR24: c00000017a41a600 c0000001789ab480 c0000001783fb998 0000000000000000 > > GPR28: 0000000000000000 c00000017a4cb7b0 c00000000070f8d0 0000000000000000 > > NIP [c0000000002da15c] .release_nodes+0x54/0x254 > > LR [c0000000002d3760] .device_del+0x174/0x1e8 > > Call Trace: > > [c00000017a4cb740] [c00000017a4cb880] 0xc00000017a4cb880 (unreliable) > > [c00000017a4cb7f0] [c0000000002d3760] .device_del+0x174/0x1e8 > > Is it possible to pin this down to a single bad data access or > instruction? > > Alan Stern > > -- > To unsubscribe from this list: send the line "unsubscribe linux-usb" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-19 1:05 ` Benjamin Herrenschmidt @ 2009-05-19 1:49 ` Benjamin Herrenschmidt 2009-05-19 14:39 ` Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-19 1:49 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Tue, 2009-05-19 at 11:05 +1000, Benjamin Herrenschmidt wrote: > On Mon, 2009-05-18 at 11:06 -0400, Alan Stern wrote: > > On Mon, 18 May 2009, Benjamin Herrenschmidt wrote: > > > > > Hi folks ! > > > > > > Current kernels give the oops below at boot with my Keyspan plugged, > > > used to work fine on 2.6.28 at least. > > > > Does your test kernel include commit > > 2d93148ab6988cad872e65d694c95e8944e1b626? > > Yes, it appears so. So I can still reproduce it with today checkout (363383277081ce831642b72df40932ee05ce40a2). Unable to handle kernel paging request for data at address 0x00000010 Faulting instruction address: 0xc000000000301e0c Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=4 PowerMac Modules linked in: keyspan usbserial NIP: c000000000301e0c LR: c0000000002fb410 CTR: c0000000002fb29c REGS: c00000015a4e34c0 TRAP: 0300 Not tainted (2.6.30-rc6-00065-g3633832) MSR: 9000000000009032 <EE,ME,IR,DR> CR: 24000024 XER: 20000000 DAR: 0000000000000010, DSISR: 0000000040000000 TASK = c00000015a494950[170] 'khubd' THREAD: c00000015a4e0000 CPU: 1 GPR00: c0000000008475e8 c00000015a4e3740 c000000000875430 0000000000000001 GPR04: 0000000000000000 c0000001581e2290 0000000000000001 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 c0000000002fb29c GPR12: 0000000024000082 c0000000008a9480 0000000000000002 c0000001581dd230 GPR16: 00000000000003e8 0000000000000001 c0000001580b0800 c0000001581dd230 GPR20: 0000000000000000 0000000000000001 c0000001580b1800 0000000000000002 GPR24: c000000158226c00 c00000015811a780 c0000001581e2198 0000000000000000 GPR28: 0000000000000000 c00000015a4e37b0 c0000000007fede8 0000000000000000 NIP [c000000000301e0c] .release_nodes+0x54/0x254 LR [c0000000002fb410] .device_del+0x174/0x1e8 Call Trace: [c00000015a4e3740] [c00000015a4e3880] 0xc00000015a4e3880 (unreliable) [c00000015a4e37f0] [c0000000002fb410] .device_del+0x174/0x1e8 [c00000015a4e3880] [d0000000005454dc] .usb_serial_disconnect+0xf8/0x1d0 [usbserial] [c00000015a4e3930] [c0000000003d1ed0] .usb_unbind_interface+0x7c/0x138 [c00000015a4e39d0] [c0000000002fe77c] .__device_release_driver+0xb8/0x100 [c00000015a4e3a60] [c0000000002fe930] .device_release_driver+0x30/0x54 [c00000015a4e3af0] [c0000000002fd9c8] .bus_remove_device+0xe4/0x124 [c00000015a4e3b80] [c0000000002fb404] .device_del+0x168/0x1e8 [c00000015a4e3c10] [c0000000003cf08c] .usb_disable_device+0xa4/0x15c [c00000015a4e3cb0] [c0000000003c96ac] .usb_disconnect+0xc4/0x180 [c00000015a4e3d60] [c0000000003ca794] .hub_thread+0x594/0x101c [c00000015a4e3f00] [c0000000000672b0] .kthread+0x80/0xcc [c00000015a4e3f90] [c00000000001fef0] .kernel_thread+0x54/0x70 Instruction dump: ebc2a398 7c7a1b78 7c872378 39000000 3b800000 3be00000 38010070 f8010070 f8010078 7c1d0378 48000070 e81e8000 <e96a0010> e8840000 7fab0000 419e0014 ---[ end trace f9d8f1b68a9ea04d ]--- Pretty clearly a NULL dereference inside release_nodes() called by device_del(). After a bit of disassembly, it looks like the offending dereference is node->release (ie, node_to_group) called with a NULL node, so basically we got to a situation where cur == NULL in the first loop inside of remove_nodes() and it thus blows on grp = node_to_group(node); It also looks like first is NULL but it's unclear whether it is such as the result of first = first->next or the function was called that way, though I could try to figure it out with added debugging. It -does- seem that "cnt" might be 1 at the time of the crash. Do you need more data ? Cheers, Ben. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-19 1:49 ` Benjamin Herrenschmidt @ 2009-05-19 14:39 ` Alan Stern 2009-05-25 6:43 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-05-19 14:39 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Tue, 19 May 2009, Benjamin Herrenschmidt wrote: > So I can still reproduce it with today checkout > (363383277081ce831642b72df40932ee05ce40a2). > > Unable to handle kernel paging request for data at address 0x00000010 > Faulting instruction address: 0xc000000000301e0c > Oops: Kernel access of bad area, sig: 11 [#1] > SMP NR_CPUS=4 PowerMac > Modules linked in: keyspan usbserial > NIP: c000000000301e0c LR: c0000000002fb410 CTR: c0000000002fb29c > REGS: c00000015a4e34c0 TRAP: 0300 Not tainted (2.6.30-rc6-00065-g3633832) > MSR: 9000000000009032 <EE,ME,IR,DR> CR: 24000024 XER: 20000000 > DAR: 0000000000000010, DSISR: 0000000040000000 > TASK = c00000015a494950[170] 'khubd' THREAD: c00000015a4e0000 CPU: 1 > GPR00: c0000000008475e8 c00000015a4e3740 c000000000875430 0000000000000001 > GPR04: 0000000000000000 c0000001581e2290 0000000000000001 0000000000000000 > GPR08: 0000000000000000 0000000000000000 0000000000000000 c0000000002fb29c > GPR12: 0000000024000082 c0000000008a9480 0000000000000002 c0000001581dd230 > GPR16: 00000000000003e8 0000000000000001 c0000001580b0800 c0000001581dd230 > GPR20: 0000000000000000 0000000000000001 c0000001580b1800 0000000000000002 > GPR24: c000000158226c00 c00000015811a780 c0000001581e2198 0000000000000000 > GPR28: 0000000000000000 c00000015a4e37b0 c0000000007fede8 0000000000000000 > NIP [c000000000301e0c] .release_nodes+0x54/0x254 > LR [c0000000002fb410] .device_del+0x174/0x1e8 > Call Trace: > [c00000015a4e3740] [c00000015a4e3880] 0xc00000015a4e3880 (unreliable) > [c00000015a4e37f0] [c0000000002fb410] .device_del+0x174/0x1e8 > [c00000015a4e3880] [d0000000005454dc] .usb_serial_disconnect+0xf8/0x1d0 [usbserial] > [c00000015a4e3930] [c0000000003d1ed0] .usb_unbind_interface+0x7c/0x138 > [c00000015a4e39d0] [c0000000002fe77c] .__device_release_driver+0xb8/0x100 > [c00000015a4e3a60] [c0000000002fe930] .device_release_driver+0x30/0x54 > [c00000015a4e3af0] [c0000000002fd9c8] .bus_remove_device+0xe4/0x124 > [c00000015a4e3b80] [c0000000002fb404] .device_del+0x168/0x1e8 > [c00000015a4e3c10] [c0000000003cf08c] .usb_disable_device+0xa4/0x15c > [c00000015a4e3cb0] [c0000000003c96ac] .usb_disconnect+0xc4/0x180 > [c00000015a4e3d60] [c0000000003ca794] .hub_thread+0x594/0x101c > [c00000015a4e3f00] [c0000000000672b0] .kthread+0x80/0xcc > [c00000015a4e3f90] [c00000000001fef0] .kernel_thread+0x54/0x70 > Instruction dump: > ebc2a398 7c7a1b78 7c872378 39000000 3b800000 3be00000 38010070 f8010070 > f8010078 7c1d0378 48000070 e81e8000 <e96a0010> e8840000 7fab0000 419e0014 > ---[ end trace f9d8f1b68a9ea04d ]--- > > Pretty clearly a NULL dereference inside release_nodes() called by > device_del(). After a bit of disassembly, it looks like the offending > dereference is node->release (ie, node_to_group) called with a NULL node, > so basically we got to a situation where cur == NULL in the first loop > inside of remove_nodes() and it thus blows on > > grp = node_to_group(node); > > It also looks like first is NULL but it's unclear whether it is such > as the result of first = first->next or the function was called that > way, though I could try to figure it out with added debugging. It -does- > seem that "cnt" might be 1 at the time of the crash. It looks like dev->devres has been corrupted somehow. As far as I can tell, the keyspan driver doesn't touch it at all. In fact, nothing in drivers/usb/serial does. > Do you need more data ? The inner device_del() call comes from usb_serial_disconnect(). You could try adding some debugging there, to find out if dev->devres has been tampered with. Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-19 14:39 ` Alan Stern @ 2009-05-25 6:43 ` Benjamin Herrenschmidt 2009-05-25 15:17 ` Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-25 6:43 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb Found it. Patch below. usb-serial: Fix crash when sub-driver attach() returns positive value This fixes a crash in usb-serial that typically happens with keyspan USB devices, though it would happen potentially with anything that returns a positive value from the subdriver's attach() method to indicate that a FW was loaded and the device will disconnect and reconnect. What happens is that we haven't yet initialized the struct device embedded inside the struct usb_serial_port when we call "exit:" and return from probe(). Later, when we get the disconnect() call, usb_serial_disconnect() tries to do a device_del() and put_device() on all the ports, despite the fact that in this case, the struct device wasn't initialized. This causes the device core to crash (right, it should be more robust). This works around it by using the device->bus pointer as an indication as to whether those structures are initialized. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- Index: linux-work/drivers/usb/serial/usb-serial.c =================================================================== --- linux-work.orig/drivers/usb/serial/usb-serial.c 2009-05-25 16:25:10.000000000 +1000 +++ linux-work/drivers/usb/serial/usb-serial.c 2009-05-25 16:26:22.000000000 +1000 @@ -1075,14 +1075,19 @@ void usb_serial_disconnect(struct usb_in } kill_traffic(port); cancel_work_sync(&port->work); - device_del(&port->dev); + /* port->dev may not be initialized yet. We check that + * by testing if the "bus" pointer is NULL + */ + if (port->dev.bus) + device_del(&port->dev); } } serial->type->shutdown(serial); for (i = 0; i < serial->num_ports; ++i) { port = serial->port[i]; if (port) { - put_device(&port->dev); + if (port->dev.bus) + put_device(&port->dev); serial->port[i] = NULL; } } On Tue, 2009-05-19 at 10:39 -0400, Alan Stern wrote: > On Tue, 19 May 2009, Benjamin Herrenschmidt wrote: > > > So I can still reproduce it with today checkout > > (363383277081ce831642b72df40932ee05ce40a2). > > > > Unable to handle kernel paging request for data at address 0x00000010 > > Faulting instruction address: 0xc000000000301e0c > > Oops: Kernel access of bad area, sig: 11 [#1] > > SMP NR_CPUS=4 PowerMac > > Modules linked in: keyspan usbserial > > NIP: c000000000301e0c LR: c0000000002fb410 CTR: c0000000002fb29c > > REGS: c00000015a4e34c0 TRAP: 0300 Not tainted (2.6.30-rc6-00065-g3633832) > > MSR: 9000000000009032 <EE,ME,IR,DR> CR: 24000024 XER: 20000000 > > DAR: 0000000000000010, DSISR: 0000000040000000 > > TASK = c00000015a494950[170] 'khubd' THREAD: c00000015a4e0000 CPU: 1 > > GPR00: c0000000008475e8 c00000015a4e3740 c000000000875430 0000000000000001 > > GPR04: 0000000000000000 c0000001581e2290 0000000000000001 0000000000000000 > > GPR08: 0000000000000000 0000000000000000 0000000000000000 c0000000002fb29c > > GPR12: 0000000024000082 c0000000008a9480 0000000000000002 c0000001581dd230 > > GPR16: 00000000000003e8 0000000000000001 c0000001580b0800 c0000001581dd230 > > GPR20: 0000000000000000 0000000000000001 c0000001580b1800 0000000000000002 > > GPR24: c000000158226c00 c00000015811a780 c0000001581e2198 0000000000000000 > > GPR28: 0000000000000000 c00000015a4e37b0 c0000000007fede8 0000000000000000 > > NIP [c000000000301e0c] .release_nodes+0x54/0x254 > > LR [c0000000002fb410] .device_del+0x174/0x1e8 > > Call Trace: > > [c00000015a4e3740] [c00000015a4e3880] 0xc00000015a4e3880 (unreliable) > > [c00000015a4e37f0] [c0000000002fb410] .device_del+0x174/0x1e8 > > [c00000015a4e3880] [d0000000005454dc] .usb_serial_disconnect+0xf8/0x1d0 [usbserial] > > [c00000015a4e3930] [c0000000003d1ed0] .usb_unbind_interface+0x7c/0x138 > > [c00000015a4e39d0] [c0000000002fe77c] .__device_release_driver+0xb8/0x100 > > [c00000015a4e3a60] [c0000000002fe930] .device_release_driver+0x30/0x54 > > [c00000015a4e3af0] [c0000000002fd9c8] .bus_remove_device+0xe4/0x124 > > [c00000015a4e3b80] [c0000000002fb404] .device_del+0x168/0x1e8 > > [c00000015a4e3c10] [c0000000003cf08c] .usb_disable_device+0xa4/0x15c > > [c00000015a4e3cb0] [c0000000003c96ac] .usb_disconnect+0xc4/0x180 > > [c00000015a4e3d60] [c0000000003ca794] .hub_thread+0x594/0x101c > > [c00000015a4e3f00] [c0000000000672b0] .kthread+0x80/0xcc > > [c00000015a4e3f90] [c00000000001fef0] .kernel_thread+0x54/0x70 > > Instruction dump: > > ebc2a398 7c7a1b78 7c872378 39000000 3b800000 3be00000 38010070 f8010070 > > f8010078 7c1d0378 48000070 e81e8000 <e96a0010> e8840000 7fab0000 419e0014 > > ---[ end trace f9d8f1b68a9ea04d ]--- > > > > Pretty clearly a NULL dereference inside release_nodes() called by > > device_del(). After a bit of disassembly, it looks like the offending > > dereference is node->release (ie, node_to_group) called with a NULL node, > > so basically we got to a situation where cur == NULL in the first loop > > inside of remove_nodes() and it thus blows on > > > > grp = node_to_group(node); > > > > It also looks like first is NULL but it's unclear whether it is such > > as the result of first = first->next or the function was called that > > way, though I could try to figure it out with added debugging. It -does- > > seem that "cnt" might be 1 at the time of the crash. > > It looks like dev->devres has been corrupted somehow. As far as I can > tell, the keyspan driver doesn't touch it at all. In fact, nothing in > drivers/usb/serial does. > > > Do you need more data ? > > The inner device_del() call comes from usb_serial_disconnect(). You > could try adding some debugging there, to find out if dev->devres has > been tampered with. > > Alan Stern > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-25 6:43 ` Benjamin Herrenschmidt @ 2009-05-25 15:17 ` Alan Stern 2009-05-25 22:00 ` Benjamin Herrenschmidt 2009-05-27 3:30 ` Benjamin Herrenschmidt 0 siblings, 2 replies; 13+ messages in thread From: Alan Stern @ 2009-05-25 15:17 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Mon, 25 May 2009, Benjamin Herrenschmidt wrote: > Found it. Patch below. > > usb-serial: Fix crash when sub-driver attach() returns positive value > > This fixes a crash in usb-serial that typically happens with keyspan USB > devices, though it would happen potentially with anything that returns > a positive value from the subdriver's attach() method to indicate that > a FW was loaded and the device will disconnect and reconnect. > > What happens is that we haven't yet initialized the struct device embedded > inside the struct usb_serial_port when we call "exit:" and return > from probe(). > > Later, when we get the disconnect() call, usb_serial_disconnect() tries > to do a device_del() and put_device() on all the ports, despite the fact > that in this case, the struct device wasn't initialized. This causes the > device core to crash (right, it should be more robust). Very clever. Does this simpler patch also fix the problem? Alan Stern Index: usb-2.6/drivers/usb/serial/usb-serial.c =================================================================== --- usb-2.6.orig/drivers/usb/serial/usb-serial.c +++ usb-2.6/drivers/usb/serial/usb-serial.c @@ -974,6 +974,7 @@ int usb_serial_probe(struct usb_interfac if (retval > 0) { /* quietly accept this device, but don't bind to a serial port as it's about to disappear */ + serial->num_ports = 0; goto exit; } } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-25 15:17 ` Alan Stern @ 2009-05-25 22:00 ` Benjamin Herrenschmidt 2009-05-26 14:06 ` Alan Stern 2009-05-27 3:30 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-25 22:00 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Mon, 2009-05-25 at 11:17 -0400, Alan Stern wrote: > On Mon, 25 May 2009, Benjamin Herrenschmidt wrote: > > > Found it. Patch below. > > > > usb-serial: Fix crash when sub-driver attach() returns positive value > > > > This fixes a crash in usb-serial that typically happens with keyspan USB > > devices, though it would happen potentially with anything that returns > > a positive value from the subdriver's attach() method to indicate that > > a FW was loaded and the device will disconnect and reconnect. > > > > What happens is that we haven't yet initialized the struct device embedded > > inside the struct usb_serial_port when we call "exit:" and return > > from probe(). > > > > Later, when we get the disconnect() call, usb_serial_disconnect() tries > > to do a device_del() and put_device() on all the ports, despite the fact > > that in this case, the struct device wasn't initialized. This causes the > > device core to crash (right, it should be more robust). > > Very clever. Does this simpler patch also fix the problem? Well, both our patches are bogus in the sense that they will leak the stuff in the port structure. The right patch would be I think something like that (typing straight in the mailer, I can do a "proper" patch later): > Index: usb-2.6/drivers/usb/serial/usb-serial.c > =================================================================== > --- usb-2.6.orig/drivers/usb/serial/usb-serial.c > +++ usb-2.6/drivers/usb/serial/usb-serial.c > @@ -974,6 +974,7 @@ int usb_serial_probe(struct usb_interfac > if (retval > 0) { > /* quietly accept this device, but don't bind to a > serial port as it's about to disappear */ > + for (i = 0; i < serial->num_ports; i++) { > + port_free(serial->port[i]); > + serial_port[i] = NULL; > + } > + serial->num_ports = 0; > goto exit; > } > } Cheers, Ben. > -- > To unsubscribe from this list: send the line "unsubscribe linux-usb" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-25 22:00 ` Benjamin Herrenschmidt @ 2009-05-26 14:06 ` Alan Stern 2009-05-26 22:28 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-05-26 14:06 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Tue, 26 May 2009, Benjamin Herrenschmidt wrote: > Well, both our patches are bogus in the sense that they will leak > the stuff in the port structure. That's not true at all. Take a look at the destroy_serial() routine. Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-26 14:06 ` Alan Stern @ 2009-05-26 22:28 ` Benjamin Herrenschmidt 2009-05-27 2:13 ` Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-26 22:28 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Tue, 2009-05-26 at 10:06 -0400, Alan Stern wrote: > On Tue, 26 May 2009, Benjamin Herrenschmidt wrote: > > > Well, both our patches are bogus in the sense that they will leak > > the stuff in the port structure. > > That's not true at all. Take a look at the destroy_serial() routine. Right... hrm. This is an horrible mess, I can't wait to see that refcount going wrong and things trying to double free or leaking the structure. Oh well, fix it the way your want, I don't care as long as it's fixed. This driver is just plain horrible. Ben. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-26 22:28 ` Benjamin Herrenschmidt @ 2009-05-27 2:13 ` Alan Stern 2009-05-27 3:30 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-05-27 2:13 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb On Wed, 27 May 2009, Benjamin Herrenschmidt wrote: > On Tue, 2009-05-26 at 10:06 -0400, Alan Stern wrote: > > On Tue, 26 May 2009, Benjamin Herrenschmidt wrote: > > > > > Well, both our patches are bogus in the sense that they will leak > > > the stuff in the port structure. > > > > That's not true at all. Take a look at the destroy_serial() routine. > > Right... hrm. This is an horrible mess, I can't wait to see that > refcount going wrong and things trying to double free or leaking the > structure. > > Oh well, fix it the way your want, I don't care as long as it's fixed. > This driver is just plain horrible. I can't disagree with that. :-) The usb-serial design was a mess to begin with, and it didn't get much better when the serial bus was introduced. I'm inclined to use my patch because it's shortest. Can I add your "Tested-by:"? Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-27 2:13 ` Alan Stern @ 2009-05-27 3:30 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-27 3:30 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb > I can't disagree with that. :-) The usb-serial design was a mess to > begin with, and it didn't get much better when the serial bus was > introduced. > > I'm inclined to use my patch because it's shortest. Can I add your > "Tested-by:"? Just tested it. I'll send that as a reply to the patch itself. Cheers, Ben. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Oops in usb-serial with keyspan adapter on current upstream 2009-05-25 15:17 ` Alan Stern 2009-05-25 22:00 ` Benjamin Herrenschmidt @ 2009-05-27 3:30 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 13+ messages in thread From: Benjamin Herrenschmidt @ 2009-05-27 3:30 UTC (permalink / raw) To: Alan Stern; +Cc: Rafael J. Wysocki, Linux Kernel list, linux-usb > Very clever. Does this simpler patch also fix the problem? Yes it does, thanks. Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- > > Index: usb-2.6/drivers/usb/serial/usb-serial.c > =================================================================== > --- usb-2.6.orig/drivers/usb/serial/usb-serial.c > +++ usb-2.6/drivers/usb/serial/usb-serial.c > @@ -974,6 +974,7 @@ int usb_serial_probe(struct usb_interfac > if (retval > 0) { > /* quietly accept this device, but don't bind to a > serial port as it's about to disappear */ > + serial->num_ports = 0; > goto exit; > } > } > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2009-05-27 3:30 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-05-18 0:07 Oops in usb-serial with keyspan adapter on current upstream Benjamin Herrenschmidt 2009-05-18 15:06 ` Alan Stern 2009-05-19 1:05 ` Benjamin Herrenschmidt 2009-05-19 1:49 ` Benjamin Herrenschmidt 2009-05-19 14:39 ` Alan Stern 2009-05-25 6:43 ` Benjamin Herrenschmidt 2009-05-25 15:17 ` Alan Stern 2009-05-25 22:00 ` Benjamin Herrenschmidt 2009-05-26 14:06 ` Alan Stern 2009-05-26 22:28 ` Benjamin Herrenschmidt 2009-05-27 2:13 ` Alan Stern 2009-05-27 3:30 ` Benjamin Herrenschmidt 2009-05-27 3:30 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).