On Thu, Aug 16, 2012 at 08:48:25PM -0700, Greg KH wrote:
> On Fri, Aug 17, 2012 at 10:00:46AM +0800, Fengguang Wu wrote:
> > On Sun, Aug 05, 2012 at 09:58:26AM -0700, Greg KH wrote:
> > > On Sun, Aug 05, 2012 at 10:59:38AM +0800, Fengguang Wu wrote:
> > > > Hi all,
> > > >
> > > > This line triggers an oops in kvm boot test:
> > > >
> > > > usb_match_id():
> > > > ==> 748         for (; id->idVendor || id->idProduct || id->bDeviceClass ||
> > > >     749                id->bInterfaceClass || id->driver_info; id++) {
> > > >     750                 if (usb_match_one_id(interface, id))
> > > >     751                         return id;
> > > >     752         }
> > > >
> > > > It's an old bug and happens also in linux 3.0. It's very reproducible
> > > > for the attached config. I can send the initrd (yocto-minimal-i386.cgz)
> > > > on your request in private email.
> > >
> > > Odds are a driver without a terminating NULL for the device id list is
> > > causing this to fail.
> > >
> > > What devices are in the system and what drivers are trying to be bound?
> >
> > The last match is for: drivers/usb/misc/emi62.c
> >
> > Located down by Tianyu's debug patch:
> >
> > [    2.206708] usb_device_match: device 1-1:1.0, driver cytherm
> > [    2.207627] usb_device_match: device 1-1:1.0, driver emi62 - firmware loader
> > [    2.208769] BUG: unable to handle kernel paging request at c1f7478e
> > [    2.209726] IP: [] usb_match_id+0x5b/0xcd
> >
> > > --- a/drivers/usb/core/driver.c
> > > +++ b/drivers/usb/core/driver.c
> > > @@ -778,7 +778,8 @@ static int usb_device_match(struct device *dev, struct device_driver *drv)
> > >
> > >  		intf = to_usb_interface(dev);
> > >  		usb_drv = to_usb_driver(drv);
> > > -
> > > +
> > > +		pr_info("%s: device %s, driver %s \n", dev_name(dev), drv->name);
> > >  		id = usb_match_id(intf, usb_drv->id_table);
> > >  		if (id)
> > >  			return 1;
>
> Odd that it takes so long after you call that function for it to fail.
>
> And that driver has a proper termination, so we aren't walking off the
> end of the list, so it must be in the probe function itself.
>
> Care to add some more debugging for that driver?
>
> Also, I don't see the dev_info() message from the driver itself, in the
> probe function, so it must not be getting called properly.
>
> Something weird is happening...

Very true. I find it related to timing: after adding these debug
printks, it becomes very hard to reproduce. And I confirmed that
emi62_probe() is never called.

Anyway, I managed to get one oops. For comparison, one non-failing
dmesg for the same kernel is also attached.

--- tip.orig/drivers/usb/core/driver.c	2012-08-16 09:27:20.063315545 +0800
+++ tip/drivers/usb/core/driver.c	2012-08-17 12:27:32.441761637 +0800
@@ -747,6 +747,7 @@ const struct usb_device_id *usb_match_id
 	   device and interface. */
 	for (; id->idVendor || id->idProduct || id->bDeviceClass ||
 	       id->bInterfaceClass || id->driver_info; id++) {
+		printk(KERN_ERR "usb_match_id: id=%p idVendor=%#x idProduct=%#x\n", id, id->idVendor, id->idProduct);
 		if (usb_match_one_id(interface, id))
 			return id;
 	}
@@ -779,6 +780,7 @@ static int usb_device_match(struct devic
 		intf = to_usb_interface(dev);
 		usb_drv = to_usb_driver(drv);
 
+		pr_info("%s: device %s, driver %s \n", __func__, dev_name(dev), drv->name);
 		id = usb_match_id(intf, usb_drv->id_table);
 		if (id)
 			return 1;
--- tip.orig/drivers/usb/misc/emi62.c	2012-07-29 18:22:35.256996861 +0800
+++ tip/drivers/usb/misc/emi62.c	2012-08-17 12:46:49.761743138 +0800
@@ -242,9 +242,9 @@ MODULE_DEVICE_TABLE (usb, id_table);
 static int emi62_probe(struct usb_interface *intf, const struct usb_device_id *id)
 {
 	struct usb_device *dev = interface_to_usbdev(intf);
-	dev_dbg(&intf->dev, "emi62_probe\n");
+	dev_err(&intf->dev, "emi62_probe\n");
 
-	dev_info(&intf->dev, "%s start\n", __func__);
+	printk(KERN_ERR "%s start\n", __func__);
 
 	emi62_load_firmware(dev);

Thanks,
Fengguang