* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website [not found] ` <317435358.100327.1350822615555.JavaMail.mail@webmail20> @ 2012-10-21 14:21 ` Daniel Mack 2012-10-21 14:57 ` Artem S. Tashkinov 0 siblings, 1 reply; 15+ messages in thread From: Daniel Mack @ 2012-10-21 14:21 UTC (permalink / raw) To: Artem S. Tashkinov Cc: bp, pavel, linux-kernel, netdev@vger.kernel.org, security, linux-media, linux-usb, alsa-devel [-- Attachment #1: Type: text/plain, Size: 1906 bytes --] [Cc: alsa-devel] On 21.10.2012 14:30, Artem S. Tashkinov wrote: > On Oct 21, 2012, Daniel Mack wrote: > >> A hint at least. How did you enable the audio record exactly? Can you >> reproduce this with arecord? >> >> What chipset are you on? Please provide both "lspci -v" and "lsusb -v" >> dumps. As I said, I fail to reproduce that issue on any of my machines. > > All other applications can read from the USB audio without problems, it's > just something in the way Adobe Flash polls my audio input which causes > a crash. > > Just video capture (without audio) works just fine in Adobe Flash. Ok, so that pretty much rules out the host controller. I just wonder why I still don't see it here, and I haven't heard of any such problem from anyone else. Some more questions: - Which version of Flash are you running? - Does this also happen with Firefox? - Does flash access the device directly or via PulseAudio? - Could you please apply the attached patch and see what it spits out to dmesg once Flash opens the device? It returns -EINVAL in the hw_params callback to prevent the actual streaming. On my machine with Flash 11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane. Or does your machine still crash before anything is written to the logs? > Only and only when I choose to use > > USB Device 0x46d:0x81d my system crashes in Adobe Flash. 
>
> See the screenshot:
>
> https://bugzilla.kernel.org/attachment.cgi?id=84151

When exactly does the crash happen? Right after you selected that entry
from the list? There's a little recording level meter in that dialog.
Does that show any input from the microphone?

> My hardware information can be fetched from here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=49181
>
> On a second thought that can be even an ALSA crash or pretty much
> anything else.

We'll see. Thanks for your help to sort this out!


Daniel


[-- Attachment #2: snd-usb-hwparams.diff --]
[-- Type: text/x-patch, Size: 778 bytes --]

diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index f782ce1..5664b45 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -453,6 +453,18 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream,
 	unsigned int channels, rate, format;
 	int ret, changed;
+
+	printk(">>> %s()\n", __func__);
+
+	printk("format: %d\n", params_format(hw_params));
+	printk("rate: %d\n", params_rate(hw_params));
+	printk("channels: %d\n", params_channels(hw_params));
+	printk("buffer bytes: %d\n", params_buffer_bytes(hw_params));
+	printk("period bytes: %d\n", params_period_bytes(hw_params));
+	printk("access: %d\n", params_access(hw_params));
+
+	return -EINVAL;
+
+
 	ret = snd_pcm_lib_alloc_vmalloc_buffer(substream,
 					       params_buffer_bytes(hw_params));
 	if (ret < 0)

^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website 2012-10-21 14:21 ` was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website Daniel Mack @ 2012-10-21 14:57 ` Artem S. Tashkinov 2012-10-21 15:22 ` Daniel Mack 0 siblings, 1 reply; 15+ messages in thread From: Artem S. Tashkinov @ 2012-10-21 14:57 UTC (permalink / raw) To: zonque Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb, alsa-devel > On Oct 21, 2012, Daniel Mack wrote: > > [Cc: alsa-devel] > > On 21.10.2012 14:30, Artem S. Tashkinov wrote: > > On Oct 21, 2012, Daniel Mack wrote: > > > >> A hint at least. How did you enable the audio record exactly? Can you > >> reproduce this with arecord? > >> > >> What chipset are you on? Please provide both "lspci -v" and "lsusb -v" > >> dumps. As I said, I fail to reproduce that issue on any of my machines. > > > > All other applications can read from the USB audio without problems, it's > > just something in the way Adobe Flash polls my audio input which causes > > a crash. > > > > Just video capture (without audio) works just fine in Adobe Flash. > > Ok, so that pretty much rules out the host controller. I just wonder why > I still don't see it here, and I haven't heard of any such problem from > anyone else. > > Some more questions: > > - Which version of Flash are you running? Google Chrome has its own version of Adobe Flash: Name: Shockwave Flash Description: Shockwave Flash 11.4 r31 Version: 11.4.31.110 > - Does this also happen with Firefox? No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102), it shows just two input devices instead of three which the newer Flash players sees. * HDA Intel PCH * USB Device 0x46d:0x81d > - Does flash access the device directly or via PulseAudio? PA is not installed on my computer, so Flash accesses it directly via ALSA calls. 
> - Could you please apply the attached patch and see what it spits out to > dmesg once Flash opens the device? It returns -EINVAL in the hw_params > callback to prevent the actual streaming. On my machine with Flash > 11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane. > Or does your machine still crash before anything is written to the logs? I will try it a bit later. > > Only and only when I choose to use > > > > USB Device 0x46d:0x81d my system crashes in Adobe Flash. > > > > See the screenshot: > > > > https://bugzilla.kernel.org/attachment.cgi?id=84151 > > When exactly does the crash happen? Right after you selected that entry > from the list? There's a little recording level meter in that dialog. > Does that show any input from the microphone? Yes, right after I select it and move the mouse cursor away from this combobox so that this selection becomes active. > > My hardware information can be fetched from here: > > > > https://bugzilla.kernel.org/show_bug.cgi?id=49181 > > > > On a second thought that can be even an ALSA crash or pretty much > > anything else. > > We'll see. Thanks for your help to sort this out! Thank you for your assistance! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website 2012-10-21 14:57 ` Artem S. Tashkinov @ 2012-10-21 15:22 ` Daniel Mack 2012-10-21 15:28 ` Alan Stern 0 siblings, 1 reply; 15+ messages in thread From: Daniel Mack @ 2012-10-21 15:22 UTC (permalink / raw) To: Artem S. Tashkinov Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb, alsa-devel On 21.10.2012 16:57, Artem S. Tashkinov wrote: >> On Oct 21, 2012, Daniel Mack wrote: >> >> [Cc: alsa-devel] >> >> On 21.10.2012 14:30, Artem S. Tashkinov wrote: >>> On Oct 21, 2012, Daniel Mack wrote: >>> >>>> A hint at least. How did you enable the audio record exactly? Can you >>>> reproduce this with arecord? >>>> >>>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v" >>>> dumps. As I said, I fail to reproduce that issue on any of my machines. >>> >>> All other applications can read from the USB audio without problems, it's >>> just something in the way Adobe Flash polls my audio input which causes >>> a crash. >>> >>> Just video capture (without audio) works just fine in Adobe Flash. >> >> Ok, so that pretty much rules out the host controller. I just wonder why >> I still don't see it here, and I haven't heard of any such problem from >> anyone else. >> >> Some more questions: >> >> - Which version of Flash are you running? > > Google Chrome has its own version of Adobe Flash: > > Name: Shockwave Flash > Description: Shockwave Flash 11.4 r31 > Version: 11.4.31.110 So that's the same that I'm using. >> - Does this also happen with Firefox? > > No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102), it shows > just two input devices instead of three which the newer Flash players sees. > > * HDA Intel PCH > * USB Device 0x46d:0x81d And that works, I assume? Does the second choice in the newer Flash version work maybe? >> - Does flash access the device directly or via PulseAudio? 
>
> PA is not installed on my computer, so Flash accesses it directly via ALSA calls.

Ok, same here.

>> - Could you please apply the attached patch and see what it spits out to
>>   dmesg once Flash opens the device? It returns -EINVAL in the hw_params
>>   callback to prevent the actual streaming. On my machine with Flash
>>   11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
>>   Or does your machine still crash before anything is written to the logs?
>
> I will try it a bit later.

Yes, we need to trace the call chain and see at which point the trouble
starts. What could help is tracing the google-chrome binary with strace
maybe. At least we would see the ioctl command sequence, if the log file
survives the crash.

As the usb list is still in Cc: - Artem's lspci dump shows that his
machine features XHCI controllers. Can anyone think of a relation to
this problem?

And Artem, is there any way you could boot your system on an older machine
that only has EHCI ports? Thinking about it, I wonder whether the freeze
in VBox and the crashes on native hardware have the same root cause. In
that case, would it be possible to share that VBox image?


Daniel

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-21 15:22 ` Daniel Mack
@ 2012-10-21 15:28 ` Alan Stern
  2012-10-21 15:36 ` Daniel Mack
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2012-10-21 15:28 UTC (permalink / raw)
  To: Daniel Mack
  Cc: Artem S. Tashkinov, bp, pavel, linux-kernel, netdev, security,
	linux-media, linux-usb, alsa-devel

On Sun, 21 Oct 2012, Daniel Mack wrote:

> As the usb list is still in Cc: - Artem's lspci dump shows that his
> machine features XHCI controllers. Can anyone think of a relation to
> this problem?
>
> And Artem, is there any way you could boot your system on an older machine
> that only has EHCI ports? Thinking about it, I wonder whether the freeze
> in VBox and the crashes on native hardware have the same root cause. In
> that case, would it be possible to share that VBox image?

Don't grasp at straws. All of the kernel logs Artem has posted show
ehci-hcd; none of them show xhci-hcd. Therefore the xHCI controller is
highly unlikely to be involved.

Alan Stern

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-21 15:28 ` Alan Stern
@ 2012-10-21 15:36 ` Daniel Mack
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 15:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: security, alsa-devel, netdev, linux-usb, linux-kernel, bp, pavel,
	Artem S. Tashkinov, linux-media

On Oct 21, 2012 5:28 PM, "Alan Stern" <stern@rowland.harvard.edu> wrote:
>
> On Sun, 21 Oct 2012, Daniel Mack wrote:
>
> > As the usb list is still in Cc: - Artem's lspci dump shows that his
> > machine features XHCI controllers. Can anyone think of a relation to
> > this problem?
> >
> > And Artem, is there any way you could boot your system on an older machine
> > that only has EHCI ports? Thinking about it, I wonder whether the freeze
> > in VBox and the crashes on native hardware have the same root cause. In
> > that case, would it be possible to share that VBox image?
>
> Don't grasp at straws. All of the kernel logs Artem has posted show
> ehci-hcd; none of them show xhci-hcd. Therefore the xHCI controller is
> highly unlikely to be involved.

You might be right - I'm just looking for differences between his setup
and mine that would explain why nobody else sees a severe bug that is 100%
reproducible for him.

^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <20121021170315.GB20642@liondog.tnic>]
* Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  [not found] ` <20121021170315.GB20642@liondog.tnic>
@ 2012-10-21 19:49 ` Artem S. Tashkinov
  2012-10-21 19:54 ` Daniel Mack
  ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Artem S. Tashkinov @ 2012-10-21 19:49 UTC (permalink / raw)
  To: bp
  Cc: pavel, linux-kernel, netdev, security, linux-media, linux-usb,
	zonque, alsa-devel, stern

>
> On Oct 21, 2012, Borislav Petkov <bp@alien8.de> wrote:
>
> On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
> > http://imageshack.us/a/img685/9452/panicz.jpg
> >
> > list_del corruption. prev->next should be ... but was ...
>
> Btw, this is one of the debug options I told you to enable.
>
> > I cannot show you more as I have no serial console to use :( and the kernel
> > doesn't have enough time to push error messages to rsyslog and fsync
> > /var/log/messages
>
> I already told you how to catch that oops: boot with "pause_on_oops=600"
> on the kernel command line and photograph the screen when the first oops
> happens. This'll show us where the problem begins.

This option didn't have any effect, or maybe it's because it's such a serious crash
the kernel has no time to actually print an oops/panic message.
dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221

I dumped them using this application:

$ cat scat.c

/* Must come before any #include so that glibc declares open64() and O_LARGEFILE. */
#define _LARGEFILE64_SOURCE 1

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define BUFFER 4096

int main(int argc, char *argv[])
{
	int fd_out;
	int64_t bytes_read;
	void *buffer;

	if (argc != 2) {
		printf("Usage is: scat destination\n");
		return 1;
	}

	buffer = malloc(BUFFER * sizeof(char));
	if (buffer == NULL) {
		printf("Error: can't allocate buffers\n");
		return 2;
	}
	memset(buffer, 0, BUFFER);

	printf("Dumping to \"%s\" ... ", argv[1]);
	fflush(NULL);

	/* No O_CREAT here: the destination (a file or partition) must already exist. */
	if ((fd_out = open64(argv[1], O_WRONLY | O_LARGEFILE | O_SYNC | O_NOFOLLOW, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)) == -1) {
		printf("Error: destination file can't be opened\n");
		perror("open64()");
		return 2;
	}

	/* Copy stdin to the destination until fread() reports end of input. */
	bytes_read = 1;
	while (bytes_read) {
		bytes_read = fread(buffer, sizeof(char), BUFFER, stdin);
		if (write(fd_out, buffer, bytes_read) != bytes_read) {
			printf("Error: can't write data to the destination file! Possibly a target disk is full\n");
			return 3;
		}
	}

	close(fd_out);

	printf(" OK\n");
	return 0;
}

I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes, straight to a hdd partition to eliminate
a FS cache)

Don't judge me harshly - I'm not a programmer.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-21 19:49 ` Re: Re: Re: " Artem S. Tashkinov
@ 2012-10-21 19:54 ` Daniel Mack
  2012-10-21 20:43 ` Artem S. Tashkinov
  2012-10-21 20:36 ` Re: Re: Re: Re: " Borislav Petkov
  2012-10-22 15:17 ` Alan Stern
  2 siblings, 1 reply; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 19:54 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
	alsa-devel, stern

On 21.10.2012 21:49, Artem S. Tashkinov wrote:
>>
>> On Oct 21, 2012, Borislav Petkov <bp@alien8.de> wrote:
>>
>> On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
>>> http://imageshack.us/a/img685/9452/panicz.jpg
>>>
>>> list_del corruption. prev->next should be ... but was ...
>>
>> Btw, this is one of the debug options I told you to enable.
>>
>>> I cannot show you more as I have no serial console to use :( and the kernel
>>> doesn't have enough time to push error messages to rsyslog and fsync
>>> /var/log/messages
>>
>> I already told you how to catch that oops: boot with "pause_on_oops=600"
>> on the kernel command line and photograph the screen when the first oops
>> happens. This'll show us where the problem begins.
>
> This option didn't have any effect, or maybe it's because it's such a serious crash
> the kernel has no time to actually print an oops/panic message.
>
> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221

Nice. Could you do that again with the patch applied I sent you some
hours ago?


Thanks,
Daniel

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-21 19:54 ` Daniel Mack
@ 2012-10-21 20:43 ` Artem S. Tashkinov
  2012-10-21 21:00 ` Daniel Mack
  0 siblings, 1 reply; 15+ messages in thread
From: Artem S. Tashkinov @ 2012-10-21 20:43 UTC (permalink / raw)
  To: zonque
  Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
	alsa-devel, stern

> Nice. Could you do that again with the patch applied I sent you some
> hours ago?

That patch was of no help - the system has crashed and I couldn't spot relevant
messages.

I've no idea what it means.

Artem

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-21 20:43 ` Artem S. Tashkinov
@ 2012-10-21 21:00 ` Daniel Mack
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 21:00 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
	alsa-devel, stern

On 21.10.2012 22:43, Artem S. Tashkinov wrote:
>> Nice. Could you do that again with the patch applied I sent you some
>> hours ago?
>
> That patch was of no help - the system has crashed and I couldn't spot relevant
> messages.
>
> I've no idea what it means.

The sequence of driver callbacks issued on a stream start is

 .open()
 .hw_params()
 .prepare()
 .trigger()

If the ALSA part really causes this issue, the bad things happen either
in one of the driver callback functions or in the core underneath.

The patch I sent returns an error from the hw_params callback, and as
you still see the problem, that means that the crash happens before any
of the USB audio streaming really starts.

Could you try and return -EINVAL from snd_usb_capture_open() please?

If anyone has a better idea on how to debug this, please chime in.


Daniel

^ permalink raw reply [flat|nested] 15+ messages in thread
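The debugging approach in the message above (walk the callbacks in order and force an early error to see whether the crash still happens) can be modelled in plain userspace C. This is only a minimal sketch of the control flow: enum stage, start_stream() and fail_at are hypothetical names made up for the example, and nothing here calls ALSA or the kernel.

```c
#include <errno.h>

/* Stream-start stages, in the order listed in the message above. */
enum stage {
	STAGE_OPEN,
	STAGE_HW_PARAMS,
	STAGE_PREPARE,
	STAGE_TRIGGER,
	STAGE_COUNT	/* not a stage: pass as fail_at to let every stage run */
};

/*
 * Hypothetical model of a stream start. The stage named by fail_at bails
 * out with -EINVAL, the way the debugging patch does in hw_params, so no
 * later stage is reached. Stores the error (0 on success) in *err and
 * returns the number of stages that completed.
 */
static int start_stream(enum stage fail_at, int *err)
{
	int completed = 0;
	int s;

	*err = 0;
	for (s = STAGE_OPEN; s < STAGE_COUNT; s++) {
		if (s == (int)fail_at) {
			*err = -EINVAL;
			break;
		}
		completed++;
	}
	return completed;
}
```

With fail_at set to STAGE_HW_PARAMS only the open stage completes, and prepare and trigger are never reached. A crash that still occurs in that configuration cannot come from those later stages, which is the reasoning behind moving the forced error up into the open callback next.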
* Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website 2012-10-21 19:49 ` Re: Re: Re: " Artem S. Tashkinov 2012-10-21 19:54 ` Daniel Mack @ 2012-10-21 20:36 ` Borislav Petkov 2012-10-22 15:17 ` Alan Stern 2 siblings, 0 replies; 15+ messages in thread From: Borislav Petkov @ 2012-10-21 20:36 UTC (permalink / raw) To: Artem S. Tashkinov Cc: pavel, linux-kernel, netdev, security, linux-media, linux-usb, zonque, alsa-devel, stern On Sun, Oct 21, 2012 at 07:49:01PM +0000, Artem S. Tashkinov wrote: > I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes, > straight to a hdd partition to eliminate a FS cache) Well, I'm no fs guy but this should still go through the buffer cache. I think the O_SYNC flag makes sure it all lands on the partition in time. Oh well, it doesn't matter. > Don't judge me harshly - I'm not a programmer. If you wrote that and you're not a programmer, it certainly looks cool, good job!. [ Btw, don't forget to free(buffer) at the end. ] Also, there was a patchset recently which added a blockconsole method to the kernel with which you can do something like that in a generic way. Back to the issue at hand: it looks like ehci_hcd is causing some list corruptions, maybe coming from the uvcvideo or whatever. I think the usb people will have a better idea. Btw, is there any particular reason you're running a 32-bit kernel? Thanks. -- Regards/Gruss, Boris. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-21 19:49 ` Re: Re: Re: " Artem S. Tashkinov
  2012-10-21 19:54 ` Daniel Mack
  2012-10-21 20:36 ` Re: Re: Re: Re: " Borislav Petkov
@ 2012-10-22 15:17 ` Alan Stern
  2012-10-22 15:30 ` Daniel Mack
  2 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2012-10-22 15:17 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
	zonque, alsa-devel

On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:

> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221

The first problem in the log is endpoint list corruption. Here's a
debugging patch which should provide a little more information.

Alan Stern



 drivers/usb/core/hcd.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
 
 /*-------------------------------------------------------------------------*/
 
+static bool list_error;
+
 /**
  * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
  * @hcd: host controller to which @urb was submitted
@@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc
 	 */
 	if (HCD_RH_RUNNING(hcd)) {
 		urb->unlinked = 0;
+
+		{
+			struct list_head *cur = &urb->ep->urb_list;
+			struct list_head *prev = cur->prev;
+
+			if (prev->next != cur && !list_error) {
+				list_error = true;
+				dev_err(&urb->dev->dev,
+					"ep %x list add corruption: %p %p %p\n",
+					urb->ep->desc.bEndpointAddress,
+					cur, prev, prev->next);
+			}
+		}
+
 		list_add_tail(&urb->urb_list, &urb->ep->urb_list);
 	} else {
 		rc = -ESHUTDOWN;
@@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u
 {
 	/* clear all state linking urb to this dev (and hcd) */
 	spin_lock(&hcd_urb_list_lock);
+	{
+		struct list_head *cur = &urb->urb_list;
+		struct list_head *prev = cur->prev;
+		struct list_head *next = cur->next;
+
+		if (prev->next != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption prev: %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, prev, prev->next);
+		}
+		if (next->prev != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption next: %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, next, next->prev);
+		}
+	}
 	list_del_init(&urb->urb_list);
 	spin_unlock(&hcd_urb_list_lock);
 }

^ permalink raw reply [flat|nested] 15+ messages in thread
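The test the patch above performs before list_add_tail() and list_del_init() is the usual invariant of a circular doubly linked list: each neighbour of a node must point back at that node. A minimal userspace sketch of the same check follows. The struct list_head here is a simplified stand-in for the kernel's, and list_init() and list_node_consistent() are names made up for this example, not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular, doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert @entry just before @head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * The invariant checked by the debugging patch: the predecessor's next
 * pointer and the successor's prev pointer must both lead back to @cur.
 */
static bool list_node_consistent(const struct list_head *cur)
{
	return cur->prev->next == cur && cur->next->prev == cur;
}
```

When the check fails, the three pointers the patch prints (the node, its neighbour, and what the neighbour points to instead) are what make it possible to tell which link was overwritten.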
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-22 15:17 ` Alan Stern
@ 2012-10-22 15:30 ` Daniel Mack
  2012-10-22 15:54 ` Alan Stern
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Mack @ 2012-10-22 15:30 UTC (permalink / raw)
  To: Alan Stern
  Cc: Artem S. Tashkinov, bp, pavel, linux-kernel, netdev, security,
	linux-media, linux-usb, alsa-devel

On 22.10.2012 17:17, Alan Stern wrote:
> On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
>
>> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
>
> The first problem in the log is endpoint list corruption. Here's a
> debugging patch which should provide a little more information.

Maybe add a BUG() after each of these dev_err() so we stop at the first
occurrence and also see where we're coming from?



> drivers/usb/core/hcd.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> Index: usb-3.6/drivers/usb/core/hcd.c
> ===================================================================
> --- usb-3.6.orig/drivers/usb/core/hcd.c
> +++ usb-3.6/drivers/usb/core/hcd.c
> @@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
>
> /*-------------------------------------------------------------------------*/
>
> +static bool list_error;
> +
> /**
>  * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
>  * @hcd: host controller to which @urb was submitted
> @@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc
> 	 */
> 	if (HCD_RH_RUNNING(hcd)) {
> 		urb->unlinked = 0;
> +
> +		{
> +			struct list_head *cur = &urb->ep->urb_list;
> +			struct list_head *prev = cur->prev;
> +
> +			if (prev->next != cur && !list_error) {
> +				list_error = true;
> +				dev_err(&urb->dev->dev,
> +					"ep %x list add corruption: %p %p %p\n",
> +					urb->ep->desc.bEndpointAddress,
> +					cur, prev, prev->next);
> +			}
> +		}
> +
> 		list_add_tail(&urb->urb_list, &urb->ep->urb_list);
> 	} else {
> 		rc = -ESHUTDOWN;
> @@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u
> {
> 	/* clear all state linking urb to this dev (and hcd) */
> 	spin_lock(&hcd_urb_list_lock);
> +	{
> +		struct list_head *cur = &urb->urb_list;
> +		struct list_head *prev = cur->prev;
> +		struct list_head *next = cur->next;
> +
> +		if (prev->next != cur && !list_error) {
> +			list_error = true;
> +			dev_err(&urb->dev->dev,
> +				"ep %x list del corruption prev: %p %p %p\n",
> +				urb->ep->desc.bEndpointAddress,
> +				cur, prev, prev->next);
> +		}
> +		if (next->prev != cur && !list_error) {
> +			list_error = true;
> +			dev_err(&urb->dev->dev,
> +				"ep %x list del corruption next: %p %p %p\n",
> +				urb->ep->desc.bEndpointAddress,
> +				cur, next, next->prev);
> +		}
> +	}
> 	list_del_init(&urb->urb_list);
> 	spin_unlock(&hcd_urb_list_lock);
> }
>

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
  2012-10-22 15:30 ` Daniel Mack
@ 2012-10-22 15:54 ` Alan Stern
  2012-10-22 17:30 ` Artem S. Tashkinov
  0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2012-10-22 15:54 UTC (permalink / raw)
  To: Daniel Mack
  Cc: Artem S. Tashkinov, bp, pavel, linux-kernel, netdev, security,
	linux-media, linux-usb, alsa-devel

On Mon, 22 Oct 2012, Daniel Mack wrote:

> On 22.10.2012 17:17, Alan Stern wrote:
> > On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
> >
> >> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
> >
> > The first problem in the log is endpoint list corruption. Here's a
> > debugging patch which should provide a little more information.
>
> Maybe add a BUG() after each of these dev_err() so we stop at the first
> occurrence and also see where we're coming from?

A BUG() at these points would crash the machine hard. And where we
came from doesn't matter; what matters is the values in the pointers.

Alan Stern

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website 2012-10-22 15:54 ` Alan Stern @ 2012-10-22 17:30 ` Artem S. Tashkinov 2012-10-22 18:01 ` Alan Stern 0 siblings, 1 reply; 15+ messages in thread From: Artem S. Tashkinov @ 2012-10-22 17:30 UTC (permalink / raw) To: stern Cc: zonque, bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb, alsa-devel On Oct 22, 2012, Alan Stern <stern@rowland.harvard.edu> wrote: > A BUG() at these points would crash the machine hard. And where we > came from doesn't matter; what matters is the values in the pointers. OK, here's what the kernel prints with your patch: usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4 A small delay before I got thousands of list_del corruption messages would have been nice, but I managed to catch the message anyway. Artem ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website 2012-10-22 17:30 ` Artem S. Tashkinov @ 2012-10-22 18:01 ` Alan Stern 0 siblings, 0 replies; 15+ messages in thread From: Alan Stern @ 2012-10-22 18:01 UTC (permalink / raw) To: Artem S. Tashkinov Cc: zonque, bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb, alsa-devel On Mon, 22 Oct 2012, Artem S. Tashkinov wrote: > OK, here's what the kernel prints with your patch: > > usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4 > > A small delay before I got thousands of list_del corruption messages would > have been nice, but I managed to catch the message anyway. All right. Here's a new patch, which will print more information and will provide a 10-second delay. For this to be useful, you should capture a usbmon trace at the same time. The relevant entries will show up in the trace shortly before _and_ shortly after the error message appears. Alan Stern P.S.: It will help if you unplug as many of the other USB devices as possible before running this test. 
Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
 
 /*-------------------------------------------------------------------------*/
 
+static bool list_error;
+
 /**
  * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
  * @hcd: host controller to which @urb was submitted
@@ -1193,6 +1195,25 @@ void usb_hcd_unlink_urb_from_ep(struct u
 {
 	/* clear all state linking urb to this dev (and hcd) */
 	spin_lock(&hcd_urb_list_lock);
+	{
+		struct list_head *cur = &urb->urb_list;
+		struct list_head *prev = cur->prev;
+		struct list_head *next = cur->next;
+
+		if (prev->next != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption prev: %p %p %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, prev, prev->next, next, next->prev);
+			dev_err(&urb->dev->dev,
+				"head %p urb %p urbprev %p urbnext %p\n",
+				&urb->ep->urb_list, urb,
+				list_entry(prev, struct urb, urb_list),
+				list_entry(next, struct urb, urb_list));
+			mdelay(10000);
+		}
+	}
 	list_del_init(&urb->urb_list);
 	spin_unlock(&hcd_urb_list_lock);
 }

^ permalink raw reply [flat|nested] 15+ messages in thread
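The second dev_err() in this patch uses list_entry() to turn a pointer to an embedded list node back into a pointer to the structure that contains it. A minimal userspace sketch of that pointer arithmetic is shown here. The macro is a simplified version without the type checking of the kernel's container_of(), and struct fake_urb and urb_from_node() are hypothetical stand-ins invented for the example.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's list node. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Simplified list_entry(): subtract the offset of @member inside @type
 * from the node pointer to get back to the start of the containing object.
 */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct urb; only the embedded node matters. */
struct fake_urb {
	int id;
	struct list_head urb_list;
};

/* Recover the containing fake_urb from a pointer to its list node. */
static struct fake_urb *urb_from_node(struct list_head *node)
{
	return list_entry(node, struct fake_urb, urb_list);
}
```

The result is only meaningful when the node really is embedded in an URB. If prev or next happens to be the endpoint's list head, the computed pointer does not refer to a real URB, which is presumably why the patch also prints the head address for comparison.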