* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
[not found] ` <317435358.100327.1350822615555.JavaMail.mail@webmail20>
@ 2012-10-21 14:21 ` Daniel Mack
2012-10-21 14:57 ` Artem S. Tashkinov
0 siblings, 1 reply; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 14:21 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: bp, pavel, linux-kernel, netdev@vger.kernel.org, security,
linux-media, linux-usb, alsa-devel
[-- Attachment #1: Type: text/plain, Size: 1906 bytes --]
[Cc: alsa-devel]
On 21.10.2012 14:30, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Daniel Mack wrote:
>
>> A hint at least. How did you enable the audio record exactly? Can you
>> reproduce this with arecord?
>>
>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
>> dumps. As I said, I fail to reproduce that issue on any of my machines.
>
> All other applications can read from the USB audio without problems, it's
> just something in the way Adobe Flash polls my audio input which causes
> a crash.
>
> Just video capture (without audio) works just fine in Adobe Flash.
Ok, so that pretty much rules out the host controller. I just wonder why
I still don't see it here, and I haven't heard of any such problem from
anyone else.
Some more questions:
- Which version of Flash are you running?
- Does this also happen with Firefox?
- Does flash access the device directly or via PulseAudio?
- Could you please apply the attached patch and see what it spits out to
dmesg once Flash opens the device? It returns -EINVAL in the hw_params
callback to prevent the actual streaming. On my machine with Flash
11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
Or does your machine still crash before anything is written to the logs?
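For reading the dump: the six values are in the order the patch prints
them (format/rate/channels/buffer bytes/period bytes/access). A minimal
userspace sketch of how they relate, assuming the format codes follow the
SNDRV_PCM_FORMAT_* constants from <sound/asound.h> (so 2 is S16_LE); the
helper names are hypothetical and exist only for illustration:

```c
#include <stdio.h>

/* Bytes per sample for a few ALSA format codes. Assumption: the codes follow
 * SNDRV_PCM_FORMAT_* in <sound/asound.h> (0 = S8, 1 = U8, 2..5 = 16-bit). */
static int bytes_per_sample(int format)
{
	if (format == 0 || format == 1)
		return 1;
	if (format >= 2 && format <= 5)
		return 2;
	return -1;	/* not handled by this sketch */
}

/* Frames in one period; a frame is one sample for every channel. */
static int frames_per_period(int format, int channels, int period_bytes)
{
	int bps = bytes_per_sample(format);

	if (bps <= 0 || channels <= 0)
		return -1;
	return period_bytes / (bps * channels);
}

/* Number of periods that fit into the ring buffer. */
static int periods_per_buffer(int buffer_bytes, int period_bytes)
{
	return period_bytes > 0 ? buffer_bytes / period_bytes : -1;
}

/* Hypothetical helper: print a one-line summary of the dumped values. */
static void describe_hw_params(int format, int rate, int channels,
			       int buffer_bytes, int period_bytes)
{
	printf("%d Hz, %d ch, %d frames/period, %d periods\n",
	       rate, channels,
	       frames_per_period(format, channels, period_bytes),
	       periods_per_buffer(buffer_bytes, period_bytes));
}
```

For the values quoted above, describe_hw_params(2, 44800, 1, 32768, 2048)
reports 1024 frames per period and 16 periods in the buffer.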
> Only and only when I choose to use
>
> USB Device 0x46d:0x81d my system crashes in Adobe Flash.
>
> See the screenshot:
>
> https://bugzilla.kernel.org/attachment.cgi?id=84151
When exactly does the crash happen? Right after you selected that entry
from the list? There's a little recording level meter in that dialog.
Does that show any input from the microphone?
> My hardware information can be fetched from here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=49181
>
> On a second thought that can be even an ALSA crash or pretty much
> anything else.
We'll see. Thanks for your help to sort this out!
Daniel
[-- Attachment #2: snd-usb-hwparams.diff --]
[-- Type: text/x-patch, Size: 778 bytes --]
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index f782ce1..5664b45 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -453,6 +453,18 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream,
unsigned int channels, rate, format;
int ret, changed;
+
+ printk(">>> %s()\n", __func__);
+
+ printk("format: %d\n", params_format(hw_params));
+ printk("rate: %d\n", params_rate(hw_params));
+ printk("channels: %d\n", params_channels(hw_params));
+ printk("buffer bytes: %d\n", params_buffer_bytes(hw_params));
+ printk("period bytes: %d\n", params_period_bytes(hw_params));
+ printk("access: %d\n", params_access(hw_params));
+
+ return -EINVAL;
+
ret = snd_pcm_lib_alloc_vmalloc_buffer(substream,
params_buffer_bytes(hw_params));
if (ret < 0)
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 14:21 ` was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website Daniel Mack
@ 2012-10-21 14:57 ` Artem S. Tashkinov
2012-10-21 15:22 ` Daniel Mack
0 siblings, 1 reply; 15+ messages in thread
From: Artem S. Tashkinov @ 2012-10-21 14:57 UTC (permalink / raw)
To: zonque
Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
alsa-devel
> On Oct 21, 2012, Daniel Mack wrote:
>
> [Cc: alsa-devel]
>
> On 21.10.2012 14:30, Artem S. Tashkinov wrote:
> > On Oct 21, 2012, Daniel Mack wrote:
> >
> >> A hint at least. How did you enable the audio record exactly? Can you
> >> reproduce this with arecord?
> >>
> >> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
> >> dumps. As I said, I fail to reproduce that issue on any of my machines.
> >
> > All other applications can read from the USB audio without problems, it's
> > just something in the way Adobe Flash polls my audio input which causes
> > a crash.
> >
> > Just video capture (without audio) works just fine in Adobe Flash.
>
> Ok, so that pretty much rules out the host controller. I just wonder why
> I still don't see it here, and I haven't heard of any such problem from
> anyone else.
>
> Some more questions:
>
> - Which version of Flash are you running?
Google Chrome has its own version of Adobe Flash:
Name: Shockwave Flash
Description: Shockwave Flash 11.4 r31
Version: 11.4.31.110
> - Does this also happen with Firefox?
No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102); it shows
just two input devices instead of the three that the newer Flash player sees.
* HDA Intel PCH
* USB Device 0x46d:0x81d
> - Does flash access the device directly or via PulseAudio?
PA is not installed on my computer, so Flash accesses it directly via ALSA calls.
> - Could you please apply the attached patch and see what it spits out to
> dmesg once Flash opens the device? It returns -EINVAL in the hw_params
> callback to prevent the actual streaming. On my machine with Flash
> 11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
> Or does your machine still crash before anything is written to the logs?
I will try it a bit later.
> > Only and only when I choose to use
> >
> > USB Device 0x46d:0x81d my system crashes in Adobe Flash.
> >
> > See the screenshot:
> >
> > https://bugzilla.kernel.org/attachment.cgi?id=84151
>
> When exactly does the crash happen? Right after you selected that entry
> from the list? There's a little recording level meter in that dialog.
> Does that show any input from the microphone?
Yes, right after I select it and move the mouse cursor away from this combobox
so that this selection becomes active.
> > My hardware information can be fetched from here:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=49181
> >
> > On a second thought that can be even an ALSA crash or pretty much
> > anything else.
>
> We'll see. Thanks for your help to sort this out!
Thank you for your assistance!
* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 14:57 ` Artem S. Tashkinov
@ 2012-10-21 15:22 ` Daniel Mack
2012-10-21 15:28 ` Alan Stern
0 siblings, 1 reply; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 15:22 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
alsa-devel
On 21.10.2012 16:57, Artem S. Tashkinov wrote:
>> On Oct 21, 2012, Daniel Mack wrote:
>>
>> [Cc: alsa-devel]
>>
>> On 21.10.2012 14:30, Artem S. Tashkinov wrote:
>>> On Oct 21, 2012, Daniel Mack wrote:
>>>
>>>> A hint at least. How did you enable the audio record exactly? Can you
>>>> reproduce this with arecord?
>>>>
>>>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
>>>> dumps. As I said, I fail to reproduce that issue on any of my machines.
>>>
>>> All other applications can read from the USB audio without problems, it's
>>> just something in the way Adobe Flash polls my audio input which causes
>>> a crash.
>>>
>>> Just video capture (without audio) works just fine in Adobe Flash.
>>
>> Ok, so that pretty much rules out the host controller. I just wonder why
>> I still don't see it here, and I haven't heard of any such problem from
>> anyone else.
>>
>> Some more questions:
>>
>> - Which version of Flash are you running?
>
> Google Chrome has its own version of Adobe Flash:
>
> Name: Shockwave Flash
> Description: Shockwave Flash 11.4 r31
> Version: 11.4.31.110
So that's the same that I'm using.
>> - Does this also happen with Firefox?
>
> No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102); it shows
> just two input devices instead of the three that the newer Flash player sees.
>
> * HDA Intel PCH
> * USB Device 0x46d:0x81d
And that works, I assume? Does the second choice in the newer Flash
version work maybe?
>> - Does flash access the device directly or via PulseAudio?
>
> PA is not installed on my computer, so Flash accesses it directly via ALSA calls.
Ok, same here.
>> - Could you please apply the attached patch and see what it spits out to
>> dmesg once Flash opens the device? It returns -EINVAL in the hw_params
>> callback to prevent the actual streaming. On my machine with Flash
>> 11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
>> Or does your machine still crash before anything is written to the logs?
>
> I will try it a bit later.
Yes, we need to trace the call chain and see at which point the trouble
starts. What could help is tracing the google-chrome binary with strace
maybe. At least we would see the ioctl command sequence, if the log file
survives the crash.
As the usb list is still in Cc: - Artem's lspci dump shows that his
machine features XHCI controllers. Can anyone think of a relation to
this problem?
And Artem, is there any way you could boot your system on an older machine
that only has EHCI ports? Thinking about it, I wonder whether the freeze
in VBox and the crashes on native hardware have the same root cause. In
that case, would it be possible to share that VBox image?
Daniel
* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 15:22 ` Daniel Mack
@ 2012-10-21 15:28 ` Alan Stern
2012-10-21 15:36 ` Daniel Mack
0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2012-10-21 15:28 UTC (permalink / raw)
To: Daniel Mack
Cc: Artem S. Tashkinov, bp, pavel, linux-kernel, netdev, security,
linux-media, linux-usb, alsa-devel
On Sun, 21 Oct 2012, Daniel Mack wrote:
> As the usb list is still in Cc: - Artem's lspci dump shows that his
> machine features XHCI controllers. Can anyone think of a relation to
> this problem?
>
> And Artem, is there any way you could boot your system on an older machine
> that only has EHCI ports? Thinking about it, I wonder whether the freeze
> in VBox and the crashes on native hardware have the same root cause. In
> that case, would it be possible to share that VBox image?
Don't grasp at straws. All of the kernel logs Artem has posted show
ehci-hcd; none of them show xhci-hcd. Therefore the xHCI controller is
highly unlikely to be involved.
Alan Stern
* Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 15:28 ` Alan Stern
@ 2012-10-21 15:36 ` Daniel Mack
0 siblings, 0 replies; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 15:36 UTC (permalink / raw)
To: Alan Stern
Cc: security, alsa-devel, netdev, linux-usb, linux-kernel, bp, pavel,
Artem S. Tashkinov, linux-media
On Oct 21, 2012 5:28 PM, "Alan Stern" <stern@rowland.harvard.edu> wrote:
>
> On Sun, 21 Oct 2012, Daniel Mack wrote:
>
> > As the usb list is still in Cc: - Artem's lspci dump shows that his
> > machine features XHCI controllers. Can anyone think of a relation to
> > this problem?
> >
> > And Artem, is there any way you could boot your system on an older machine
> > that only has EHCI ports? Thinking about it, I wonder whether the freeze
> > in VBox and the crashes on native hardware have the same root cause. In
> > that case, would it be possible to share that VBox image?
>
> Don't grasp at straws. All of the kernel logs Artem has posted show
> ehci-hcd; none of them show xhci-hcd. Therefore the xHCI controller is
> highly unlikely to be involved.
You might be right - I'm just looking for differences between his setup and
mine that would explain why nobody else sees a severe bug that is 100%
reproducible for him.
* Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
[not found] ` <20121021170315.GB20642@liondog.tnic>
@ 2012-10-21 19:49 ` Artem S. Tashkinov
2012-10-21 19:54 ` Daniel Mack
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Artem S. Tashkinov @ 2012-10-21 19:49 UTC (permalink / raw)
To: bp
Cc: pavel, linux-kernel, netdev, security, linux-media, linux-usb,
zonque, alsa-devel, stern
>
> On Oct 21, 2012, Borislav Petkov <bp@alien8.de> wrote:
>
> On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
> > http://imageshack.us/a/img685/9452/panicz.jpg
> >
> > list_del corruption. prev->next should be ... but was ...
>
> Btw, this is one of the debug options I told you to enable.
>
> > I cannot show you more as I have no serial console to use :( and the kernel
> > doesn't have enough time to push error messages to rsyslog and fsync
> > /var/log/messages
>
> I already told you how to catch that oops: boot with "pause_on_oops=600"
> on the kernel command line and photograph the screen when the first oops
> happens. This'll show us where the problem begins.
This option didn't have any effect; maybe the crash is so severe that
the kernel has no time to actually print an oops/panic message.
dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
I dumped them using this application:
$ cat scat.c
#define _GNU_SOURCE		/* exposes O_LARGEFILE */
#define _FILE_OFFSET_BITS 64	/* 64-bit file offsets on 32-bit systems */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define BUFFER 4096

int main(int argc, char *argv[])
{
	int fd_out;
	size_t bytes_read;
	void *buffer;

	if (argc != 2) {
		printf("Usage is: scat destination\n");
		return 1;
	}

	buffer = malloc(BUFFER * sizeof(char));
	if (buffer == NULL) {
		printf("Error: can't allocate buffers\n");
		return 2;
	}
	memset(buffer, 0, BUFFER);

	printf("Dumping to \"%s\" ... ", argv[1]);
	fflush(NULL);

	/* No O_CREAT: the destination (e.g. a partition) must already exist. */
	fd_out = open(argv[1], O_WRONLY | O_LARGEFILE | O_SYNC | O_NOFOLLOW);
	if (fd_out == -1) {
		printf("Error: destination file can't be opened\n");
		perror("open() ");
		return 2;
	}

	/* Copy stdin to the destination until EOF. */
	while ((bytes_read = fread(buffer, sizeof(char), BUFFER, stdin)) > 0) {
		if (write(fd_out, buffer, bytes_read) != (ssize_t) bytes_read) {
			printf("Error: can't write data to the destination file! Possibly a target disk is full\n");
			return 3;
		}
	}

	close(fd_out);
	printf(" OK\n");
	return 0;
}
I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes, straight to a hdd partition to eliminate a FS cache)
Don't judge me harshly - I'm not a programmer.
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 19:49 ` Re: Re: Re: " Artem S. Tashkinov
@ 2012-10-21 19:54 ` Daniel Mack
2012-10-21 20:43 ` Artem S. Tashkinov
2012-10-21 20:36 ` Re: Re: Re: Re: " Borislav Petkov
2012-10-22 15:17 ` Alan Stern
2 siblings, 1 reply; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 19:54 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
alsa-devel, stern
On 21.10.2012 21:49, Artem S. Tashkinov wrote:
>>
>> On Oct 21, 2012, Borislav Petkov <bp@alien8.de> wrote:
>>
>> On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
>>> http://imageshack.us/a/img685/9452/panicz.jpg
>>>
>>> list_del corruption. prev->next should be ... but was ...
>>
>> Btw, this is one of the debug options I told you to enable.
>>
>>> I cannot show you more as I have no serial console to use :( and the kernel
>>> doesn't have enough time to push error messages to rsyslog and fsync
>>> /var/log/messages
>>
>> I already told you how to catch that oops: boot with "pause_on_oops=600"
>> on the kernel command line and photograph the screen when the first oops
>> happens. This'll show us where the problem begins.
>
> This option didn't have any effect, or maybe it's because it's such a serious crash
> the kernel has no time to actually print an oops/panic message.
>
> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
Nice. Could you do that again with the patch I sent you some hours ago
applied?
Thanks,
Daniel
* Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 19:49 ` Re: Re: Re: " Artem S. Tashkinov
2012-10-21 19:54 ` Daniel Mack
@ 2012-10-21 20:36 ` Borislav Petkov
2012-10-22 15:17 ` Alan Stern
2 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2012-10-21 20:36 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: pavel, linux-kernel, netdev, security, linux-media, linux-usb,
zonque, alsa-devel, stern
On Sun, Oct 21, 2012 at 07:49:01PM +0000, Artem S. Tashkinov wrote:
> I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes,
> straight to a hdd partition to eliminate a FS cache)
Well, I'm no fs guy but this should still go through the buffer cache. I
think the O_SYNC flag makes sure it all lands on the partition in time.
Oh well, it doesn't matter.
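As a rough illustration of the O_SYNC point: a minimal sketch of a
synchronous write path that also copes with short writes. Per open(2),
with O_SYNC each write() returns only once the data has been transferred
to the underlying hardware. The helper names open_sync() and write_all()
are hypothetical, and O_CREAT is added here only so the sketch also works
on a regular file; it is not what scat does.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the destination for synchronous writes. */
static int open_sync(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
}

/* Write exactly len bytes, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (errno is left set by write()). */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		p += n;
		len -= (size_t) n;
	}
	return 0;
}
```

Whether this is enough to get the last messages onto the disk before a
hard crash is a separate question; it only removes the short-write case
from the picture.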
> Don't judge me harshly - I'm not a programmer.
If you wrote that and you're not a programmer, it certainly looks cool,
good job!
[ Btw, don't forget to free(buffer) at the end. ]
Also, there was a patchset recently which added a blockconsole method to
the kernel with which you can do something like that in a generic way.
Back to the issue at hand: it looks like ehci_hcd is causing some list
corruptions, maybe coming from the uvcvideo or whatever. I think the usb
people will have a better idea.
Btw, is there any particular reason you're running a 32-bit kernel?
Thanks.
--
Regards/Gruss,
Boris.
* Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 19:54 ` Daniel Mack
@ 2012-10-21 20:43 ` Artem S. Tashkinov
2012-10-21 21:00 ` Daniel Mack
0 siblings, 1 reply; 15+ messages in thread
From: Artem S. Tashkinov @ 2012-10-21 20:43 UTC (permalink / raw)
To: zonque
Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
alsa-devel, stern
> Nice. Could you do that again with the patch I sent you some hours ago
> applied?
That patch was of no help - the system has crashed and I couldn't spot relevant
messages.
I've no idea what it means.
Artem
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 20:43 ` Artem S. Tashkinov
@ 2012-10-21 21:00 ` Daniel Mack
0 siblings, 0 replies; 15+ messages in thread
From: Daniel Mack @ 2012-10-21 21:00 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
alsa-devel, stern
On 21.10.2012 22:43, Artem S. Tashkinov wrote:
>> Nice. Could you do that again with the patch I sent you some hours ago
>> applied?
>
> That patch was of no help - the system has crashed and I couldn't spot relevant
> messages.
>
> I've no idea what it means.
The sequence of driver callbacks issued on a stream start is
.open()
.hw_params()
.prepare()
.trigger()
If the ALSA part really causes this issue, the bad things happen either
in any of the driver callback functions or in the core underneath.
The patch I sent returns an error from the hw_params callback, and as
you still see the problem, that means that the crash happens before any
of the USB audio streaming really starts.
Could you try and return -EINVAL from snd_usb_capture_open() please?
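The narrowing-down idea can be sketched as a toy userspace model (this is
not ALSA code, and all names in it are hypothetical): forcing an error in
one callback means the later ones are never entered, so a crash that still
happens must come from that callback, an earlier one, or the core
underneath.

```c
#include <stddef.h>

/* The four PCM driver callbacks, in the order they are issued. */
enum pcm_stage {
	STAGE_OPEN,
	STAGE_HW_PARAMS,
	STAGE_PREPARE,
	STAGE_TRIGGER,
	STAGE_COUNT
};

static const char *const stage_names[STAGE_COUNT] = {
	"open", "hw_params", "prepare", "trigger",
};

/* Model a stream start in which the callback fail_at returns an error
 * (pass STAGE_COUNT for "no forced error"). Returns how many callbacks
 * were entered before the sequence stopped. */
static int stages_entered(enum pcm_stage fail_at)
{
	int entered = 0;

	for (int s = STAGE_OPEN; s < STAGE_COUNT; s++) {
		entered++;
		if (s == (int) fail_at)
			break;	/* the error aborts the rest of the sequence */
	}
	return entered;
}

/* Name of the last callback that could still have run. */
static const char *last_stage_entered(enum pcm_stage fail_at)
{
	return stage_names[stages_entered(fail_at) - 1];
}
```

With the error forced in hw_params, last_stage_entered(STAGE_HW_PARAMS)
is "hw_params"; moving the forced error to open leaves only "open" as a
candidate.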
If anyone has a better idea on how to debug this, please chime in.
Daniel
* Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-21 19:49 ` Re: Re: Re: " Artem S. Tashkinov
2012-10-21 19:54 ` Daniel Mack
2012-10-21 20:36 ` Re: Re: Re: Re: " Borislav Petkov
@ 2012-10-22 15:17 ` Alan Stern
2012-10-22 15:30 ` Daniel Mack
2 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2012-10-22 15:17 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: bp, pavel, linux-kernel, netdev, security, linux-media, linux-usb,
zonque, alsa-devel
On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
The first problem in the log is endpoint list corruption. Here's a
debugging patch which should provide a little more information.
Alan Stern
drivers/usb/core/hcd.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
/*-------------------------------------------------------------------------*/
+static bool list_error;
+
/**
* usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
* @hcd: host controller to which @urb was submitted
@@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc
*/
if (HCD_RH_RUNNING(hcd)) {
urb->unlinked = 0;
+
+ {
+ struct list_head *cur = &urb->ep->urb_list;
+ struct list_head *prev = cur->prev;
+
+ if (prev->next != cur && !list_error) {
+ list_error = true;
+ dev_err(&urb->dev->dev,
+ "ep %x list add corruption: %p %p %p\n",
+ urb->ep->desc.bEndpointAddress,
+ cur, prev, prev->next);
+ }
+ }
+
list_add_tail(&urb->urb_list, &urb->ep->urb_list);
} else {
rc = -ESHUTDOWN;
@@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u
{
/* clear all state linking urb to this dev (and hcd) */
spin_lock(&hcd_urb_list_lock);
+ {
+ struct list_head *cur = &urb->urb_list;
+ struct list_head *prev = cur->prev;
+ struct list_head *next = cur->next;
+
+ if (prev->next != cur && !list_error) {
+ list_error = true;
+ dev_err(&urb->dev->dev,
+ "ep %x list del corruption prev: %p %p %p\n",
+ urb->ep->desc.bEndpointAddress,
+ cur, prev, prev->next);
+ }
+ if (next->prev != cur && !list_error) {
+ list_error = true;
+ dev_err(&urb->dev->dev,
+ "ep %x list del corruption next: %p %p %p\n",
+ urb->ep->desc.bEndpointAddress,
+ cur, next, next->prev);
+ }
+ }
list_del_init(&urb->urb_list);
spin_unlock(&hcd_urb_list_lock);
}
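The invariant the patch checks can be tried out in userspace. A minimal
sketch, assuming a simplified stand-in for the kernel's struct list_head
(the names struct node, node_init(), node_add_tail() and node_links_ok()
are made up for this example):

```c
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked list node, modeled on struct list_head. */
struct node {
	struct node *next, *prev;
};

/* An empty list is a head that points to itself. */
static void node_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Append item just before head, i.e. at the tail of the list. */
static void node_add_tail(struct node *item, struct node *head)
{
	struct node *last = head->prev;

	item->prev = last;
	item->next = head;
	last->next = item;
	head->prev = item;
}

/* The checks made before deletion: both neighbours must point back at
 * the entry that is about to be removed. */
static bool node_links_ok(const struct node *cur)
{
	return cur->prev->next == cur && cur->next->prev == cur;
}
```

If some other code path rewrites a neighbour's pointers (for instance by
linking or unlinking the same entry twice), node_links_ok() fails for the
affected entry, which is the condition the dev_err() calls report.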
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-22 15:17 ` Alan Stern
@ 2012-10-22 15:30 ` Daniel Mack
2012-10-22 15:54 ` Alan Stern
0 siblings, 1 reply; 15+ messages in thread
From: Daniel Mack @ 2012-10-22 15:30 UTC (permalink / raw)
To: Alan Stern
Cc: Artem S. Tashkinov, bp, pavel, linux-kernel, netdev, security,
linux-media, linux-usb, alsa-devel
On 22.10.2012 17:17, Alan Stern wrote:
> On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
>
>> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
>
> The first problem in the log is endpoint list corruption. Here's a
> debugging patch which should provide a little more information.
Maybe add a BUG() after each of these dev_err() so we stop at the first
occurrence and also see where we're coming from?
> drivers/usb/core/hcd.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> Index: usb-3.6/drivers/usb/core/hcd.c
> ===================================================================
> --- usb-3.6.orig/drivers/usb/core/hcd.c
> +++ usb-3.6/drivers/usb/core/hcd.c
> @@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
>
> /*-------------------------------------------------------------------------*/
>
> +static bool list_error;
> +
> /**
> * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
> * @hcd: host controller to which @urb was submitted
> @@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc
> */
> if (HCD_RH_RUNNING(hcd)) {
> urb->unlinked = 0;
> +
> + {
> + struct list_head *cur = &urb->ep->urb_list;
> + struct list_head *prev = cur->prev;
> +
> + if (prev->next != cur && !list_error) {
> + list_error = true;
> + dev_err(&urb->dev->dev,
> + "ep %x list add corruption: %p %p %p\n",
> + urb->ep->desc.bEndpointAddress,
> + cur, prev, prev->next);
> + }
> + }
> +
> list_add_tail(&urb->urb_list, &urb->ep->urb_list);
> } else {
> rc = -ESHUTDOWN;
> @@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u
> {
> /* clear all state linking urb to this dev (and hcd) */
> spin_lock(&hcd_urb_list_lock);
> + {
> + struct list_head *cur = &urb->urb_list;
> + struct list_head *prev = cur->prev;
> + struct list_head *next = cur->next;
> +
> + if (prev->next != cur && !list_error) {
> + list_error = true;
> + dev_err(&urb->dev->dev,
> + "ep %x list del corruption prev: %p %p %p\n",
> + urb->ep->desc.bEndpointAddress,
> + cur, prev, prev->next);
> + }
> + if (next->prev != cur && !list_error) {
> + list_error = true;
> + dev_err(&urb->dev->dev,
> + "ep %x list del corruption next: %p %p %p\n",
> + urb->ep->desc.bEndpointAddress,
> + cur, next, next->prev);
> + }
> + }
> list_del_init(&urb->urb_list);
> spin_unlock(&hcd_urb_list_lock);
> }
>
* Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-22 15:30 ` Daniel Mack
@ 2012-10-22 15:54 ` Alan Stern
2012-10-22 17:30 ` Artem S. Tashkinov
0 siblings, 1 reply; 15+ messages in thread
From: Alan Stern @ 2012-10-22 15:54 UTC (permalink / raw)
To: Daniel Mack
Cc: Artem S. Tashkinov, bp, pavel, linux-kernel, netdev, security,
linux-media, linux-usb, alsa-devel
On Mon, 22 Oct 2012, Daniel Mack wrote:
> On 22.10.2012 17:17, Alan Stern wrote:
> > On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
> >
> >> dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221
> >
> > The first problem in the log is endpoint list corruption. Here's a
> > debugging patch which should provide a little more information.
>
> Maybe add a BUG() after each of these dev_err() so we stop at the first
> occurrence and also see where we're coming from?
A BUG() at these points would crash the machine hard. And where we
came from doesn't matter; what matters is the values in the pointers.
Alan Stern
* Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-22 15:54 ` Alan Stern
@ 2012-10-22 17:30 ` Artem S. Tashkinov
2012-10-22 18:01 ` Alan Stern
0 siblings, 1 reply; 15+ messages in thread
From: Artem S. Tashkinov @ 2012-10-22 17:30 UTC (permalink / raw)
To: stern
Cc: zonque, bp, pavel, linux-kernel, netdev, security, linux-media,
linux-usb, alsa-devel
On Oct 22, 2012, Alan Stern <stern@rowland.harvard.edu> wrote:
> A BUG() at these points would crash the machine hard. And where we
> came from doesn't matter; what matters is the values in the pointers.
OK, here's what the kernel prints with your patch:
usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4
A small delay before I got thousands of list_del corruption messages would
have been nice, but I managed to catch the message anyway.
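For reading the "ep 86" part of that line: the patch prints
bEndpointAddress in hex, and in a USB endpoint address bit 7 is the
direction (set means IN, device to host) while the low four bits are the
endpoint number. A minimal decoding sketch; the macro and function names
are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define EP_DIR_IN	0x80u	/* bit 7 set: IN (device to host) */
#define EP_NUM_MASK	0x0fu	/* bits 3..0: endpoint number */

/* True if the endpoint address describes an IN endpoint. */
static bool ep_is_in(uint8_t addr)
{
	return (addr & EP_DIR_IN) != 0;
}

/* Endpoint number without the direction bit. */
static unsigned int ep_number(uint8_t addr)
{
	return addr & EP_NUM_MASK;
}
```

For the address in the message, ep_number(0x86) is 6 and ep_is_in(0x86)
is true, so the corrupted queue belongs to IN endpoint 6 of that device.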
Artem
* Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
2012-10-22 17:30 ` Artem S. Tashkinov
@ 2012-10-22 18:01 ` Alan Stern
0 siblings, 0 replies; 15+ messages in thread
From: Alan Stern @ 2012-10-22 18:01 UTC (permalink / raw)
To: Artem S. Tashkinov
Cc: zonque, bp, pavel, linux-kernel, netdev, security, linux-media,
linux-usb, alsa-devel
On Mon, 22 Oct 2012, Artem S. Tashkinov wrote:
> OK, here's what the kernel prints with your patch:
>
> usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4
>
> A small delay before I got thousands of list_del corruption messages would
> have been nice, but I managed to catch the message anyway.
All right. Here's a new patch, which will print more information and
will provide a 10-second delay.
For this to be useful, you should capture a usbmon trace at the same
time. The relevant entries will show up in the trace shortly before
_and_ shortly after the error message appears.
Alan Stern
P.S.: It will help if you unplug as many of the other USB devices as
possible before running this test.
Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
/*-------------------------------------------------------------------------*/
+static bool list_error;
+
/**
* usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
* @hcd: host controller to which @urb was submitted
@@ -1193,6 +1195,25 @@ void usb_hcd_unlink_urb_from_ep(struct u
{
/* clear all state linking urb to this dev (and hcd) */
spin_lock(&hcd_urb_list_lock);
+ {
+ struct list_head *cur = &urb->urb_list;
+ struct list_head *prev = cur->prev;
+ struct list_head *next = cur->next;
+
+ if (prev->next != cur && !list_error) {
+ list_error = true;
+ dev_err(&urb->dev->dev,
+ "ep %x list del corruption prev: %p %p %p %p %p\n",
+ urb->ep->desc.bEndpointAddress,
+ cur, prev, prev->next, next, next->prev);
+ dev_err(&urb->dev->dev,
+ "head %p urb %p urbprev %p urbnext %p\n",
+ &urb->ep->urb_list, urb,
+ list_entry(prev, struct urb, urb_list),
+ list_entry(next, struct urb, urb_list));
+ mdelay(10000);
+ }
+ }
list_del_init(&urb->urb_list);
spin_unlock(&hcd_urb_list_lock);
}