From: David Brownell <david-b@pacbell.net>
To: Ingo Molnar <mingo@elte.hu>, Alan Stern <stern@rowland.harvard.edu>
Cc: Greg KH <gregkh@suse.de>,
linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: [USB boot crash, -git] ecm_do_notify(), list_add corruption. prev->next should be next (ffff88003b8f82f8)
Date: Wed, 23 Jul 2008 16:37:32 -0700
Message-ID: <200807231637.33009.david-b@pacbell.net>
In-Reply-To: <20080722134042.GA14315@elte.hu>
On Tuesday 22 July 2008, Ingo Molnar wrote:
>
> hi Greg, David,
>
> -tip randconfig boot testing just found this USB boot crash regression:
Which I can reproduce with "dummy_hcd" (an emulator) but not
using a real peripheral controller driver ... using i386,
not x86_64 as you did, fwiw.

So far, the fingers point at dummy_hcd... the merge doesn't
seem to have had problems, and the gadget driver had been
tested with four different peripheral controller drivers
(pre-merge).

I'll give it a look on something with a serial console ... doing
it on a PC is useless, since the list debug stuff does a BUG()
which renders the machine unusable even if I could read more than
20 lines of data on the screen. :(
> dummy_udc dummy_udc: enabled ep-a (ep1in-bulk) maxpacket 512
> dummy_udc dummy_udc: enabled ep-b (ep2out-bulk) maxpacket 512
Was that all that it told you about? If it was telling you it
enabled those two, it *should* have previously told you it was
enabling ep-c and ep-d (also maxpacket 512) as well as ep-e and
ep-f (maxpacket 16 and 8, respectively, I'd think).

What it was doing here: The host side enumerated this (emulated)
device, activated the altsetting with data endpoints (hence ep-a
and ep-b), and the peripheral side then issued a link state
notification.

But the link state notification message (probably using ep-e)
couldn't be queued (list_add_tail) because of this oops:
> usb0: qlen 10
> g_cdc gadget: notify connect false
> list_add corruption. prev->next should be next (ffff88003b8f82f8), but was ffff88003b8f8e80. (prev=ffff88003b8f8e80).
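Now, prev->next == prev is expected here: that list of messages
should be empty, so list_add_tail() should see prev == next == head.

What's wrong is that head->prev != head (head is ffff88003b8f82f8
but head->prev is ffff88003b8f8e80), meaning something trashed a
dummy_hcd data structure.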
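
To make that reading of the message concrete, here is a minimal
userspace sketch of the check involved. It is an illustration, not
the kernel source: the names init_list_head(), list_add_valid(),
list_add_tail_checked() and demo_reported_corruption() are
hypothetical stand-ins, and the check returns false where the real
debug code prints the message above and calls BUG(). The "stray"
node only models whatever head->prev ended up pointing at; what
actually overwrote it is not known at this point.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's circular, doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head whose next and prev both point at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* The two consistency conditions the list debug code tests before linking. */
static bool list_add_valid(const struct list_head *prev,
			   const struct list_head *next)
{
	return next->prev == prev && prev->next == next;
}

/*
 * Tail insertion links the new entry between head->prev and head.
 * Returns false, leaving the list untouched, where the kernel would BUG().
 */
static bool list_add_tail_checked(struct list_head *entry,
				  struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (!list_add_valid(prev, head))
		return false;

	head->prev = entry;
	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
	return true;
}

/* Models the reported state: head->prev points at a self-linked stranger. */
void demo_reported_corruption(void)
{
	struct list_head head, stray, req;

	/* Healthy case: queueing onto an empty list succeeds. */
	init_list_head(&head);
	assert(list_add_tail_checked(&req, &head));

	/* Corrupted case: head->prev != head, and prev->next == prev. */
	init_list_head(&head);
	init_list_head(&stray);
	head.prev = &stray;
	assert(stray.next == &stray);
	assert(!list_add_tail_checked(&req, &head));
}
```

In the corrupted case next->prev == prev still holds (both are the
stray node), so only the second condition fails, which matches the
"prev->next should be next" wording in the report.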
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:33!
> invalid opcode: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
> ...
> Call Trace:
> <IRQ> [<ffffffff8073de15>] dummy_queue+0xd5/0x1d0
> [<ffffffff8073f3b6>] ecm_do_notify+0x116/0x1f0
I tried this on the "real hardware" (net2280) being emulated
in this case by this "dummy" driver, and it works just fine
with list debugging enabled. And I've used it with three
other flavors of "real hardware" (though not yet with the
latest kernel GIT), so I suspect it'll continue to work there.

My first reaction is to think this must be an issue with the
"dummy_hcd" code, since that's actually the proximate location
of the oops. I sanity checked the relevant ECM logic, and
it looks OK at first glance. (As I'd expect, since it already
worked with four different controller drivers!)
> [<ffffffff8073f4a5>] ecm_notify+0x15/0x20
> [<ffffffff8073f851>] ecm_set_alt+0x111/0x1d0
> [<ffffffff807418d7>] composite_setup+0x127/0x900
> [<ffffffff80261136>] ? lock_release_holdtime+0x66/0x80
> [<ffffffff8073d31b>] ? dummy_timer+0x65b/0xac0
> [<ffffffff8073ccc0>] ? dummy_timer+0x0/0xac0
> [<ffffffff8073d334>] dummy_timer+0x674/0xac0
> [<ffffffff8073ccc0>] ? dummy_timer+0x0/0xac0
> [<ffffffff80248c7b>] run_timer_softirq+0x1db/0x250
> [<ffffffff80244936>] __do_softirq+0x66/0xd0
> [<ffffffff8020ce8c>] call_softirq+0x1c/0x30
> [<ffffffff8020f7a5>] do_softirq+0x45/0x80
> [<ffffffff802447d5>] irq_exit+0xa5/0xb0
> [<ffffffff8021ce0d>] smp_apic_timer_interrupt+0x8d/0xd0
> [<ffffffff8020c8d6>] apic_timer_interrupt+0x66/0x70
> ...
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 0, comm: swapper Tainted: G D 2.6.26-tip-06162-g2ef4b1e-dirty #13411
>
> With this config:
>
> http://redhat.com/~mingo/misc/config-Tue_Jul_22_13_44_45_CEST_2008.bad
>
> i tried to do a blind revert of da741b8c5 ("usb ethernet gadget: split
> CDC Ethernet function") where this crash originates from - but the
> resulting kernel would not build. (it has followup dependencies)
Right. These updates are arguably overdue: factoring the
individual functions out from each other. The Ethernet gadget
code had three (!) separate protocol stacks, each of which now
lives in its own file, as does the core they shared.

So reverting them would be the wrong solution in any case.

- Dave