netdev.vger.kernel.org archive mirror
* e1000: assertion hit in e1000_clean(), kernel 2.6.21.1
@ 2007-05-18 22:33 Chuck Ebbert
  2007-05-18 23:03 ` Kok, Auke
  0 siblings, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2007-05-18 22:33 UTC (permalink / raw)
  To: Auke Kok; +Cc: Netdev

We have several reports now of hitting this assertion in
netif_rx_complete(), inlined in e1000_clean():

        BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));

 [<c0431162>] __queue_work+0x51/0x5e
 [<c059eea1>] et_rx_action+0x94/0x185
 [<c042837d>] __do_softirq+0x5d/0xba
 [<c0407837>] do_softirq+0x59/0xb1
 [<c04281e9>] local_bh_enable_ip+0x35/0x40
 [<c059e46b>] dev_open+0x44/0x62
 [<c059ce8c>] dev_change_flags+0x46/0xe3
 [<c05d9e09>] devinet_ioctl+0x250/0x56a

The second function is "net_rx_action", corrupted
by the serial connection. 

The source file has four extra lines at the top because of a
trivial wireless patch, so 898 in that code is really 894 in
the stock kernel.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240339

* Re: e1000: assertion hit in e1000_clean(), kernel 2.6.21.1
  2007-05-18 22:33 e1000: assertion hit in e1000_clean(), kernel 2.6.21.1 Chuck Ebbert
@ 2007-05-18 23:03 ` Kok, Auke
  2007-05-18 23:18   ` Curtis Doty
  2007-05-20 10:55   ` Herbert Xu
  0 siblings, 2 replies; 5+ messages in thread
From: Kok, Auke @ 2007-05-18 23:03 UTC (permalink / raw)
  To: Chuck Ebbert; +Cc: Netdev

Chuck Ebbert wrote:
> We have several reports now of hitting this assertion in
> netif_rx_complete(), inlined in e1000_clean():
> 
>         BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
> 
>  [<c0431162>] __queue_work+0x51/0x5e
>  [<c059eea1>] et_rx_action+0x94/0x185
>  [<c042837d>] __do_softirq+0x5d/0xba
>  [<c0407837>] do_softirq+0x59/0xb1
>  [<c04281e9>] local_bh_enable_ip+0x35/0x40
>  [<c059e46b>] dev_open+0x44/0x62
>  [<c059ce8c>] dev_change_flags+0x46/0xe3
>  [<c05d9e09>] devinet_ioctl+0x250/0x56a
> 
> The second function is "net_rx_action", corrupted
> by the serial connection. 
> 
> The source file has four extra lines at the top because of a
> trivial wireless patch, so 898 in that code is really 894 in
> the stock kernel.

please share that code then.

> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240339

this is lacking a lot of debugging info. Please post *all* the dmesg output, 
lspci -vvv, ethtool -e ethX, etc. in the bugzilla.

Auke

* Re: e1000: assertion hit in e1000_clean(), kernel 2.6.21.1
  2007-05-18 23:03 ` Kok, Auke
@ 2007-05-18 23:18   ` Curtis Doty
  2007-05-20 10:55   ` Herbert Xu
  1 sibling, 0 replies; 5+ messages in thread
From: Curtis Doty @ 2007-05-18 23:18 UTC (permalink / raw)
  To: Netdev

4:03pm Kok, Auke said:

> Chuck Ebbert wrote:
>>  We have several reports now of hitting this assertion in
>>  netif_rx_complete(), inlined in e1000_clean():
>>
>>          BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
>>
>>   [<c0431162>] __queue_work+0x51/0x5e
>>   [<c059eea1>] et_rx_action+0x94/0x185
>>   [<c042837d>] __do_softirq+0x5d/0xba
>>   [<c0407837>] do_softirq+0x59/0xb1
>>   [<c04281e9>] local_bh_enable_ip+0x35/0x40
>>   [<c059e46b>] dev_open+0x44/0x62
>>   [<c059ce8c>] dev_change_flags+0x46/0xe3
>>   [<c05d9e09>] devinet_ioctl+0x250/0x56a
>>
>>  The second function is "net_rx_action", corrupted
>>  by the serial connection.
>>
>>  The source file has four extra lines at the top because of a
>>  trivial wireless patch, so 898 in that code is really 894 in
>>  the stock kernel.
>
> please share that code then.
>

http://cvs.fedora.redhat.com/viewcvs/devel/kernel/linux-2.6-wireless.patch


* Re: e1000: assertion hit in e1000_clean(), kernel 2.6.21.1
  2007-05-18 23:03 ` Kok, Auke
  2007-05-18 23:18   ` Curtis Doty
@ 2007-05-20 10:55   ` Herbert Xu
  2007-05-20 19:03     ` Kok, Auke
  1 sibling, 1 reply; 5+ messages in thread
From: Herbert Xu @ 2007-05-20 10:55 UTC (permalink / raw)
  To: Kok, Auke; +Cc: cebbert, netdev

Kok, Auke <auke-jan.h.kok@intel.com> wrote:
>
>> The source file has four extra lines at the top because of a
>> trivial wireless patch, so 898 in that code is really 894 in
>> the stock kernel.
> 
> please share that code then.

I've had a look and e1000 is definitely buggy.

The problem is that you're calling netif_poll_enable on startup.
This is *wrong*.

netif_poll_enable can only be called if you've previously called
netif_poll_disable.  Otherwise a poll might already be in action
and you may get a crash like this.

So perhaps you should divide e1000_up into two sections, one that
is called on both start and restart and another which is only
called on restart (i.e., after e1000_down).
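
To make the failure mode concrete, here is a minimal userspace sketch in
C. It is a toy model, not kernel code: every model_* name is hypothetical,
and the single boolean stands in for the __LINK_STATE_RX_SCHED bit in
dev->state. It assumes the 2.6.21-era behaviour in which scheduling a poll
sets that bit, netif_poll_disable takes the bit, and netif_poll_enable
clears it unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. All model_* names are hypothetical, not kernel APIs. */
struct model_dev {
    bool rx_sched; /* stands in for __LINK_STATE_RX_SCHED in dev->state */
};

/* Models scheduling a poll: claim the bit if nobody else holds it. */
static bool model_rx_schedule(struct model_dev *dev)
{
    if (dev->rx_sched)
        return false;
    dev->rx_sched = true;
    return true;
}

/* Models netif_rx_complete(): returns false where the kernel would hit
 * BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state)). */
static bool model_rx_complete(struct model_dev *dev)
{
    if (!dev->rx_sched)
        return false;
    dev->rx_sched = false;
    return true;
}

/* Models netif_poll_disable(): take the bit so no poll can be scheduled.
 * The real helper waits for the bit; this model just reports failure. */
static bool model_poll_disable(struct model_dev *dev)
{
    return model_rx_schedule(dev);
}

/* Models netif_poll_enable(): clear the bit without checking who owns it. */
static void model_poll_enable(struct model_dev *dev)
{
    dev->rx_sched = false;
}

/* Startup path: enable is called with no prior disable while a poll is
 * already scheduled. The poll then completes with the bit cleared. */
static bool model_enable_without_disable(void)
{
    struct model_dev dev = { .rx_sched = false };

    model_rx_schedule(&dev);        /* interrupt schedules a poll */
    model_poll_enable(&dev);        /* startup clears the bit under it */
    return model_rx_complete(&dev); /* false: the assertion would fire */
}

/* Restart path: disable first (as after e1000_down), then enable. */
static bool model_disable_then_enable(void)
{
    struct model_dev dev = { .rx_sched = false };

    if (!model_poll_disable(&dev))
        return false;
    model_poll_enable(&dev);        /* releases the bit taken by disable */
    if (!model_rx_schedule(&dev))
        return false;
    return model_rx_complete(&dev); /* true: bit is still set at completion */
}
```

In this model, model_enable_without_disable() returns false (the point at
which the kernel assertion would trigger), while
model_disable_then_enable() returns true. That is the reason for keeping
the enable call only in the part of e1000_up that runs on restart.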

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: e1000: assertion hit in e1000_clean(), kernel 2.6.21.1
  2007-05-20 10:55   ` Herbert Xu
@ 2007-05-20 19:03     ` Kok, Auke
  0 siblings, 0 replies; 5+ messages in thread
From: Kok, Auke @ 2007-05-20 19:03 UTC (permalink / raw)
  To: Herbert Xu; +Cc: cebbert, netdev, Jesse Brandeburg

Herbert Xu wrote:
> Kok, Auke <auke-jan.h.kok@intel.com> wrote:
>>> The source file has four extra lines at the top because of a
>>> trivial wireless patch, so 898 in that code is really 894 in
>>> the stock kernel.
>> please share that code then.
> 
> I've had a look and e1000 is definitely buggy.
> 
> The problem is that you're calling netif_poll_enable on startup.
> This is *wrong*.
> 
> netif_poll_enable can only be called if you've previously called
> netif_poll_disable.  Otherwise a poll might already be in action
> and you may get a crash like this.
> 
> So perhaps you should divide e1000_up into two sections, one that
> is called on both start and restart and another which is only
> called on restart (i.e., after e1000_down).

OK, that would explain the recent frenzy of reports in this matter. That code 
was only recently merged. I will dig into this and get a patch out as soon as I 
can so you can test this.

Thanks Herbert.

Auke
