From: "Stephen D. Williams" <sdw@lig.net>
To: Matti Aarnio <matti.aarnio@zmailer.org>
Cc: Daniel Walker <dwalker@mvista.com>, linux-kernel@vger.kernel.org
Subject: Re: Soft lockup in e100 driver ?
Date: Wed, 10 Aug 2005 20:29:54 -0400
Message-ID: <42FA9C02.3030406@lig.net>
In-Reply-To: <20050809163632.GQ22165@mea-ext.zmailer.org>

I have been working for days to get a recent kernel to work with these
small-format UP Celeron 2GHz (running at 1.33GHz) motherboards that I am
planning to use as thin clients.  I'm doing a PXE boot, loading kernels,
and trying to get networking to come up.

I eventually realized that the problem is that the e100 driver loads but
does not pass any packet traffic.  The system doesn't crash, but I do
get transmit timeouts.

I've tried kernels 2.6.10, 2.6.11, and 2.6.12.4, each stock with only the
"squashfs" patch applied and compiled as 586/....

The interesting thing is that Ubuntu 5.04, booted "Live" on the box, 
works just fine with the e100 driver with a kernel shown as: 
"2.6.10-5-386".  I'm going to work on pulling this kernel and its 
modules off to use.

Any help urgently appreciated.

sdw

Matti Aarnio wrote:

>On Tue, Aug 09, 2005 at 09:16:21AM -0700, Daniel Walker wrote:
>  
>
>>It looks like this might be an SMP race; it seems that both processors
>>are in e100_down(). There is a while loop in e100_clean_cbs() that
>>appears to have an unsafe looping condition.
>>
>>It looks like cbs_avail might jump over params.cbs.count, then you
>>would have to wait for a rollover. Is this a PREEMPT_NONE kernel?
>>    
>>
>
>  # CONFIG_PREEMPT is not set
>  # CONFIG_PREEMPT_BKL is not set
>
>which is probably same as "NONE".
>
>There is _one_ processor in down, but the other may be trying to send
>some data out, or otherwise polling the card.
>
>However...  while real bugs in their own right, none of these are
>as important as the original "card dies" thing, during a recovery of
>which all this soft-lockup merriment happens.
>
>Also, as it happens only once a week or so (except when it happens
>right after another), testing code patches is rather slow.
>I can guess which things make it more likely, but I can't make it
>happen at will.
>
>  /Matti Aarnio
>
>
>  
>
>>This patch may help, but it's not a complete fix.
>>
>>--- linux-2.6.12.orig/drivers/net/e100.c        2005-08-05 16:45:59.000000000 +0000
>>+++ linux-2.6.12/drivers/net/e100.c     2005-08-09 16:14:45.000000000 +0000
>>@@ -1393,7 +1393,7 @@ static inline int e100_tx_clean(struct n
>> static void e100_clean_cbs(struct nic *nic)
>> {
>>        if(nic->cbs) {
>>-               while(nic->cbs_avail != nic->params.cbs.count) {
>>+               while(nic->cbs_avail < nic->params.cbs.count) {
>>                        struct cb *cb = nic->cb_to_clean;
>>                        if(cb->skb) {
>>                                pci_unmap_single(nic->pdev,
>>
>>
>>
>>On Tue, 2005-08-09 at 16:36 +0300, Matti Aarnio wrote:
>>    
>>
>>>Running a very recent Fedora Core Development kernel I get the following
>>>soft-oops..   ( 2.6.12-1.1455_FC5smp )
>>>
>>>
>>>e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
>>>BUG: soft lockup detected on CPU#0!
>>>
>>>Pid: 10743, comm:             ifconfig
>>>EIP: 0060:[<f88bf2f9>] CPU: 0
>>>EIP is at e100_clean_cbs+0x2f/0x12b [e100]
>>> EFLAGS: 00000293    Not tainted  (2.6.12-1.1455_FC5smp)
>>>EAX: 495c7c2b EBX: 495c7c2b ECX: f6c311a0 EDX: 00000000
>>>ESI: 00000040 EDI: f6c30000 EBP: f71a4b20 DS: 007b ES: 007b
>>>CR0: 8005003b CR2: 0804a544 CR3: 01e9cd80 CR4: 000006f0
>>> [<f88c0708>] e100_down+0x66/0x9a [e100]
>>> [<f88c1623>] e100_close+0xa/0xd [e100]
>>> [<c02b7adb>] dev_close+0x40/0x7e
>>> [<c02b8f59>] dev_change_flags+0x46/0xf5
>>> [<c02f76b3>] devinet_ioctl+0x564/0x5df
>>> [<c02af22c>] sock_ioctl+0xc3/0x250
>>> [<c02af169>] sock_ioctl+0x0/0x250
>>> [<c01762ef>] do_ioctl+0x1f/0x6d
>>> [<c017648f>] vfs_ioctl+0x50/0x1c6
>>> [<c0176662>] sys_ioctl+0x5d/0x6f
>>> [<c010394d>] syscall_call+0x7/0xb
>>> [<c014473f>] softlockup_tick+0x6f/0x80
>>> [<c01085b8>] timer_interrupt+0x2d/0x75
>>> [<c01448dd>] handle_IRQ_event+0x2e/0x5a
>>> [<c01449cb>] __do_IRQ+0xc2/0x127
>>> [<c0105f7e>] do_IRQ+0x4e/0x86
>>> =======================
>>> [<c01160cc>] smp_apic_timer_interrupt+0xc1/0xca
>>> [<c0104382>] common_interrupt+0x1a/0x20
>>> [<f88bf2f9>] e100_clean_cbs+0x2f/0x12b [e100]
>>> [<f88c0708>] e100_down+0x66/0x9a [e100]
>>> [<f88c1623>] e100_close+0xa/0xd [e100]
>>> [<c02b7adb>] dev_close+0x40/0x7e
>>> [<c02b8f59>] dev_change_flags+0x46/0xf5
>>> [<c02f76b3>] devinet_ioctl+0x564/0x5df
>>> [<c02af22c>] sock_ioctl+0xc3/0x250
>>> [<c02af169>] sock_ioctl+0x0/0x250
>>> [<c01762ef>] do_ioctl+0x1f/0x6d
>>> [<c017648f>] vfs_ioctl+0x50/0x1c6
>>> [<c0176662>] sys_ioctl+0x5d/0x6f
>>> [<c010394d>] syscall_call+0x7/0xb
>>>
>>>
>>>
>>>Preconditions for this are:
>>>
>>>- E100 card stopped working for some reason (no idea why, it just
>>>  does sometimes at this oldish 2x P-III machine)
>>>- There are active datastreams running in and out
>>>  (around 0.2 Mbps out, multiple megabits in.)
>>>- Commanding then "ifconfig eth0 down" results in what feels like 
>>>  system freezing, but it does recover in about 30-60 seconds
>>>  (it takes long enough for me to sweat bullets...)
>>>- While in freeze state, keyboard can go crazy, but mouse does
>>>  respond, as well as tvtime shows bt848 captured live video.
>>>-
>>>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>the body of a message to majordomo@vger.kernel.org
>>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>Please read the FAQ at  http://www.tux.org/lkml/
>>>      
>>>
>  
>
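
For anyone following the loop-condition discussion quoted above, here is a
minimal user-space sketch of why a `!=` test can spin for a very long time
once the counter has stepped past its target, while a `<` test exits at once.
This is not driver code: `struct ring`, `spin_not_equal()`, `spin_less_than()`
and the `cap` argument are hypothetical names made up for illustration, and
the unsigned counter type is an assumption, not taken from e100.c.

```c
/* Hypothetical stand-in for the two fields compared in e100_clean_cbs(). */
struct ring {
    unsigned int avail;  /* models nic->cbs_avail (type is an assumption) */
    unsigned int count;  /* models nic->params.cbs.count */
};

/* Loop shaped like the original test; gives up after `cap` iterations
 * so the demonstration terminates. Returns the iterations performed. */
static unsigned long spin_not_equal(struct ring r, unsigned long cap)
{
    unsigned long n = 0;

    while (r.avail != r.count && n < cap) {
        r.avail++;  /* one control block "cleaned" per pass */
        n++;
    }
    return n;
}

/* Loop shaped like the patched test: stops as soon as avail >= count. */
static unsigned long spin_less_than(struct ring r, unsigned long cap)
{
    unsigned long n = 0;

    while (r.avail < r.count && n < cap) {
        r.avail++;
        n++;
    }
    return n;
}
```

Starting at avail=60 with count=64, both loops finish after 4 passes.
Starting at avail=65, the `!=` version runs until the cap (in the real loop
it would have to wait for the counter to wrap), and the `<` version does no
passes at all. As noted in the quoted mail, this only bounds the loop; it
does not explain how the counter got past the target in the first place.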

