From: "Stephen D. Williams" <sdw@lig.net>
To: linux-kernel@vger.kernel.org
Cc: Matti Aarnio <matti.aarnio@zmailer.org>,
	Daniel Walker <dwalker@mvista.com>
Subject: Re: Soft lockup in e100 driver ?
Date: Wed, 10 Aug 2005 20:32:45 -0400
Message-ID: <42FA9CAD.7030607@lig.net>
In-Reply-To: <42FA9C02.3030406@lig.net>

[-- Attachment #1: Type: text/plain, Size: 6060 bytes --]

I just noticed that the Ubuntu setup says "GSI 20(level,low) -> IRQ 20" 
whereas I remember my built kernels saying "No GSI......  IRQ 11".  I'll 
investigate what that means and how to enable it.  Pointers appreciated.

sdw

Stephen D. Williams wrote:

> I have been working for days to get a recent kernel to work with these 
> small-format UP Celeron 2GHz (running at 1.33GHz) motherboards that I 
> am planning to use as thin clients.  I'm doing a PXE boot, loading 
> kernels, and trying to get networking to come up.
>
> I eventually realized that the problem is that the e100 driver loads 
> but does not allow any packet traffic.  The system isn't crashed, but 
> I do get transmit timeouts.
>
> I've used kernels: 2.6.10, 2.6.11, and 2.6.12.4, stock with only the 
> "squashfs" patch applied and compiled as 586/....
>
> The interesting thing is that Ubuntu 5.04, booted "Live" on the box, 
> works just fine with the e100 driver with a kernel shown as: 
> "2.6.10-5-386".  I'm going to work on pulling this kernel and its 
> modules off to use.
>
> Any help urgently appreciated.
>
> sdw
>
> Matti Aarnio wrote:
>
>> On Tue, Aug 09, 2005 at 09:16:21AM -0700, Daniel Walker wrote:
>>  
>>
>>> It looks like this might be an SMP race; it seems that both processors
>>> are in e100_down(). There is a while loop in e100_clean_cbs() that
>>> appears to have an unsafe looping condition.
>>> It looks like cbs_avail might jump over params.cbs.count, and then you
>>> would have to wait for a rollover. Is this a PREEMPT_NONE kernel?
>>>   
>>
>>
>>  # CONFIG_PREEMPT is not set
>>  # CONFIG_PREEMPT_BKL is not set
>>
>> which is probably the same as "NONE".
>>
>> There is _one_ processor in down, but the other may be trying to send
>> some data out, or otherwise polling the card.
>>
>> However...  while real bugs in their own right, none of these are
>> as important as the original "card dies" thing, during the recovery
>> from which all this soft-lockup merriment happens.
>>
>> Also, as it happens only once a week or so (except when it happens
>> right after another), testing code patches is rather slow.
>> I can guess which things make it more likely, but I can't make it
>> happen at will.
>>
>>  /Matti Aarnio
>>
>>
>>  
>>
>>> This patch may help, but it's not a complete fix.
>>>
>>> --- linux-2.6.12.orig/drivers/net/e100.c        2005-08-05 16:45:59.000000000 +0000
>>> +++ linux-2.6.12/drivers/net/e100.c     2005-08-09 16:14:45.000000000 +0000
>>> @@ -1393,7 +1393,7 @@ static inline int e100_tx_clean(struct n
>>> static void e100_clean_cbs(struct nic *nic)
>>> {
>>>        if(nic->cbs) {
>>> -               while(nic->cbs_avail != nic->params.cbs.count) {
>>> +               while(nic->cbs_avail < nic->params.cbs.count) {
>>>                        struct cb *cb = nic->cb_to_clean;
>>>                        if(cb->skb) {
>>>                                pci_unmap_single(nic->pdev,
>>>
>>>
>>>
>>> On Tue, 2005-08-09 at 16:36 +0300, Matti Aarnio wrote:
>>>   
>>>
>>>> Running a very recent Fedora Core Development kernel, I get the following
>>>> soft-oops..   ( 2.6.12-1.1455_FC5smp )
>>>>
>>>>
>>>> e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
>>>> BUG: soft lockup detected on CPU#0!
>>>>
>>>> Pid: 10743, comm:             ifconfig
>>>> EIP: 0060:[<f88bf2f9>] CPU: 0
>>>> EIP is at e100_clean_cbs+0x2f/0x12b [e100]
>>>> EFLAGS: 00000293    Not tainted  (2.6.12-1.1455_FC5smp)
>>>> EAX: 495c7c2b EBX: 495c7c2b ECX: f6c311a0 EDX: 00000000
>>>> ESI: 00000040 EDI: f6c30000 EBP: f71a4b20 DS: 007b ES: 007b
>>>> CR0: 8005003b CR2: 0804a544 CR3: 01e9cd80 CR4: 000006f0
>>>> [<f88c0708>] e100_down+0x66/0x9a [e100]
>>>> [<f88c1623>] e100_close+0xa/0xd [e100]
>>>> [<c02b7adb>] dev_close+0x40/0x7e
>>>> [<c02b8f59>] dev_change_flags+0x46/0xf5
>>>> [<c02f76b3>] devinet_ioctl+0x564/0x5df
>>>> [<c02af22c>] sock_ioctl+0xc3/0x250
>>>> [<c02af169>] sock_ioctl+0x0/0x250
>>>> [<c01762ef>] do_ioctl+0x1f/0x6d
>>>> [<c017648f>] vfs_ioctl+0x50/0x1c6
>>>> [<c0176662>] sys_ioctl+0x5d/0x6f
>>>> [<c010394d>] syscall_call+0x7/0xb
>>>> [<c014473f>] softlockup_tick+0x6f/0x80
>>>> [<c01085b8>] timer_interrupt+0x2d/0x75
>>>> [<c01448dd>] handle_IRQ_event+0x2e/0x5a
>>>> [<c01449cb>] __do_IRQ+0xc2/0x127
>>>> [<c0105f7e>] do_IRQ+0x4e/0x86
>>>> =======================
>>>> [<c01160cc>] smp_apic_timer_interrupt+0xc1/0xca
>>>> [<c0104382>] common_interrupt+0x1a/0x20
>>>> [<f88bf2f9>] e100_clean_cbs+0x2f/0x12b [e100]
>>>> [<f88c0708>] e100_down+0x66/0x9a [e100]
>>>> [<f88c1623>] e100_close+0xa/0xd [e100]
>>>> [<c02b7adb>] dev_close+0x40/0x7e
>>>> [<c02b8f59>] dev_change_flags+0x46/0xf5
>>>> [<c02f76b3>] devinet_ioctl+0x564/0x5df
>>>> [<c02af22c>] sock_ioctl+0xc3/0x250
>>>> [<c02af169>] sock_ioctl+0x0/0x250
>>>> [<c01762ef>] do_ioctl+0x1f/0x6d
>>>> [<c017648f>] vfs_ioctl+0x50/0x1c6
>>>> [<c0176662>] sys_ioctl+0x5d/0x6f
>>>> [<c010394d>] syscall_call+0x7/0xb
>>>>
>>>>
>>>>
>>>> Preconditions for this are:
>>>>
>>>> - E100 card stopped working for some reason (no idea why, it just
>>>>  does sometimes on this oldish 2x P-III machine)
>>>> - There are active datastreams running in and out
>>>>  (around 0.2 Mbps out, multiple megabits in.)
>>>> - Then running "ifconfig eth0 down" results in what feels like 
>>>>  the system freezing, but it does recover in about 30-60 seconds
>>>>  (it takes long enough for me to sweat bullets...)
>>>> - While in the frozen state, the keyboard can go crazy, but the mouse
>>>>  does respond, and tvtime still shows bt848 captured live video.
>>>> -
>>>> To unsubscribe from this list: send the line "unsubscribe 
>>>> linux-kernel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>>     
>>>
>>  
>>
>
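
For anyone following the quoted patch, here is a minimal user-space sketch of
why the "!=" loop condition in e100_clean_cbs() could spin for a long time
while "<" would not.  It is not driver code: struct fake_nic, clean_ne(),
clean_lt() and the iteration cap are made up for illustration, and it assumes
the counter is an unsigned int that is incremented once per loop pass.  If
cbs_avail ever ends up above the count, the "!=" form only terminates after
the counter wraps around, which may be related to the rollover wait described
above.

```c
/*
 * Hypothetical stand-in for the two counters discussed in the thread.
 * This is NOT the driver's real struct nic; field types are assumed.
 */
struct fake_nic {
    unsigned int cbs_avail;  /* assumed: control blocks currently free */
    unsigned int cbs_count;  /* assumed: total control blocks (params.cbs.count) */
};

/*
 * Original condition: loop until the counter is exactly equal to the count.
 * 'cap' bounds the simulation so it returns even when the real loop would
 * need to wrap the whole unsigned range first.
 */
static unsigned long clean_ne(struct fake_nic n, unsigned long cap)
{
    unsigned long iters = 0;

    while (n.cbs_avail != n.cbs_count && iters < cap) {
        n.cbs_avail++;       /* one control block "cleaned" per pass */
        iters++;
    }
    return iters;
}

/*
 * Patched condition: loop only while the counter is below the count,
 * so an overshoot ends the loop immediately.
 */
static unsigned long clean_lt(struct fake_nic n, unsigned long cap)
{
    unsigned long iters = 0;

    while (n.cbs_avail < n.cbs_count && iters < cap) {
        n.cbs_avail++;
        iters++;
    }
    return iters;
}
```

With cbs_avail = 60 and a count of 64, both variants stop after 4 passes.
With cbs_avail = 65 (one past the count), the "<" variant does no passes at
all, while the "!=" variant keeps going until it hits whatever cap is given.
As noted in the thread, the patch is not a complete fix: it bounds the loop
but does not explain how the counter got out of step in the first place.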


[-- Attachment #2: sdw.vcf --]
[-- Type: text/x-vcard, Size: 234 bytes --]

begin:vcard
fn:Stephen Williams
n:Williams;Stephen
email;internet:sdw@lig.net
tel;work:703-724-0118
tel;fax:703-995-0407
tel;pager:sdwpage@lig.net
tel;home:703-729-5405
tel;cell:703-371-9362
x-mozilla-html:TRUE
version:2.1
end:vcard

