netdev.vger.kernel.org archive mirror
From: Matheos Worku <Matheos.Worku@Sun.COM>
To: Jesper Krogh <jesper@krogh.cc>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: niu driver - Transmit timed out - 2.6.29
Date: Fri, 27 Mar 2009 17:42:14 -0700
Message-ID: <49CD7266.1070002@sun.com>
In-Reply-To: <49CD2996.60502@krogh.cc>

Jesper Krogh wrote:
> Jesper Krogh wrote:
>> Ok. I was just so happy .. (See "Status update on Sun Neptune 10Gbit 
>> driver" earlier).
>>
>> But then it "blew up" again:
>>
>> Mar 26 13:25:49 hest kernel: [25335.505049] ------------[ cut here 
>> ]------------
>> Mar 26 13:25:49 hest kernel: [25335.505055] WARNING: at 
>> net/sched/sch_generic.c:226 dev_watchdog+0x1fd/0x210()
>> Mar 26 13:25:49 hest kernel: [25335.505057] Hardware name: Sun Fire 
>> X4600 M2
>> Mar 26 13:25:49 hest kernel: [25335.505059] NETDEV WATCHDOG: eth4 
>> (niu): transmit timed out
>> Mar 26 13:25:49 hest kernel: [25335.505060] Modules linked in: 
>> af_packet ext4 jbd2 crc16 nfsd exportfs autofs4 nfs lockd auth_rpcgss 
>> sunrpc iptable_filter ip_tables x_tables ib_iser rdma_cm ib_cm iw_cm 
>> ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi 
>> scsi_transport_iscsi ipv6 parport_pc lp parport loop sr_mod joydev 
>> psmouse niu usb_storage usbhid i2c_nforce2 libusual hid serio_raw 
>> pcspkr shpchp k8temp pci_hotplug i2c_core button evdev ext3 jbd 
>> mbcache ide_cd_mod cdrom sg sd_mod ata_generic libata mptsas mptspi 
>> mptscsih qla2xxx mptbase scsi_transport_sas scsi_transport_fc 
>> ehci_hcd scsi_transport_spi ohci_hcd e1000 scsi_mod amd74xx usbcore 
>> dm_mirror dm_region_hash dm_log dm_snapshot dm_mod thermal processor 
>> fan thermal_sys fuse
>> Mar 26 13:25:49 hest kernel: [25335.505109] Pid: 0, comm: swapper Not 
>> tainted 2.6.29 #30
>> Mar 26 13:25:49 hest kernel: [25335.505111] Call Trace:
>> Mar 26 13:25:49 hest kernel: [25335.505113]  <IRQ>  
>> [<ffffffff8023d5c2>] warn_slowpath+0xf2/0x130
>> Mar 26 13:25:49 hest kernel: [25335.505124]  [<ffffffff80239d2d>] 
>> task_tick_fair+0x4d/0xd0
>> Mar 26 13:25:49 hest kernel: [25335.505130]  [<ffffffff80355e33>] 
>> cpumask_next_and+0x23/0x40
>> Mar 26 13:25:49 hest kernel: [25335.505132]  [<ffffffff80233f84>] 
>> find_busiest_group+0x204/0x870
>> Mar 26 13:25:49 hest kernel: [25335.505136]  [<ffffffff8035b65e>] 
>> strlcpy+0x4e/0x80
>> Mar 26 13:25:49 hest kernel: [25335.505138]  [<ffffffff8041f11d>] 
>> dev_watchdog+0x1fd/0x210
>> Mar 26 13:25:49 hest kernel: [25335.505141]  [<ffffffff80235ac5>] 
>> run_rebalance_domains+0x3c5/0x530
>> Mar 26 13:25:49 hest kernel: [25335.505143]  [<ffffffff802474bb>] 
>> run_timer_softirq+0x1bb/0x230
>> Mar 26 13:25:49 hest kernel: [25335.505148]  [<ffffffff802574e1>] 
>> sched_clock_cpu+0x131/0x180
>> Mar 26 13:25:49 hest kernel: [25335.505151]  [<ffffffff80242cdb>] 
>> __do_softirq+0x8b/0x150
>> Mar 26 13:25:49 hest kernel: [25335.505155]  [<ffffffff8020d3bc>] 
>> call_softirq+0x1c/0x30
>> Mar 26 13:25:49 hest kernel: [25335.505157]  [<ffffffff8020e505>] 
>> do_softirq+0x35/0x80
>> Mar 26 13:25:49 hest kernel: [25335.505161]  [<ffffffff8021f715>] 
>> smp_apic_timer_interrupt+0x85/0xd0
>> Mar 26 13:25:49 hest kernel: [25335.505163]  [<ffffffff8020cdf3>] 
>> apic_timer_interrupt+0x13/0x20
>> Mar 26 13:25:49 hest kernel: [25335.505164]  <EOI>  
>> [<ffffffff80212dc7>] default_idle+0x27/0x40
>> Mar 26 13:25:49 hest kernel: [25335.505169]  [<ffffffff80212fea>] 
>> c1e_idle+0xba/0x100
>> Mar 26 13:25:49 hest kernel: [25335.505171]  [<ffffffff8020ae80>] 
>> cpu_idle+0x40/0x70
>> Mar 26 13:25:49 hest kernel: [25335.505173] ---[ end trace 
>> e6e4f250dc22390d ]---
>
> There was actually a bit more in the log:
>
> Mar 26 13:25:49 hest kernel: [25335.505176] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:25:49 hest kernel: [25335.587191] niu 0000:84:00.0: niu: 
> eth4: bits (40000000) of register RXDMA_CFIG1 would not clear, 
> val[c0000000]
> Mar 26 13:25:49 hest last message repeated 4 times
> Mar 26 13:25:58 hest kernel: [25345.504898] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:08 hest kernel: [25355.504758] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:13 hest kernel: [25360.504687] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:18 hest kernel: [25365.504619] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:23 hest kernel: [25370.504549] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:28 hest kernel: [25375.504479] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:33 hest kernel: [25380.504409] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
> Mar 26 13:26:38 hest kernel: [25385.504340] niu 0000:84:00.0: niu: 
> eth4: Transmit timed out, resetting
>
> This is probably the interesting part:
> Mar 26 13:25:49 hest kernel: [25335.587191] niu 0000:84:00.0: niu: 
> eth4: bits (40000000) of register RXDMA_CFIG1 would not clear, 
> val[c0000000]
Jesper,

One of the RX ring DMAs is failing to reset. I guess whatever is
hanging the TX side is affecting the RX side as well. Can you run lspci
on this PCI function (0000:84:00.0) and its siblings?
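
To make the log line concrete: the message says the reset code waited for
mask 0x40000000 to clear in RXDMA_CFIG1 and the register still read
0xc0000000. Below is a minimal sketch of that kind of poll-until-clear check,
run against a simulated register. The bit names, the fake register and the
helper names are assumptions made up for illustration (only the mask and the
value come from the log); this is not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout, inferred only from the log line
 * (mask 0x40000000, value 0xc0000000). Names are illustrative. */
#define CFIG1_EN  0x80000000ULL  /* assumed: channel enable, bit 31 */
#define CFIG1_RST 0x40000000ULL  /* assumed: reset request, bit 30 */

/* Hypothetical stand-in for a hardware register read. */
typedef uint64_t (*reg_read_fn)(void *ctx);

/* Read the register up to `limit` times and stop as soon as every bit in
 * `mask` reads back as zero. Returns 0 on success, -1 if the bits never
 * cleared; *last always holds the final value read. */
static int wait_bits_clear(reg_read_fn rd, void *ctx, uint64_t mask,
                           int limit, uint64_t *last)
{
    uint64_t val = 0;

    assert(rd != NULL && last != NULL);
    while (limit-- > 0) {
        val = rd(ctx);
        if (!(val & mask)) {
            *last = val;
            return 0;
        }
    }
    *last = val;
    return -1;
}

/* Simulated register: the RST bit self-clears on the Nth read, or never
 * clears when reads_until_clear is 0 (the "stuck" case from the log). */
struct fake_reg {
    uint64_t val;
    int reads_until_clear;
};

static uint64_t fake_read(void *ctx)
{
    struct fake_reg *r = ctx;

    if (r->reads_until_clear > 0 && --r->reads_until_clear == 0)
        r->val &= ~CFIG1_RST;
    return r->val;
}
```

With a fake register that starts at 0xc0000000 and never self-clears,
wait_bits_clear(fake_read, &reg, CFIG1_RST, 5, &last) returns -1 and leaves
last at 0xc0000000, which is the situation the log reports: the reset bit is
still set together with the bit above it.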
Regards
Matheos

>
> Any suggestions?
>
> Is this perhaps just broken hardware.. or a driver issue?  (I had the 
> Sun nxge driver working for around 180 days on the same card.. so I 
> would assume the hardware is ok).
>
> Jesper



Thread overview: 6+ messages
2009-03-26 12:44 niu driver - Transmit timed out - 2.6.29 Jesper Krogh
2009-03-27 19:31 ` Jesper Krogh
2009-03-28  0:42   ` Matheos Worku [this message]
2009-03-28  6:05     ` Jesper Krogh
2009-03-28  6:18       ` Matheos Worku
2009-03-28  7:25         ` Jesper Krogh
