From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul Aviles"
Subject: Re: e1000 Detected Tx Unit Hang
Date: Sat, 16 Sep 2006 22:05:47 -0400
Message-ID: <000301c6d9fd$ca77a5f0$3224050a@avilespaxp>
References: <002c01c6ce9d$a1cf9100$3224050a@avilespaxp>
 <4807377b0609031045w67f70a3ese6bea93c15f75ba2@mail.gmail.com>
 <000d01c6cfb1$f0d26880$3224050a@avilespaxp>
 <4807377b0609050909v59c1ad87jc4ef08ba1f4453d2@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=response
Content-Transfer-Encoding: 7bit
Return-path:
Received: from dsl-7-36.cofs.net ([68.142.7.36]:16195 "EHLO www.palei.com")
 by vger.kernel.org with ESMTP id S964906AbWIQCFx (ORCPT );
 Sat, 16 Sep 2006 22:05:53 -0400
To: "Jesse Brandeburg" ,
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

Jesse, today the server froze and was not able to see anything in the logs.
Nothing at all about any error, just plain froze. Just in case, this is a
different unit altogether, still the same model as the units having the Tx
Unit Hang, but different memory, motherboard and CPU. The only 1 thing that
is the same is the hard drive a regular IDE...

The only one thing I noticed that is very weird to me at least is that in
powering off the unit from the crash and rebooting it I saw some lines like
this in the logs..

Sep 16 11:08:03 www kernel: checking if image is initramfs... it is
Sep 16 07:05:19 www sysctl: kernel.msgmnb = 65536

The odd part is the diff in the time stamps between one entry and the very
next one in the log. Any ideas what can cause this?

Also, any way to get a dump or some way to prevent the system from locking
without any log entries?

Regards,

Paul

----- Original Message -----
From: "Jesse Brandeburg"
To: "Paul Aviles"
Cc:
Sent: Tuesday, September 05, 2006 12:09 PM
Subject: Re: e1000 Detected Tx Unit Hang

> On 9/3/06, Paul Aviles wrote:
>> Hey Jesse, thanks for your reply. Here is the stuff on /procs. The weird
>
> no problem,
>
>> part is that I have several other identical systems and only one is
>> affected. Today I moved the hard drive to another similar system and I am
>> not seeing the problem so I am wondering if is something maybe wrong with
>> the card eeprom? Is there a way to check that?
>
> I doubt it is an eeprom problem. you can dump the eeproms with
> ethtool -e eth0 from both machines and compare them . Odd that only
> one system is having the problem. Could it be that the hardware on
> that box is having issues? Are you sure the machines are running the
> same bios version with the same settings? Any overclocking?
>
>> cat /proc/interrupts
>>            CPU0       CPU1
>>  16:      70540          0    IO-APIC-level  uhci_hcd:usb4, eth0
>
> this could contribute to your problem, were you able to test without NAPI?
>
> Jesse
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
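
For the EEPROM comparison suggested in the quoted reply (dump with
ethtool -e eth0 on both machines and compare), here is a minimal sketch of
the compare step. It assumes each dump was first saved to a text file, e.g.
"ethtool -e eth0 > good.txt" on one box and "ethtool -e eth0 > bad.txt" on
the other; those file names and the helper diff_dumps are hypothetical, not
anything from the thread. It only diffs the text line by line and makes no
assumption about the dump layout.

```python
import difflib
import sys


def diff_dumps(lines_a, lines_b, name_a="machine-a", name_b="machine-b"):
    """Return unified-diff lines between two text dumps.

    The result is an empty list when the dumps are identical.
    """
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
        )
    )


def main(path_a, path_b):
    # Read both saved dumps, dropping trailing newlines so the diff is clean.
    with open(path_a) as fa, open(path_b) as fb:
        a = [line.rstrip("\n") for line in fa]
        b = [line.rstrip("\n") for line in fb]
    delta = diff_dumps(a, b, path_a, path_b)
    if delta:
        print("\n".join(delta))
    else:
        print("dumps are identical")
    # Non-zero exit status when the dumps differ, like diff(1).
    return 1 if delta else 0


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

Run it as "python eeprom_diff.py good.txt bad.txt" (script name is also
hypothetical). Some difference is expected even between healthy cards, since
each adapter stores its own MAC address in the EEPROM, so the lines worth a
closer look are the ones that differ beyond that.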