* e1000 intel driver bug (which impacts nfs)
@ 2002-05-26 13:06 jason andrade
2002-06-02 22:28 ` Thomas Langås
0 siblings, 1 reply; 8+ messages in thread
From: jason andrade @ 2002-05-26 13:06 UTC (permalink / raw)
Hi,
I've spent many hours trying to diagnose and fix what I thought was an
NFS performance bug. I'm now 99% sure it is actually a bug in the Intel
E1000 driver for the Intel 1000T (or, for me at least, any copper Intel
Gigabit Ethernet adapter). It's present in both the older 3.X drivers
and the new 4.X driver, including 4.1.7 (the current version).
The symptom is that NFS simply "hangs": clients start to queue requests,
and at first we could not find anything on the server that would clear
this except a reboot. With more testing we were able to reproduce the
problem and resolve it by stopping NFS, downing the gigabit interface,
unloading the driver, reloading it, reconfiguring the interface and
restarting NFS. Within 2 minutes the clients would start responding again.
Someone else has told me he can achieve the same effect with an ifconfig
down, pause, ifconfig up on that interface, but to date this has not
worked for me.
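The recovery sequence described above can be sketched as a small script. This is only an illustration of the order of the steps, not the exact commands used on the server in question: the interface name `eth1`, the address and netmask, and the `/etc/init.d/nfs` init script path are all assumptions and will differ between systems.

```python
import subprocess


def recovery_commands(iface="eth1", module="e1000",
                      nfs_init="/etc/init.d/nfs",
                      address="192.168.0.1", netmask="255.255.255.0"):
    """Return the recovery steps, in order, as argv lists.

    All default values are placeholders (assumptions), not values
    taken from the report.
    """
    return [
        [nfs_init, "stop"],                      # stop NFS
        ["ifconfig", iface, "down"],             # down the gigabit interface
        ["rmmod", module],                       # unload the driver
        ["modprobe", module],                    # reload it
        ["ifconfig", iface, address,
         "netmask", netmask, "up"],              # reconfigure the interface
        [nfs_init, "start"],                     # restart NFS
    ]


def run(commands, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(recovery_commands(), dry_run=True)
```

Running it as is only prints the steps; passing `dry_run=False` would execute them and needs root.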
I hope this helps anyone else trying to debug mysterious "nfs hangs" under
2.4.X. It doesn't seem to be tickled unless you are doing quite large
amounts of nfs traffic (we're pushing 1-1.5T a day on this interface)
and it's quite random (i've had a lockup from 4 hours to 10 days after
a reboot)
I am still trying to work out why 8K NFS mounts over UDP do not work for
us (we are back to 1K now), and I want to try 8/16/32K mounts over TCP
instead. Since I now finally have a pure gigE network with a 9000 MTU for
the backend between servers, I'm hoping this might work a bit better.
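One reason the MTU matters for large UDP transfers is IP fragmentation: if any fragment of a datagram is lost, the whole datagram is lost. The arithmetic can be sketched as below. This is a rough illustration only: it assumes IPv4 without options, ignores the RPC/NFS header bytes on top of the data, and does not establish why 8K mounts fail in this particular setup. The function name is made up for the example.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes


def udp_fragments(payload_bytes, mtu):
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    datagram = UDP_HEADER + payload_bytes
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)


for mtu in (1500, 9000):
    for size_kb in (1, 8, 16, 32):
        count = udp_fragments(size_kb * 1024, mtu)
        print(f"MTU {mtu}: {size_kb}K transfer -> {count} fragment(s)")
```

With these assumptions an 8K transfer needs six fragments at a 1500 MTU but fits in a single packet at a 9000 MTU, while a 1K transfer fits in one packet either way.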
I'd also like to second Seth Vidal's comments about getting Neil, Trond
and co to provide a definitive (revised weekly? monthly?)
"this is what our recommended patchlist is, against which kernels, and why"
on the NFS list and/or as part of the FAQ.
It is increasingly hard to track the major NFS patch contributors to work
out what should be applied and what can wait, as well as figuring out the
patch dependencies.
cheers,
-jason
_______________________________________________________________
Don't miss the 2002 Sprint PCS Application Developer's Conference
August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: e1000 intel driver bug (which impacts nfs)
2002-05-26 13:06 e1000 intel driver bug (which impacts nfs) jason andrade
@ 2002-06-02 22:28 ` Thomas Langås
2002-06-03 2:30 ` seth vidal
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Thomas Langås @ 2002-06-02 22:28 UTC (permalink / raw)
To: jason andrade; +Cc: nfs
jason andrade:
> I hope this helps anyone else trying to debug mysterious "nfs hangs" under
> 2.4.X. It doesn't seem to be tickled unless you are doing quite large
> amounts of nfs traffic (we're pushing 1-1.5T a day on this interface)
> and it's quite random (i've had a lockup from 4 hours to 10 days after
> a reboot)
We also see NFS hangs when transferring large files (i.e. files around
300-400M; sometimes we have to go a bit higher, to 2GB-3GB files), but
it's always possible to trigger this. However, we don't need to jump
through hoops to "fix it": after a minute or so, it's OK again. It seems
to me like there's a VM problem or something.
We've got 2GB of memory on the machines that are suffering from these
problems.
--
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: e1000 intel driver bug (which impacts nfs)
2002-06-02 22:28 ` Thomas Langås
@ 2002-06-03 2:30 ` seth vidal
2002-06-03 7:48 ` Trond Myklebust
2002-06-03 8:36 ` Ryan Sweet
2 siblings, 0 replies; 8+ messages in thread
From: seth vidal @ 2002-06-03 2:30 UTC (permalink / raw)
To: nfs; +Cc: jason andrade
On Sun, 2002-06-02 at 18:28, Thomas Langås wrote:
> jason andrade:
> > I hope this helps anyone else trying to debug mysterious "nfs hangs" under
> > 2.4.X. It doesn't seem to be tickled unless you are doing quite large
> > amounts of nfs traffic (we're pushing 1-1.5T a day on this interface)
> > and it's quite random (i've had a lockup from 4 hours to 10 days after
> > a reboot)
>
> We also see NFS hangs when transferring large files (i.e. files around
> 300-400M; sometimes we have to go a bit higher, to 2GB-3GB files), but
> it's always possible to trigger this. However, we don't need to jump
> through hoops to "fix it": after a minute or so, it's OK again. It seems
> to me like there's a VM problem or something.
>
> We've got 2GB of memory on the machines that are suffering from these
> problems.
Could this be a VM problem, something like a problem flushing the write
cache?
It sounds like that from your description.
-sv
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: e1000 intel driver bug (which impacts nfs)
2002-06-02 22:28 ` Thomas Langås
2002-06-03 2:30 ` seth vidal
@ 2002-06-03 7:48 ` Trond Myklebust
2002-06-03 8:36 ` Ryan Sweet
2 siblings, 0 replies; 8+ messages in thread
From: Trond Myklebust @ 2002-06-03 7:48 UTC (permalink / raw)
To: nfs
>>>>> " " == Thomas Langås <tlan@stud.ntnu.no> writes:
    > We also see NFS hangs when transferring large files (i.e.
    > files around 300-400M; sometimes we have to go a bit higher,
    > to 2GB-3GB files), but it's always possible to trigger this.
    > However, we don't need to jump through hoops to "fix it":
    > after a minute or so, it's OK again. It seems to me like
    > there's a VM problem or something.
    > We've got 2GB of memory on the machines that are suffering
    > from these problems.
There is a problem with highmem that I'm just barely getting to grips
with: NFS is able to starve the kernel of highmem bounce buffer
resources because we kmap() the pages for too long (surprise: NFS
predates the highmem code by several years and so nobody ever
considered kmap() when the design was made).
An attempt at a fix has been merged into the development kernels as
of 2.5.19. The same patches are available for 2.4.19-pre8 +
NFS_ALL. If you'd like to test it out, you will need
http://www.fys.uio.no/~trondmy/src/2.4.19-pre8/linux-2.4.19-NFS_ALL.dif
+ the 4 patches from
http://www.fys.uio.no/~trondmy/src/2.4.19-pre8/alpha
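Whether a given server can be affected by the highmem issue described above depends on whether its kernel is using highmem at all. A minimal sketch for checking this, assuming a 32-bit kernel built with highmem support that reports a `HighTotal` line in /proc/meminfo; the helper names are made up for the example and the sample values are invented, not taken from any machine in this thread.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers (in kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip header lines and anything without a leading number.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def has_highmem(info):
    """True when the kernel reports a non-zero amount of high memory."""
    return info.get("HighTotal", 0) > 0


# Invented sample in the /proc/meminfo layout, for illustration only.
SAMPLE = """MemTotal:      2068720 kB
HighTotal:     1179584 kB
HighFree:         2044 kB
LowTotal:       889136 kB
"""

if __name__ == "__main__":
    print(has_highmem(parse_meminfo(SAMPLE)))
```

On a live system the text would come from reading /proc/meminfo instead of `SAMPLE`.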
Cheers,
Trond
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: e1000 intel driver bug (which impacts nfs)
2002-06-02 22:28 ` Thomas Langås
2002-06-03 2:30 ` seth vidal
2002-06-03 7:48 ` Trond Myklebust
@ 2002-06-03 8:36 ` Ryan Sweet
2 siblings, 0 replies; 8+ messages in thread
From: Ryan Sweet @ 2002-06-03 8:36 UTC (permalink / raw)
To: nfs; +Cc: jason andrade
On Mon, 3 Jun 2002, Thomas Langås wrote:
> We also see NFS hangs when transferring large files (i.e. files around
> 300-400M; sometimes we have to go a bit higher, to 2GB-3GB files), but
> it's always possible to trigger this. However, we don't need to jump
> through hoops to "fix it": after a minute or so, it's OK again. It seems
> to me like there's a VM problem or something.
Does the problem happen with local I/O also, or only with nfs?
--
Ryan Sweet <ryan.sweet@atosorigin.com>
Atos Origin Engineering Services
http://www.aoes.nl
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: e1000 intel driver bug (which impacts nfs)
@ 2002-06-03 8:54 Mark Manuel Cruz Ramos
0 siblings, 0 replies; 8+ messages in thread
From: Mark Manuel Cruz Ramos @ 2002-06-03 8:54 UTC (permalink / raw)
To: 'Ryan Sweet', nfs; +Cc: jason andrade
I also had this problem with the standard install of RH 7.2, but after
upgrading the kernel to 2.4.18 everything is fine.
-----Original Message-----
From: Ryan Sweet [mailto:rsweet@atos-group.nl]
Sent: Monday, June 03, 2002 4:37 PM
To: nfs@lists.sourceforge.net
Cc: jason andrade
Subject: Re: [NFS] e1000 intel driver bug (which impacts nfs)
On Mon, 3 Jun 2002, Thomas Langås wrote:
> We also see NFS hangs when transferring large files (i.e. files around
> 300-400M; sometimes we have to go a bit higher, to 2GB-3GB files), but
> it's always possible to trigger this. However, we don't need to jump
> through hoops to "fix it": after a minute or so, it's OK again. It seems
> to me like there's a VM problem or something.
Does the problem happen with local I/O also, or only with nfs?
--
Ryan Sweet <ryan.sweet@atosorigin.com>
Atos Origin Engineering Services
http://www.aoes.nl
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: e1000 intel driver bug (which impacts nfs)
@ 2002-06-05 15:04 darren.miller
0 siblings, 0 replies; 8+ messages in thread
From: darren.miller @ 2002-06-05 15:04 UTC (permalink / raw)
To: nfs; +Cc: jason andrade, nfs, nfs-admin
I have recently been doing heavy investigation into using Linux for NAS
with an Intel E1000 and Slackware 8.
I found this problem too; however, it disappears if you use the PowerTweak
daemon and get the latest driver for the Intel card.
PowerTweak cured a lot of my issues...
http://powertweak.sourceforge.net/
Hope this helps
Darren
==============================================================================
Darren Miller
Senior Systems Support Engineer
Microsoft Certified Professional
SCO Advanced Certified Engineer
Information Systems Department (Core Server Support)
Philips Semiconductors, Milbrook Industrial Estate, Southampton, SO15 0DJ,
England
Thomas Langås <tlan@stud.ntnu.no>
Sent by: nfs-admin@lists.sourceforge.net
2002-06-02 23:28
Please respond to nfs
To: jason andrade <jason@dstc.edu.au>
cc: nfs@lists.sourceforge.net
(bcc: Darren Miller/SOU/SC/PHILIPS)
Subject: Re: [NFS] e1000 intel driver bug (which impacts nfs)
Classification:
jason andrade:
> I hope this helps anyone else trying to debug mysterious "nfs hangs" under
> 2.4.X. It doesn't seem to be tickled unless you are doing quite large
> amounts of nfs traffic (we're pushing 1-1.5T a day on this interface)
> and it's quite random (i've had a lockup from 4 hours to 10 days after
> a reboot)
We also see NFS hangs when transferring large files (i.e. files around
300-400M; sometimes we have to go a bit higher, to 2GB-3GB files), but
it's always possible to trigger this. However, we don't need to jump
through hoops to "fix it": after a minute or so, it's OK again. It seems
to me like there's a VM problem or something.
We've got 2GB of memory on the machines that are suffering from these
problems.
--
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: e1000 intel driver bug (which impacts nfs)
@ 2002-09-22 20:54 Allen Day
0 siblings, 0 replies; 8+ messages in thread
From: Allen Day @ 2002-09-22 20:54 UTC (permalink / raw)
To: nfs
I found this thread describing NFS hanging with the Intel E1000 gigabit
ethernet adapter, and was wondering if any more progress has been made on
the problem.
I'm using Red Hat 7.3 with the 2.4.18-3smp kernel and nfs-utils-0.3.3-5,
on a dual Xeon 1.8GHz / 4GB / SuperMicro P4DP6.
The symptoms I'm experiencing are that the NFS server behaves fine for a
few hours, then mounts are no longer available from it. Trying to mount
using a remote host or localhost as the NFS client gives an error in
dmesg: "nfs: task xxxx can't get a request slot". It isn't possible to
stop NFS via its init.d script, or even to get the machine to respond to
a 'shutdown' -- the only way I've been able to temporarily solve the
problem is to push the reset button on the box.
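To catch the onset of this state, kernel log output can be scanned for the message quoted above. A minimal sketch, assuming the message appears in the log exactly in the form given here with a numeric task ID; the function name and the sample log lines are made up for illustration.

```python
import re

# Pattern built from the message text quoted in this report.
SLOT_RE = re.compile(r"nfs: task (\d+) can't get a request slot")


def stuck_tasks(log_text):
    """Return the task IDs from every 'request slot' message, in order."""
    return [int(m.group(1)) for m in SLOT_RE.finditer(log_text)]


# Invented sample log, for illustration only.
SAMPLE_LOG = """eth0: link up
nfs: task 1234 can't get a request slot
nfs: task 1240 can't get a request slot
"""

if __name__ == "__main__":
    print(stuck_tasks(SAMPLE_LOG))
```

In practice the text would come from `dmesg` output or a syslog file rather than `SAMPLE_LOG`.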
It may also be worth mentioning that once I notice the NFS problem has
started, I can no longer do a 'df' without making the terminal hang.
Also, the NFS share isn't very small... it's about 450GB.
Any idea what's going on here? Am I looking at an E1000 driver problem or
an NFS problem?
-Allen
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2002-09-22 20:44 UTC | newest]
Thread overview: 8+ messages
2002-05-26 13:06 e1000 intel driver bug (which impacts nfs) jason andrade
2002-06-02 22:28 ` Thomas Langås
2002-06-03 2:30 ` seth vidal
2002-06-03 7:48 ` Trond Myklebust
2002-06-03 8:36 ` Ryan Sweet
-- strict thread matches above, loose matches on Subject: below --
2002-06-03 8:54 Mark Manuel Cruz Ramos
2002-06-05 15:04 darren.miller
2002-09-22 20:54 Allen Day