* huge number of intr/s on large nfs server
@ 2002-10-14 20:21 Eff Norwood
  2002-10-15  8:13 ` Bogdan Costescu
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Eff Norwood @ 2002-10-14 20:21 UTC (permalink / raw)
  To: nfs; +Cc: Daniel Phillips

Hi All,

I have a 2.4.18 kernel running on a dual 2.4GHz Xeon platform using software
RAID 5 via IBM's EVMS and EXT3. The system is being used as an NFS server,
and although local disk performance is excellent, NFS performance (over UDP
and TCP, versions 2 and 3, with several different client mount block sizes)
is poor to bad. Looking at mpstat while the system is under load shows
%system to be quite high (94-96%) but, most interestingly, shows the number
of intr/s (context switches) to be 17-18K plus!

Since I was not sure what was causing all of these context switches, I
installed SGI kernprof and ran it during a 15 minute run. I used this
command to start kernprof: 'kernprof -r -d time -f 1000 -t pc -b -c all' and
this one to stop it: 'kernprof -e -i | sort -nr +2 | less >
big_csswitch.txt'

The output of this collection is located here (18Kb):

http://www.effrem.com/linux/kernel/dev/big_csswitch.txt

Most interesting to me is why in the top three results:

default_idle [c010542c]: 861190
_text_lock_inode [c015d031]: 141795
UNKNOWN_KERNEL [c01227f0]: 101532

that default_idle would be the highest value when the CPUs showed 94-96%
busy. Also interesting is what UNKNOWN_KERNEL is. ???
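
[Editorial sketch, not part of the original message.] To compare the three
quoted profile lines, they can be parsed and ranked. The snippet below is a
minimal sketch that assumes the "symbol [address]: count" layout shown above
holds for the whole file; the names PROFILE_TEXT and parse_profile are
hypothetical. The percentages are shares of the three listed lines only, not
of the full profile, which is not reproduced in this thread.

```python
import re

# The three lines quoted in the message above.
PROFILE_TEXT = """\
default_idle [c010542c]: 861190
_text_lock_inode [c015d031]: 141795
UNKNOWN_KERNEL [c01227f0]: 101532
"""

# Assumed line layout: symbol, hex address in brackets, sample count.
LINE_RE = re.compile(r"^(\S+) \[([0-9a-f]+)\]: (\d+)$")


def parse_profile(text):
    """Return (symbol, address, samples) tuples, largest sample count first."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            symbol, address, samples = match.groups()
            rows.append((symbol, int(address, 16), int(samples)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


rows = parse_profile(PROFILE_TEXT)
listed_total = sum(samples for _, _, samples in rows)
for symbol, address, samples in rows:
    # Share is relative to the listed lines only.
    print(f"{symbol:20s} {address:08x} {samples:8d} {samples / listed_total:6.1%}")
```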

The server described above has 14 internal IDE disks configured as software
RAID 5 and is connected to the network with one Syskonnect copper gigabit
card. I used 30 clients connected via 100BASE-T, all of which performed
sequential writes to one large 1.3TB volume on the file server. They were
mounted NFSv2, UDP, 8K r+w size for this run. I was able to achieve only
35MB/sec of sustained NFS write throughput. Local disk performance (e.g. dd
file) for sustained writes is *much* higher. I am using knfsd with the latest
2.4.18 Neil Brown fixes from his site. Distribution is Debian 3.0 Woody Stable.

Many thanks in advance for the insight,

Eff Norwood




-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* RE: huge number of intr/s on large nfs server
@ 2002-10-15 14:02 Heflin, Roger A.
  2002-10-16  2:41 ` Eff Norwood
  0 siblings, 1 reply; 25+ messages in thread
From: Heflin, Roger A. @ 2002-10-15 14:02 UTC (permalink / raw)
  To: nfs, enorwood

You say 35MB/second, and 17k context switches, so that is about
2k/context switch.    You get at least 1 context switch per every
few packets sent out, and you get 1 context switch for each disk
io done (I believe), the more data you send out, the more context
switches/interrupts that you will get.   Things that appear to reduce
the numbers are using larger packets (32k seems to reduce the
numbers over 8k nfs-wsize,rsize), making anything else larger
may also reduce the total number.   And software raid should
increase the number a bit as more has to be taken care of by
the main cpu, whereas in hardware raid the parity writes
are invisible to the main cpu.  Also with hardware raid the
parity calcs are on the hardware and not on the main cpu,
so this reduces the main cpu usage.   With a hardware raid
setup I get performance numbers similar to what you are seeing,
only I get more like 50% cpu usage on a single slightly slower cpu.
With a SCSI fc 5+1 disk setup with a mylex controller I am getting
about 25MB/second writes.
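
[Editorial sketch, not part of the original message.] The "about 2k/context
switch" estimate can be reproduced from the two figures quoted in the thread.
This is only a rough check: it assumes MB means 10**6 bytes and takes 17,000
as the switch rate, and the variable names are illustrative.

```python
# Figures quoted in the thread; MB is taken as 10**6 bytes (an assumption).
throughput_bytes_per_s = 35 * 10**6
switches_per_s = 17_000

# Average payload moved per context switch.
bytes_per_switch = throughput_bytes_per_s / switches_per_s
print(f"{bytes_per_switch:.0f} bytes per switch")

# With the 8K write size used in the test, the average number of
# switches that each NFS write accounts for.
nfs_write_size = 8 * 1024
switches_per_write = nfs_write_size / bytes_per_switch
print(f"{switches_per_write:.1f} switches per 8K write")
```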

Just doing local IO will produce lots and lots of interrupts/context
switches.

When you did the local dd you did make sure to break the cache?
Otherwise the results that you get will be rather useless.   I have
been finding that I can usually get about 1/2 of the local capacity
to the network across NFS when I break the cache, if you don't
break the cache you get very very large results.
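
[Editorial sketch, not part of the original message.] One way to keep the
page cache from inflating a local write test is to include the flush to disk
in the timed interval and to write far more data than the machine has RAM.
The function below is a minimal sketch of that idea, not the method used by
anyone in the thread; timed_write and the demo size are hypothetical.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of zeros to path; return MB/s including the flush."""
    block = b"\0" * block_size
    written = 0
    start = time.monotonic()
    with open(path, "wb") as handle:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            handle.write(chunk)
            written += len(chunk)
        handle.flush()
        # Block until the kernel has pushed the data to the storage device,
        # so cached-but-unwritten data does not count as throughput.
        os.fsync(handle.fileno())
    elapsed = max(time.monotonic() - start, 1e-9)
    return written / elapsed / 10**6


# Small demo size; a real measurement needs a size well above installed RAM.
with tempfile.TemporaryDirectory() as directory:
    rate = timed_write(os.path.join(directory, "testfile"), 4 * 1024 * 1024)
    print(f"{rate:.1f} MB/s")
```

Timing only up to the last write() call, without the fsync, would mostly
measure memory bandwidth for files that fit in cache.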

                                          Roger



* huge number of intr/s on large nfs server
@ 2002-10-15 19:27 Heflin, Roger A.
  2002-10-15 20:51 ` Eric Whiting
  2002-10-16  2:54 ` Eff Norwood
  0 siblings, 2 replies; 25+ messages in thread
From: Heflin, Roger A. @ 2002-10-15 19:27 UTC (permalink / raw)
  To: nfs




	Something else,

	if you are dd a 10mb file on the local machine that is all going to be
	in cache, so the rate will be horribly incorrect.

	Also these are gigabit copper cards correct?  Gigabit copper cards only
	run a max of 50MB/second, so getting 35MB/second over one of them is
	pretty good.

	I am getting 1500cs/second and 6500ints/second at about 8MB/second to
	the network.

	I get 3600cs/second and 15000ints/second at about 20MB/second to the
	network.
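
[Editorial sketch, not part of the original message.] Dividing the two
measurements above by their throughput gives a per-megabyte cost, which makes
it easier to see how closely the rates scale with the data moved. The figures
are the ones quoted in this message; the variable names are illustrative.

```python
# (throughput in MB/s, context switches/s, interrupts/s) as reported above.
samples = [
    (8, 1500, 6500),
    (20, 3600, 15000),
]

# Normalize each rate by throughput to get a cost per MB transferred.
per_mb = [(mb, cs / mb, ints / mb) for mb, cs, ints in samples]
for mb, cs_per_mb, ints_per_mb in per_mb:
    print(f"{mb:3d} MB/s: {cs_per_mb:.1f} cs per MB, {ints_per_mb:.1f} ints per MB")
```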

	How fast of IO are you getting on the local machine?

							Roger


* RE: huge number of intr/s on large nfs server
@ 2002-10-15 21:40 Heflin, Roger A.
  2002-10-15 23:05 ` Eric Whiting
  0 siblings, 1 reply; 25+ messages in thread
From: Heflin, Roger A. @ 2002-10-15 21:40 UTC (permalink / raw)
  To: Eric Whiting; +Cc: nfs

After doing some reading I figured out my mistake.

Others had told me that it ran 4 pairs at 125Mhz each, and that it
worked just like 100BT (1 bit per clock cycle), apparently they are
sending 2bit per cycle so 2 bits x 125Mhz * 4 channels is
1Gbps, if they were only using 1 bit per cycle it would be
only 50MB/second.
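
[Editorial sketch, not part of the original message.] The arithmetic above
can be checked directly. The helper below computes a raw line rate only; it
ignores framing and protocol overhead, takes MB as 10**6 bytes, and its name
and parameters are hypothetical. Note that the raw one-bit figure it prints
is somewhat higher than the 50MB/second quoted in the message.

```python
def link_rate_mbytes(pairs, symbol_rate_mhz, bits_per_symbol):
    """Raw line rate in MB/s (10**6 bytes), ignoring all overhead."""
    megabits = pairs * symbol_rate_mhz * bits_per_symbol
    return megabits / 8


# Two bits per clock on each of four pairs: the 1Gbps case described above.
two_bits = link_rate_mbytes(pairs=4, symbol_rate_mhz=125, bits_per_symbol=2)
# One bit per clock, as in the earlier mistaken assumption.
one_bit = link_rate_mbytes(pairs=4, symbol_rate_mhz=125, bits_per_symbol=1)
print(two_bits, one_bit)
```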

What is your underlying setup to get the 93MB/second,
ie disks/controllers/cpus?

				Roger

> -----Original Message-----
> From:	Eric Whiting [SMTP:ewhiting@amis.com]
> Sent:	Tuesday, October 15, 2002 3:52 PM
> To:	Heflin, Roger A.
> Cc:	nfs@lists.sourceforge.net
> Subject:	Re: [NFS] huge number of intr/s on large nfs server
>
> I'm running 93MBytes/s on gigE copper (testing using a 1000M file).
> (jumbo frames enabled)
>
> eric



end of thread, other threads:[~2002-10-18  2:19 UTC | newest]

Thread overview: 25+ messages
2002-10-14 20:21 huge number of intr/s on large nfs server Eff Norwood
2002-10-15  8:13 ` Bogdan Costescu
2002-10-15 16:50   ` Eff Norwood
2002-10-15 17:02     ` Bogdan Costescu
2002-10-15 21:22 ` Andrew Theurer
2002-10-16 20:06   ` Eff Norwood
2002-10-16 22:51     ` Donavan Pantke
2002-10-16 23:18       ` Eff Norwood
2002-10-16 23:28         ` Donavan Pantke
2002-10-17  2:28 ` Benjamin LaHaise
2002-10-17  2:49   ` Eff Norwood
2002-10-17 11:15     ` Alex Thiel
2002-10-17 16:42       ` Eff Norwood
2002-10-17 13:33     ` Andrew Theurer
2002-10-17 16:59       ` Eff Norwood
2002-10-18  2:05     ` Benjamin LaHaise
2002-10-18  2:19       ` Eff Norwood
  -- strict thread matches above, loose matches on Subject: below --
2002-10-15 14:02 Heflin, Roger A.
2002-10-16  2:41 ` Eff Norwood
2002-10-15 19:27 Heflin, Roger A.
2002-10-15 20:51 ` Eric Whiting
2002-10-16  2:58   ` Eff Norwood
2002-10-16  2:54 ` Eff Norwood
2002-10-15 21:40 Heflin, Roger A.
2002-10-15 23:05 ` Eric Whiting
