* Problems with large number of clients and reads
From: Norman Weathers @ 2008-06-03 18:50 UTC
  To: linux-nfs

Hello all,

We are having issues with some of our high-throughput servers.

Here is the issue: we are running a vanilla 2.6.22.14 kernel on a node
with two dual-core Intel CPUs (3 GHz) and 16 GB of RAM.  The files being
served are around 2 GB each, and usually 3 to 5 of them are being read
at a time, so once read they fit into memory nicely.  When everything is
working correctly, we have a perfectly filled cache with almost no disk
activity.

Under heavy NFS load (say, 600 to 1200 clients connecting to the
server(s)), the servers can get into a state where all of memory is in
use but cached file data is being dropped.  slabtop shows 13 GB of
memory used by the size-4096 slab object.  We have two Ethernet channels
bonded, so we see in excess of 240 MB/s of data flowing out of the box,
and then all of a sudden disk activity rises to 185 MB/s.  This happens
when we run 8 or more NFS threads; if we limit the threads to 6 or
fewer, it does not happen.  Of course, at that point we are starving
clients, but at least the jobs my customers are submitting make
progress.  So the question is: what is causing the size-4096 slab
object to use up the memory?  Why does it grow from 100 MB to 13 GB
when a bunch of clients suddenly ask for data?  I have set the memory
settings to values that I thought were reasonable.
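
The memory squeeze described above can be put into rough numbers.  The
following is a minimal sketch, assuming that only the slab and the files
being served compete for the 16 GB of RAM (kernel text, page tables and
everything else are ignored).  The function name headroom_gb is
hypothetical and used only for illustration.

```python
def headroom_gb(ram_gb, slab_gb, file_gb, n_files):
    """RAM left over (in GB) once the slab and the files being read are resident.

    A negative result means the files no longer fit in what the slab
    leaves behind, so cached data has to be evicted and re-read from disk.
    Assumption: nothing else uses a significant amount of memory.
    """
    return (ram_gb - slab_gb) - file_gb * n_files


# Figures from the message: 16 GB of RAM, 2 GB files, 3 to 5 files in flight,
# slab at about 100 MB (0.1 GB) when healthy and about 13 GB when the problem hits.
for n_files in (3, 4, 5):
    healthy = headroom_gb(16, 0.1, 2, n_files)
    degraded = headroom_gb(16, 13, 2, n_files)
    print(f"{n_files} files: healthy {healthy:+.1f} GB, degraded {degraded:+.1f} GB")
```

With a 13 GB slab only about 3 GB remains for file data, which is less
than even three 2 GB files, so under these assumptions re-reads from
disk are what one would expect.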

Here are some more of the particulars.

sysctl.conf TCP and memory settings:

# NFS Tuning Parameters
sunrpc.udp_slot_table_entries = 128
sunrpc.tcp_slot_table_entries = 128
vm.overcommit_ratio = 80

net.core.rmem_max=524288
net.core.rmem_default=262144
net.core.wmem_max=524288
net.core.wmem_default=262144
net.ipv4.tcp_rmem = 8192 262144 524288
net.ipv4.tcp_wmem = 8192 262144 524288
net.ipv4.tcp_sack=0
net.ipv4.tcp_timestamps=0
vm.min_free_kbytes=50000
vm.overcommit_memory=1
net.ipv4.tcp_reordering=127

# Enable tcp_low_latency
net.ipv4.tcp_low_latency=1
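
As a rough yardstick for the settings above, one can multiply the
tcp_rmem and tcp_wmem maxima by the number of clients.  This is only a
sketch under loud assumptions: one TCP connection per client, every
socket buffer filled to its maximum, and no per-packet bookkeeping
overhead.  It is not a statement about how the kernel actually accounts
for this memory, and the helper names are hypothetical.

```python
# Copied from the sysctl.conf fragment in the message.
SYSCTL_FRAGMENT = """
net.ipv4.tcp_rmem = 8192 262144 524288
net.ipv4.tcp_wmem = 8192 262144 524288
"""


def parse_triplet(text, key):
    """Return the (min, default, max) integers for a sysctl key in `text`."""
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return tuple(int(v) for v in value.split())
    raise KeyError(key)


def buffer_bound_bytes(text, n_connections):
    """Bytes used if every connection fills both buffers to their maxima."""
    _, _, rmem_max = parse_triplet(text, "net.ipv4.tcp_rmem")
    _, _, wmem_max = parse_triplet(text, "net.ipv4.tcp_wmem")
    return n_connections * (rmem_max + wmem_max)


for clients in (600, 1200):
    mib = buffer_bound_bytes(SYSCTL_FRAGMENT, clients) / 2**20
    print(f"{clients} clients: {mib:.0f} MiB")
```

Under those assumptions the figure is 600 MiB for 600 clients and
1200 MiB for 1200 clients.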

Here is a current slabtop reading from a system where this problem is
occurring:

3007154 3007154 100%    4.00K 3007154        1  12028616K size-4096

Note the size of the object cache; usually it is 50 to 100 MB (I have
another box with 32 threads and the same settings which is bouncing
between 50 and 128 MB right now).
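
The slabtop row above can be checked by hand: 3007154 objects of 4 KiB
each come to 12028616 KiB.  The sketch below parses the row and repeats
that arithmetic.  It assumes the fields follow slabtop's usual column
order (OBJS, ACTIVE, USE, OBJ SIZE, SLABS, OBJ/SLAB, CACHE SIZE, NAME);
the function name is hypothetical.

```python
SLABTOP_ROW = "3007154 3007154 100%    4.00K 3007154        1  12028616K size-4096"


def parse_slabtop_row(row):
    """Split one slabtop row into named fields (sizes in KiB)."""
    f = row.split()
    return {
        "objs": int(f[0]),
        "active": int(f[1]),
        "use_pct": int(f[2].rstrip("%")),
        "obj_size_kib": float(f[3].rstrip("K")),
        "slabs": int(f[4]),
        "objs_per_slab": int(f[5]),
        "cache_kib": int(f[6].rstrip("K")),
        "name": f[7],
    }


row = parse_slabtop_row(SLABTOP_ROW)

# Objects times object size should reproduce the reported cache size.
computed_kib = row["objs"] * row["obj_size_kib"]
print(row["name"], computed_kib == row["cache_kib"])

# 1 GiB = 1024 * 1024 KiB
print(f"{row['cache_kib'] / (1024 * 1024):.2f} GiB")
```

The reported cache size works out to roughly 11.5 GiB, all of it in
one-object slabs that are 100% in use.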

I have a lot of client boxes that need access to these servers, and they
would really benefit from more threads, but increasing the number of
threads pushes everything out of cache, forcing re-reads and really
slowing down our jobs.

Any thoughts on this?


Thanks, 

Norman Weathers
