Linux NFS development
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Norman Weathers
	<norman.r.weathers-496aOtIFJR1B+Kdf37RAV9BPR1lH4CV8@public.gmane.org>
Cc: linux-nfs@vger.kernel.org
Subject: Re: Problems with large number of clients and reads
Date: Fri, 6 Jun 2008 12:09:22 -0400
Message-ID: <20080606160922.GG30863@fieldses.org>
In-Reply-To: <1212519001.24900.14.camel@hololw58>

On Tue, Jun 03, 2008 at 01:50:01PM -0500, Norman Weathers wrote:
> Hello all,
> 
> We are having some issues with some high throughput servers of ours.
> 
> Here is the issue: we are using a vanilla 2.6.22.14 kernel on a node
> with 2 dual-core Intel CPUs (3 GHz) and 16 GB of RAM.  The files that are
> being served are around 2 GB each, and there are usually 3 to 5 of them
> being read, so once read they fit into memory nicely, and when all is
> working correctly, we have a perfectly filled cache, with almost no disk
> activity.
> 
> When we have large NFS activity (say, 600 to 1200 clients) connecting to
> the server(s), they can get into a state where they are using up all of
> memory, but they are dropping cache.  slabtop is showing 13 GB of memory
> being used by the size-4096 slab object.  We have two ethernet channels
> bonded, so we see in excess of 240 MB/s of data flowing out of the box,
> and all of a sudden, disk activity has risen to 185 MB/s.  This
> happens if we are using 8 or more NFS threads.  If we limit the threads
> to 6 or fewer, this doesn't happen.  Of course, we are starving clients,
> but at least the jobs that my customers are throwing out there are
> progressing.  The question becomes: what is causing the memory to be
> used up by the slab size-4096 object?  Why, when a bunch of clients
> suddenly ask for data, does this object grow from 100 MB to 13
> GB?  I have set the memory settings to something that I thought was
> reasonable.
> 
> Here is some more of the particulars:
> 
> sysctl.conf tcp memory settings:
> 
> # NFS Tuning Parameters
> sunrpc.udp_slot_table_entries = 128
> sunrpc.tcp_slot_table_entries = 128
> vm.overcommit_ratio = 80
> 
> net.core.rmem_max=524288
> net.core.rmem_default=262144
> net.core.wmem_max=524288
> net.core.wmem_default=262144
> net.ipv4.tcp_rmem = 8192 262144 524288
> net.ipv4.tcp_wmem = 8192 262144 524288
> net.ipv4.tcp_sack=0
> net.ipv4.tcp_timestamps=0
> vm.min_free_kbytes=50000
> vm.overcommit_memory=1
> net.ipv4.tcp_reordering=127
> 
> # Enable tcp_low_latency
> net.ipv4.tcp_low_latency=1
> 
> Here is a current reading from a slabtop of a system where this error is
> happening:
> 
> 3007154 3007154 100%    4.00K 3007154        1  12028616K size-4096
> 
> Note the size of the object cache; usually it is 50 - 100 MB (I have
> another box with 32 threads and the same settings which is bouncing
> between 50 and 128 MB right now).
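
As a sanity check on the slabtop row quoted above, here is a minimal Python
sketch that splits the row into fields and confirms that the reported cache
size matches objects times object size.  The column order (OBJS, ACTIVE, USE,
OBJ SIZE, SLABS, OBJ/SLAB, CACHE SIZE, NAME) is assumed from slabtop's usual
layout, and parse_slabtop_row is a hypothetical helper, not part of any tool.

```python
def parse_slabtop_row(row):
    """Split one slabtop data row into named fields (column order assumed)."""
    objs, active, use, obj_size, slabs, per_slab, cache_size, name = row.split()
    return {
        "name": name,
        "objs": int(objs),
        "active": int(active),
        "obj_kib": float(obj_size.rstrip("K")),    # "4.00K" -> 4.0
        "cache_kib": int(cache_size.rstrip("K")),  # "12028616K" -> 12028616
    }


# The row reported in the original message.
row = parse_slabtop_row(
    "3007154 3007154 100%    4.00K 3007154        1  12028616K size-4096"
)

# With one 4 KiB object per slab, cache size should be objs * object size.
expected_kib = row["objs"] * row["obj_kib"]
cache_gib = row["cache_kib"] / 1024**2

print(f"{row['name']}: {row['cache_kib']} KiB reported, "
      f"{expected_kib:.0f} KiB expected, about {cache_gib:.2f} GiB")
```

The two figures agree, and the cache works out to roughly 11.5 GiB, which is
most of the 16 GB on the box.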
> 
> I have a lot of client boxes that need access to these servers, and
> would really benefit from having more threads, but if I increase the
> number of threads, it pushes everything out of cache, forcing re-reads,
> and really slows down our jobs.
> 
> Any thoughts on this?

I'd've thought that suggests a leak of memory allocated by kmalloc().

Does the size-4096 cache decrease eventually, or does it stay that large
until you reboot?
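
One way to answer that would be to log the cache size over time.  The
following is only a sketch: it assumes the /proc/slabinfo 2.x layout, where
each data line starts with the cache name followed by active_objs, num_objs
and objsize, and it assumes the cache is called size-4096 as in the slabtop
output above (the name may differ under other allocators or kernel versions).
The function names and the 60-second interval are made up for illustration,
and reading /proc/slabinfo may require root.

```python
import time


def slab_cache_bytes(slabinfo_text, cache_name="size-4096"):
    """Return num_objs * objsize for one cache, or None if it is not listed."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Data lines look like: name active_objs num_objs objsize ...
        if len(fields) >= 4 and fields[0] == cache_name:
            num_objs = int(fields[2])
            objsize = int(fields[3])
            return num_objs * objsize
    return None


def watch(cache_name="size-4096", interval_s=60, path="/proc/slabinfo"):
    """Print a timestamped reading of the cache size until interrupted."""
    while True:
        with open(path) as f:
            used = slab_cache_bytes(f.read(), cache_name)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), cache_name, used)
        time.sleep(interval_s)
```

Calling watch() on the server and leaving it running after the client load
drops off should show whether the cache shrinks back on its own or stays
pinned at its peak.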

--b.
