From: "J. Bruce Fields" <bfields@fieldses.org>
To: "Weathers, Norman R." <Norman.R.Weathers-496aOtIFJR1B+Kdf37RAV9BPR1lH4CV8@public.gmane.org>
Cc: Jeff Layton <jlayton@poochiereds.net>,
	linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
	Neil Brown <neilb@suse.de>
Subject: Re: CONFIG_DEBUG_SLAB_LEAK omits size-4096 and larger?
Date: Fri, 13 Jun 2008 18:04:22 -0400
Message-ID: <20080613220422.GC14338@fieldses.org>
In-Reply-To: <0122F800A3B64C449565A9E8C297701002D75DB6-zIGg2qceuZx7uNL6xugVa6xOck334EZe@public.gmane.org>

On Fri, Jun 13, 2008 at 04:53:31PM -0500, Weathers, Norman R. wrote:
>  
> 
> > > The big one seems to be the __alloc_skb. (This is with 16 threads, and
> > > it says that we are using up somewhere between 12 and 14 GB of memory,
> > > about 2 to 3 gig of that is disk cache).  If I were to put anymore
> > > threads out there, the server would become almost unresponsive (it was
> > > bad enough as it was).
> > > 
> > > At the same time, I also noticed this:
> > > 
> > > skbuff_fclone_cache: 1842524 __alloc_skb+0x50/0x170
> > > 
> > > Don't know for sure if that is meaningful or not....
> > 
> > OK, so, starting at net/core/skbuff.c, this means that this memory was
> > allocated by __alloc_skb() calls with something nonzero in the third
> > ("fclone") argument.  The only such caller is alloc_skb_fclone().
> > Callers of alloc_skb_fclone() include:
> > 
> > 	sk_stream_alloc_skb:
> > 		do_tcp_sendpages
> > 		tcp_sendmsg
> > 		tcp_fragment
> > 		tso_fragment
> 
> Interesting you should mention the tso...  We recently went through and
> turned on TSO on all of our systems, trying it out to see if it helped
> with performance...  This could be something to do with that.  I can try
> disabling the tso on all of the servers and see if that helps with the
> memory.  Actually, I think I will, and I will monitor the situation.  I
> think it might help some, but I still think there may be something else
> going on in a deep corner...

I'll plead total ignorance about TSO, and it sounds like a long
shot--but sure, it'd be worth trying, thanks.

> 
> > 		tcp_mtu_probe
> > 	tcp_send_fin
> > 	tcp_connect
> > 	buf_acquire:
> > 		lots of callers in tipc code (whatever that is).
> > 
> > So unless you're using tipc, or you have something in userspace going
> > haywire (perhaps netstat would help rule that out?), then I suppose
> > there's something wrong with knfsd's tcp code.  Which makes sense, I
> > guess.
> > 
> 
> Not for sure what tipc is either....
> 
> > I'd think this sort of allocation would be limited by the number of
> > sockets times the size of the send and receive buffers.
> > svc_xprt.c:svc_check_conn_limits() claims to be limiting the number of
> > sockets to (nrthreads+3)*20.  (You aren't hitting the "too many open
> > connections" printk there, are you?)  The total buffer size should be
> > bounded by something like 4 megs.
> > 
> > --b.
> > 
> 
> Yes, we are getting a continuous stream of the too many open connections
> scrolling across our logs.  

That's interesting!  So we should probably look more closely at the
svc_check_conn_limits() behavior.  I wonder whether some pathological
behavior is triggered in the case where you're constantly over the limit
it's trying to enforce.
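
For a sense of scale, the (nrthreads+3)*20 formula quoted above can be worked through for the 16-thread case mentioned earlier, which gives (16+3)*20 = 380 sockets. The snippet below is only a sketch of that arithmetic as described in this thread, not the kernel code itself; conn_limit is a made-up helper name, and the other thread counts are arbitrary illustrations.

```python
def conn_limit(nrthreads: int) -> int:
    """Socket cap as described in this thread: (nrthreads + 3) * 20.

    Hypothetical helper for illustration; the real check lives in
    svc_check_conn_limits() and may differ in detail.
    """
    return (nrthreads + 3) * 20


if __name__ == "__main__":
    # 16 is the thread count reported in the thread; 8 and 32 are
    # arbitrary values chosen only to show how the cap scales.
    for n in (8, 16, 32):
        print(f"nrthreads={n:3d} -> limit {conn_limit(n)} sockets")
```

If the number of client connections stays well above that figure, the limit check would presumably be tripping on nearly every new connection, which is the case worth looking at.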

(Remind me how many active clients you have?)

> No problems.  I feel good if I exercised some deep corner of the code
> and found something that needed flushed out, that's what the experience
> is all about, isn't it?

Yep!

--b.
