public inbox for linux-nfs@vger.kernel.org
From: Kevin Constantine <Kevin.Constantine-FfNkGbSheRGpB8w63BLUukEOCMrvLtNR@public.gmane.org>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Greg Banks <gnb@sgi.com>,
	Linux NFS ML <linux-nfs@vger.kernel.org>,
	Harshula Jayasuriya <harshula@sgi.com>
Subject: Re: [patch 3/3] knfsd: add file to export stats about nfsd pools
Date: Thu, 12 Feb 2009 17:53:20 -0800
Message-ID: <4994D290.3020906@disney.com>
In-Reply-To: <20090212171106.GB21445@fieldses.org>

On 02/12/09 09:11, J. Bruce Fields wrote:
> On Tue, Jan 13, 2009 at 09:26:36PM +1100, Greg Banks wrote:
>> Add /proc/fs/nfsd/pool_stats to export to userspace various
>> statistics about the operation of rpc server thread pools.
> 
> Could you explain why these specific statistics (total packets,
> sockets_queued, threads_woken, overloads_avoided, threads_timedout) are
> the important ones to capture?  Could you give examples of what sort of
> problems could be solved using them?
> 
> As you said, an important question for the sysadmin is "should I
> configure more nfsds?"  How do they answer that?
> 

I typically use the "th" line to decide whether to add more threads,
by looking at the distribution of values across the histogram.
If things are weighted toward the 90-100% bucket, I'll add more
threads and watch the traffic patterns.
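(For reference, a rough sketch of what I mean; the sample "th" line and
the field layout below are illustrative of my understanding of the
/proc/net/rpc/nfsd format, on a live server you'd read the file itself.)

```shell
# Hypothetical "th" line; on a real server you would use e.g.:
#   th_line=$(grep '^th' /proc/net/rpc/nfsd)
# Layout as I understand it: "th" <nr-threads> <all-busy-count> then ten
# histogram buckets (seconds spent at 0-10%, 10-20%, ... 90-100% busy).
th_line="th 8 452 1.2 0.8 0.6 0.5 0.4 0.3 0.3 0.2 2.1 6.4"

# Time spent in the top two (80-90% and 90-100%) buckets:
top=$(echo "$th_line" | awk '{print $(NF-1) + $NF}')
# Total time across all ten buckets (fields 4..13):
total=$(echo "$th_line" | awk '{s=0; for (i=4; i<=NF; i++) s+=$i; print s}')
echo "time in top buckets: $top of $total seconds"
```

If `$top` is a large fraction of `$total`, the pool is spending most of
its busy time with nearly all threads occupied, which is when I reach
for more threads.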

Usually, the question of how many to add is answered by trial and error.

echo 32 > /proc/fs/nfsd/threads
Did that improve my throughput? Yes.
echo 128 > /proc/fs/nfsd/threads
Did that improve my throughput? No, it actually decreased.
Rinse... repeat...
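That loop could be scripted along these lines (a sketch only:
measure_throughput is a stand-in for whatever benchmark you actually
trust, and the synthetic numbers below just mimic a throughput curve
that rises and then falls as threads are added):

```shell
# Stand-in benchmark: returns a made-up MB/s figure per thread count,
# shaped like the typical rises-then-falls curve.
measure_throughput() {
    case "$1" in
        16)  echo 80  ;;
        32)  echo 120 ;;
        64)  echo 140 ;;
        128) echo 110 ;;
    esac
}

best=0
best_threads=0
for n in 16 32 64 128; do
    # On a live server you would actually set the count here:
    #   echo "$n" > /proc/fs/nfsd/threads
    tput=$(measure_throughput "$n")
    if [ "$tput" -gt "$best" ]; then
        best=$tput
        best_threads=$n
    fi
done
echo "best thread count: $best_threads (${best} MB/s)"
```

With real measurements in place of the stub, this just automates the
rinse-and-repeat above.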

> --b.
> 
>> This patch is based on a forward-ported version of
>> knfsd-add-pool-thread-stats which has been shipping in the SGI
>> "Enhanced NFS" product since 2006 and which was previously
>> posted:
> 
>> http://article.gmane.org/gmane.linux.nfs/10375
>>
>> It has also been updated thus:
>>
>>  * moved EXPORT_SYMBOL() to near the function it exports
>>  * made the new struct struct seq_operations const
>>  * used SEQ_START_TOKEN instead of ((void *)1)
>>  * merged fix from SGI PV 990526 "sunrpc: use dprintk instead of
>>    printk in svc_pool_stats_*()" by Harshula Jayasuriya.
>>  * merged fix from SGI PV 964001 "Crash reading pool_stats before
>>    nfsds are started".
>>
>> Signed-off-by: Greg Banks <gnb@sgi.com>
>> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
>> ---
>>
>>  fs/nfsd/nfsctl.c           |   12 ++++
>>  fs/nfsd/nfssvc.c           |    7 ++
>>  include/linux/sunrpc/svc.h |   11 +++
>>  net/sunrpc/svc_xprt.c      |  100 +++++++++++++++++++++++++++++++++-
>>  4 files changed, 129 insertions(+), 1 deletion(-)
>>
>> Index: bfields/fs/nfsd/nfsctl.c
>> ===================================================================
>> --- bfields.orig/fs/nfsd/nfsctl.c
>> +++ bfields/fs/nfsd/nfsctl.c
>> @@ -60,6 +60,7 @@ enum {
>>  	NFSD_FO_UnlockFS,
>>  	NFSD_Threads,
>>  	NFSD_Pool_Threads,
>> +	NFSD_Pool_Stats,
>>  	NFSD_Versions,
>>  	NFSD_Ports,
>>  	NFSD_MaxBlkSize,
>> @@ -172,6 +173,16 @@ static const struct file_operations expo
>>  	.owner		= THIS_MODULE,
>>  };
>>  
>> +extern int nfsd_pool_stats_open(struct inode *inode, struct file *file);
>> +
>> +static struct file_operations pool_stats_operations = {
>> +	.open		= nfsd_pool_stats_open,
>> +	.read		= seq_read,
>> +	.llseek		= seq_lseek,
>> +	.release	= seq_release,
>> +	.owner		= THIS_MODULE,
>> +};
>> +
>>  /*----------------------------------------------------------------------------*/
>>  /*
>>   * payload - write methods
>> @@ -1246,6 +1257,7 @@ static int nfsd_fill_super(struct super_
>>  		[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
>>  		[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
>>  		[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
>> +		[NFSD_Pool_Stats] = {"pool_stats", &pool_stats_operations, S_IRUGO},
>>  		[NFSD_Versions] = {"versions", &transaction_ops, S_IWUSR|S_IRUSR},
>>  		[NFSD_Ports] = {"portlist", &transaction_ops, S_IWUSR|S_IRUGO},
>>  		[NFSD_MaxBlkSize] = {"max_block_size", &transaction_ops, S_IWUSR|S_IRUGO},
>> Index: bfields/fs/nfsd/nfssvc.c
>> ===================================================================
>> --- bfields.orig/fs/nfsd/nfssvc.c
>> +++ bfields/fs/nfsd/nfssvc.c
>> @@ -546,3 +546,10 @@ nfsd_dispatch(struct svc_rqst *rqstp, __
>>  	nfsd_cache_update(rqstp, proc->pc_cachetype, statp + 1);
>>  	return 1;
>>  }
>> +
>> +int nfsd_pool_stats_open(struct inode *inode, struct file *file)
>> +{
>> +	if (nfsd_serv == NULL)
>> +		return -ENODEV;
>> +	return svc_pool_stats_open(nfsd_serv, file);
>> +}
>> Index: bfields/include/linux/sunrpc/svc.h
>> ===================================================================
>> --- bfields.orig/include/linux/sunrpc/svc.h
>> +++ bfields/include/linux/sunrpc/svc.h
>> @@ -24,6 +24,15 @@
>>   */
>>  typedef int		(*svc_thread_fn)(void *);
>>  
>> +/* statistics for svc_pool structures */
>> +struct svc_pool_stats {
>> +	unsigned long	packets;
>> +	unsigned long	sockets_queued;
>> +	unsigned long	threads_woken;
>> +	unsigned long	overloads_avoided;
>> +	unsigned long	threads_timedout;
>> +};
>> +
>>  /*
>>   *
>>   * RPC service thread pool.
>> @@ -42,6 +51,7 @@ struct svc_pool {
>>  	unsigned int		sp_nrthreads;	/* # of threads in pool */
>>  	struct list_head	sp_all_threads;	/* all server threads */
>>  	int			sp_nwaking;	/* number of threads woken but not yet active */
>> +	struct svc_pool_stats	sp_stats;	/* statistics on pool operation */
>>  } ____cacheline_aligned_in_smp;
>>  
>>  /*
>> @@ -396,6 +406,7 @@ struct svc_serv *  svc_create_pooled(str
>>  			sa_family_t, void (*shutdown)(struct svc_serv *),
>>  			svc_thread_fn, struct module *);
>>  int		   svc_set_num_threads(struct svc_serv *, struct svc_pool *, int);
>> +int		   svc_pool_stats_open(struct svc_serv *serv, struct file *file);
>>  void		   svc_destroy(struct svc_serv *);
>>  int		   svc_process(struct svc_rqst *);
>>  int		   svc_register(const struct svc_serv *, const unsigned short,
>> Index: bfields/net/sunrpc/svc_xprt.c
>> ===================================================================
>> --- bfields.orig/net/sunrpc/svc_xprt.c
>> +++ bfields/net/sunrpc/svc_xprt.c
>> @@ -318,6 +318,8 @@ void svc_xprt_enqueue(struct svc_xprt *x
>>  		goto out_unlock;
>>  	}
>>  
>> +	pool->sp_stats.packets++;
>> +
>>  	/* Mark transport as busy. It will remain in this state until
>>  	 * the provider calls svc_xprt_received. We update XPT_BUSY
>>  	 * atomically because it also guards against trying to enqueue
>> @@ -355,6 +357,7 @@ void svc_xprt_enqueue(struct svc_xprt *x
>>  	if (pool->sp_nwaking >= SVC_MAX_WAKING) {
>>  		/* too many threads are runnable and trying to wake up */
>>  		thread_avail = 0;
>> +		pool->sp_stats.overloads_avoided++;
>>  	}
>>  
>>  	if (thread_avail) {
>> @@ -374,11 +377,13 @@ void svc_xprt_enqueue(struct svc_xprt *x
>>  		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
>>  		rqstp->rq_waking = 1;
>>  		pool->sp_nwaking++;
>> +		pool->sp_stats.threads_woken++;
>>  		BUG_ON(xprt->xpt_pool != pool);
>>  		wake_up(&rqstp->rq_wait);
>>  	} else {
>>  		dprintk("svc: transport %p put into queue\n", xprt);
>>  		list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
>> +		pool->sp_stats.sockets_queued++;
>>  		BUG_ON(xprt->xpt_pool != pool);
>>  	}
>>  
>> @@ -591,6 +596,7 @@ int svc_recv(struct svc_rqst *rqstp, lon
>>  	int			pages;
>>  	struct xdr_buf		*arg;
>>  	DECLARE_WAITQUEUE(wait, current);
>> +	long			time_left;
>>  
>>  	dprintk("svc: server %p waiting for data (to = %ld)\n",
>>  		rqstp, timeout);
>> @@ -676,12 +682,14 @@ int svc_recv(struct svc_rqst *rqstp, lon
>>  		add_wait_queue(&rqstp->rq_wait, &wait);
>>  		spin_unlock_bh(&pool->sp_lock);
>>  
>> -		schedule_timeout(timeout);
>> + 		time_left = schedule_timeout(timeout);
>>  
>>  		try_to_freeze();
>>  
>>  		spin_lock_bh(&pool->sp_lock);
>>  		remove_wait_queue(&rqstp->rq_wait, &wait);
>> +		if (!time_left)
>> +			pool->sp_stats.threads_timedout++;
>>  
>>  		xprt = rqstp->rq_xprt;
>>  		if (!xprt) {
>> @@ -1103,3 +1111,93 @@ int svc_xprt_names(struct svc_serv *serv
>>  	return totlen;
>>  }
>>  EXPORT_SYMBOL_GPL(svc_xprt_names);
>> +
>> +
>> +/*----------------------------------------------------------------------------*/
>> +
>> +static void *svc_pool_stats_start(struct seq_file *m, loff_t *pos)
>> +{
>> +	unsigned int pidx = (unsigned int)*pos;
>> +	struct svc_serv *serv = m->private;
>> +
>> +	dprintk("svc_pool_stats_start, *pidx=%u\n", pidx);
>> +
>> +	lock_kernel();
>> +	/* bump up the pseudo refcount while traversing */
>> +	svc_get(serv);
>> +	unlock_kernel();
>> +
>> +	if (!pidx)
>> +		return SEQ_START_TOKEN;
>> +	return (pidx > serv->sv_nrpools ? NULL : &serv->sv_pools[pidx-1]);
>> +}
>> +
>> +static void *svc_pool_stats_next(struct seq_file *m, void *p, loff_t *pos)
>> +{
>> +	struct svc_pool *pool = p;
>> +	struct svc_serv *serv = m->private;
>> +
>> +	dprintk("svc_pool_stats_next, *pos=%llu\n", *pos);
>> +
>> +	if (p == SEQ_START_TOKEN) {
>> +		pool = &serv->sv_pools[0];
>> +	} else {
>> +		unsigned int pidx = (pool - &serv->sv_pools[0]);
>> +		if (pidx < serv->sv_nrpools-1)
>> +			pool = &serv->sv_pools[pidx+1];
>> +		else
>> +			pool = NULL;
>> +	}
>> +	++*pos;
>> +	return pool;
>> +}
>> +
>> +static void svc_pool_stats_stop(struct seq_file *m, void *p)
>> +{
>> +	struct svc_serv *serv = m->private;
>> +
>> +	lock_kernel();
>> +	/* this function really, really should have been called svc_put() */
>> +	svc_destroy(serv);
>> +	unlock_kernel();
>> +}
>> +
>> +static int svc_pool_stats_show(struct seq_file *m, void *p)
>> +{
>> +	struct svc_pool *pool = p;
>> +
>> +	if (p == SEQ_START_TOKEN) {
>> +		seq_puts(m, "# pool packets-arrived sockets-enqueued threads-woken overloads-avoided threads-timedout\n");
>> +		return 0;
>> +	}
>> +
>> +	seq_printf(m, "%u %lu %lu %lu %lu %lu\n",
>> +		pool->sp_id,
>> +		pool->sp_stats.packets,
>> +		pool->sp_stats.sockets_queued,
>> +		pool->sp_stats.threads_woken,
>> +		pool->sp_stats.overloads_avoided,
>> +		pool->sp_stats.threads_timedout);
>> +
>> +	return 0;
>> +}
>> +
>> +static const struct seq_operations svc_pool_stats_seq_ops = {
>> +	.start	= svc_pool_stats_start,
>> +	.next	= svc_pool_stats_next,
>> +	.stop	= svc_pool_stats_stop,
>> +	.show	= svc_pool_stats_show,
>> +};
>> +
>> +int svc_pool_stats_open(struct svc_serv *serv, struct file *file)
>> +{
>> +	int err;
>> +
>> +	err = seq_open(file, &svc_pool_stats_seq_ops);
>> +	if (!err)
>> +		((struct seq_file *) file->private_data)->private = serv;
>> +	return err;
>> +}
>> +EXPORT_SYMBOL(svc_pool_stats_open);
>> +
>> +/*----------------------------------------------------------------------------*/
>>
>> -- 
>> Greg Banks, P.Engineer, SGI Australian Software Group.
>> the brightly coloured sporks of revolution.
>> I don't speak for SGI.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thread overview: 25+ messages
2009-01-13 10:26 [patch 0/3] First tranche of SGI Enhanced NFS patches Greg Banks
2009-01-13 10:26 ` [patch 1/3] knfsd: remove the nfsd thread busy histogram Greg Banks
2009-01-13 16:41   ` Chuck Lever
2009-01-13 22:50     ` Greg Banks
     [not found]       ` <496D1ACC.7070106-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-02-11 21:59         ` J. Bruce Fields
2009-01-13 10:26 ` [patch 2/3] knfsd: avoid overloading the CPU scheduler with enormous load averages Greg Banks
2009-01-13 14:33   ` Peter Staubach
2009-01-13 22:15     ` Greg Banks
     [not found]       ` <496D1294.1060407-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2009-01-13 22:35         ` Peter Staubach
2009-01-13 23:04           ` Greg Banks
2009-02-11 23:10   ` J. Bruce Fields
2009-02-19  6:25     ` Greg Banks
2009-03-15 21:21       ` J. Bruce Fields
2009-03-16  3:10         ` Greg Banks
2009-01-13 10:26 ` [patch 3/3] knfsd: add file to export stats about nfsd pools Greg Banks
2009-02-12 17:11   ` J. Bruce Fields
2009-02-13  1:53     ` Kevin Constantine [this message]
2009-02-19  7:04       ` Greg Banks
2009-02-19  6:42     ` Greg Banks
2009-03-15 21:25       ` J. Bruce Fields
2009-03-16  3:21         ` Greg Banks
2009-03-16 13:37           ` J. Bruce Fields
2009-02-09  5:24 ` [patch 0/3] First tranche of SGI Enhanced NFS patches Greg Banks
2009-02-09 20:47   ` J. Bruce Fields
2009-02-09 23:26     ` Greg Banks
