Linux NFS development
 help / color / mirror / Atom feed
From: Greg Banks <gnb@melbourne.sgi.com>
To: Jeff Layton <jlayton@redhat.com>
Cc: linux-nfs@vger.kernel.org, nfsv4@linux-nfs.org
Subject: Re: [PATCH 0/3] [RFC] knfsd: convert to kthread API and	remove	signaling for shutdown
Date: Tue, 20 May 2008 13:36:26 -0700	[thread overview]
Message-ID: <4833364A.4010803@melbourne.sgi.com> (raw)
In-Reply-To: <4831F860.6050801@melbourne.sgi.com>

[-- Attachment #1: Type: text/plain, Size: 606 bytes --]

Greg Banks wrote:
> [...]
> I'll send out the patch I mentioned when we
> chatted at Cthon, maybe you can consider the issue it fixes.
>
>   
Here's the patch.  You probably don't want to do it the same way, but
the intent is to delay the allocation of the svc_rqst structure until
the nfsd is running on the correct cpu so that the allocation comes from
the correct local memory on NUMA systems.  Absent the patch, most of the
nfsds spend a lot of time updating remote memory allocated on node0.

-- 
Greg Banks, P.Engineer, SGI Australian Software Group.
The cake is *not* a lie.
I don't speak for SGI.


[-- Attachment #2: knfsd-allocate-svc-rqst-and-pages-on-correct-nodes-4 --]
[-- Type: text/plain, Size: 4068 bytes --]

When allocating the per-thread NFS server data structures, ensure
the allocation proceeds on the NUMA node where the thread will then
spend all of its life.  This reduces the amount of cross-node traffic
needed to run NFS at high traffic rates on large NUMA machines.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
---

 net/sunrpc/svc.c |   94 ++++++++++++++++++++++++++++++++------------
 1 file changed, 70 insertions(+), 24 deletions(-)

Index: linux-2.6.16/net/sunrpc/svc.c
===================================================================
--- linux-2.6.16.orig/net/sunrpc/svc.c
+++ linux-2.6.16/net/sunrpc/svc.c
@@ -538,30 +538,43 @@ svc_release_buffer(struct svc_rqst *rqst
 	rqstp->rq_argused = 0;
 }
 
+struct svc_create_thread_rec
+{
+	svc_thread_fn func;
+	struct svc_serv *serv;
+	struct svc_pool *pool;
+	struct completion comp;
+};
+
 /*
- * Create a thread in the given pool.  Caller must hold BKL.
- * On a NUMA or SMP machine, with a multi-pool serv, the thread
- * will be restricted to run on the cpus belonging to the pool.
+ * Trampoline function called as the thread function for new nfsd threads.
+ * Performs all the per-thread initialisation in the thread's own context,
+ * so that the thread's memory policy and cpus_allowed mask govern the
+ * allocation of the per-thread structures and the buffer pages used
+ * to do network and disk IO.  For large NUMA machines with a lot of
+ * network and disk bandwidth, getting this page placement right can
+ * double system throughput.
  */
-static int
-__svc_create_thread(svc_thread_fn func, struct svc_serv *serv, struct svc_pool *pool)
+static int __svc_create_thread_tramp(void *data)
 {
+	struct svc_create_thread_rec *rec = data;
+	struct svc_serv *serv = rec->serv;
+	struct svc_pool *pool = rec->pool;
+	int error = -ENOMEM;
 	struct svc_rqst	*rqstp;
-	int		error = -ENOMEM;
-	int		have_oldmask = 0;
-	cpumask_t	oldmask;
 
-	rqstp = kmalloc(sizeof(*rqstp), GFP_KERNEL);
+	rqstp = kzalloc(sizeof(*rqstp), GFP_KERNEL);
 	if (!rqstp)
 		goto out;
 
-	memset(rqstp, 0, sizeof(*rqstp));
 	init_waitqueue_head(&rqstp->rq_wait);
 
-	if (!(rqstp->rq_argp = kmalloc(serv->sv_xdrsize, GFP_KERNEL))
-	 || !(rqstp->rq_resp = kmalloc(serv->sv_xdrsize, GFP_KERNEL))
-	 || !svc_init_buffer(rqstp, serv->sv_bufsz))
-		goto out_thread;
+	if (!(rqstp->rq_argp = kmalloc(serv->sv_xdrsize, GFP_KERNEL)))
+		goto out_argp;
+	if (!(rqstp->rq_resp = kmalloc(serv->sv_xdrsize, GFP_KERNEL)))
+		goto out_resp;
+	if (!svc_init_buffer(rqstp, serv->sv_bufsz))
+		goto out_buffer;
 
 	serv->sv_nrthreads++;
 	pool->sp_nrthreads++;
@@ -571,24 +584,57 @@ __svc_create_thread(svc_thread_fn func, 
 	rqstp->rq_server = serv;
 	rqstp->rq_pool = pool;
 
+	svc_sock_update_bufs(serv);
+
+	complete(&rec->comp);
+
+	rec->func(rqstp);
+	return 0;
+
+out_buffer:
+	svc_release_buffer(rqstp);
+	kfree(rqstp->rq_resp);
+out_resp:
+	kfree(rqstp->rq_argp);
+out_argp:
+	kfree(rqstp);
+out:
+	complete(&rec->comp);
+	return error;
+}
+
+/*
+ * Create a thread in the given pool.  Caller must hold BKL.
+ * On a NUMA or SMP machine, with a multi-pool serv, the thread
+ * will be restricted to run on the cpus belonging to the pool.
+ */
+static int
+__svc_create_thread(svc_thread_fn func, struct svc_serv *serv, struct svc_pool *pool)
+{
+	int		error = -ENOMEM;
+	int		have_oldmask = 0;
+	cpumask_t	oldmask;
+	struct svc_create_thread_rec rec = {
+				.func = func,
+				.serv = serv,
+				.pool = pool
+			};
+	init_completion(&rec.comp);
+
 	if (serv->sv_nrpools > 1)
 		have_oldmask = svc_pool_map_set_cpumask(pool->sp_id, &oldmask);
 
-	error = kernel_thread((int (*)(void *)) func, rqstp, 0);
+	error = kernel_thread(__svc_create_thread_tramp, &rec, 0);
+	if (error > 0)
+		error = 0;
 
 	if (have_oldmask)
 		set_cpus_allowed(current, oldmask);
 
-	if (error < 0)
-		goto out_thread;
-	svc_sock_update_bufs(serv);
-	error = 0;
-out:
-	return error;
+	/* wait for the child thread to finish initialising */
+	wait_for_completion(&rec.comp);
 
-out_thread:
-	svc_exit_thread(rqstp);
-	goto out;
+	return error;
 }
 
 /*

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
NFSv4 mailing list
NFSv4@linux-nfs.org
http://linux-nfs.org/cgi-bin/mailman/listinfo/nfsv4

  parent reply	other threads:[~2008-05-20 20:36 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-05-18  2:35 [PATCH 0/3] [RFC] knfsd: convert to kthread API and remove signaling for shutdown Jeff Layton
2008-05-18  2:35 ` [PATCH 1/3] [RFC] knfsd: convert knfsd to kthread API Jeff Layton
2008-05-18  2:35   ` [PATCH 2/3] [RFC] sunrpc: remove unneeded fields from svc_serv struct Jeff Layton
2008-05-18  2:35     ` [PATCH 3/3] [RFC] knfsd: remove signal defines and extraneous variables Jeff Layton
2008-05-19  6:07 ` [PATCH 0/3] [RFC] knfsd: convert to kthread API and remove signaling for shutdown Neil Brown
     [not found]   ` <18481.6416.571430.593722-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2008-05-19 21:01     ` Jeff Layton
2008-05-19 22:00     ` Greg Banks
2008-05-19 23:52       ` Neil Brown
     [not found]         ` <18482.4782.858347.981553-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2008-05-20  2:04           ` Greg Banks
2008-05-20  2:24         ` Jeff Layton
     [not found]           ` <20080519222457.6f24daa5-PC62bkCOHzGdMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2008-05-20  2:34             ` Greg Banks
2008-05-20 11:05               ` Jeff Layton
     [not found]             ` <483238B3.4010702-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2008-05-20 13:33               ` Talpey, Thomas
2008-05-20  3:13           ` Neil Brown
2008-05-20 11:13             ` Jeff Layton
     [not found]             ` <18482.16837.381955.636390-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2008-05-20 20:26               ` Greg Banks
2008-05-20 20:36       ` Greg Banks [this message]
     [not found]         ` <4833364A.4010803-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2008-05-21  1:48           ` Jeff Layton
     [not found]             ` <20080520214823.576ad7a7-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-05-21  3:29               ` Greg Banks
     [not found]                 ` <48339730.3060206-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2008-05-30 16:25                   ` Jeff Layton
     [not found]                     ` <20080530122517.4f18c48e-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-05-30 18:46                       ` J. Bruce Fields
2008-05-30 20:59                         ` Jeff Layton
2008-06-02  5:51                       ` Greg Banks
     [not found]                         ` <48438A76.6000400-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2008-06-02 10:41                           ` Jeff Layton
     [not found]                             ` <20080602064132.10c69c88-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2008-06-03  3:27                               ` Greg Banks
     [not found]                                 ` <4844BA3C.3010605-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
2008-06-03 10:51                                   ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4833364A.4010803@melbourne.sgi.com \
    --to=gnb@melbourne.sgi.com \
    --cc=jlayton@redhat.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=nfsv4@linux-nfs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox