[PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking

linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
@ 2012-10-01 11:51 Jeff Layton
  2012-10-01 11:51 ` [PATCH 1/2] nfsd: add a usermodehelper upcall for NFSv4 client ID tracking Jeff Layton
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jeff Layton @ 2012-10-01 11:51 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, bharrosh, steved, skinsbursky

A few months ago I did some patches to add an upcall to replace the
legacy clientid tracking scheme in nfsd. At the time, several people
including Boaz and Steve expressed that they would prefer an upcall that
used call_usermodehelper instead of requiring a running daemon.

While I'm loath to add yet another upcall to the kernel, I must admit
that they have a good point. Yet another running daemon for something as
infrequently called as this is less than ideal. The idiot-proofness of a
usermodehelper upcall is hard to beat for the "just works" factor.

This patchset adds a new set of client tracking ops to the kernel that
use call_usermodehelper to exec a program in userland to do its bidding.
I'll also be posting a set of nfs-utils patches soon to add the callout
program for this.

This seems to work as expected and is fairly simple. There are a couple
of lingering issues.

- Do we need to switch to a different mount namespace somehow based on
  the net namespace? Eventually we want to allow nfsd to run in a
  container. At some point we'll need a mechanism to ensure that the
  upcall runs within the correct container. Stanislav, I'd appreciate
  your input here...

- What about nfsdcld? I don't believe any distros have deployed it yet,
  and there aren't a lot of compelling reasons to stick with it if this
  patchset seems reasonable. I suggest we plan to deprecate it in a couple
  of releases unless there are objections.

Unless anyone sees major problems with this approach, I think these
patches are reasonable for 3.7.

Jeff Layton (2):
  nfsd: add a usermodehelper upcall for NFSv4 client ID tracking
  nfsd: change algorithm for selecting the client_tracking_ops

 fs/nfsd/nfs4recover.c | 174 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 164 insertions(+), 10 deletions(-)

-- 
1.7.11.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] nfsd: add a usermodehelper upcall for NFSv4 client ID tracking
  2012-10-01 11:51 [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Jeff Layton
@ 2012-10-01 11:51 ` Jeff Layton
  2012-10-01 11:51 ` [PATCH 2/2] nfsd: change algorithm for selecting the client_tracking_ops Jeff Layton
  2012-10-01 14:05 ` [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Stanislav Kinsbursky
  2 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2012-10-01 11:51 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, bharrosh, steved, skinsbursky

Add a new client tracker upcall type that uses call_usermodehelper to
call out to a program. This seems to be the preferred method of
calling out to usermode these days for seldom-called upcalls. It's
simple and doesn't require a running daemon, so it should "just work"
as long as the binary is installed.

The client tracking exit operation is also changed to check for a
NULL pointer before running. The UMH upcall doesn't need to do anything
at module teardown time.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfs4recover.c | 138 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 137 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 43295d4..d8e07d5 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -926,6 +926,141 @@ static struct nfsd4_client_tracking_ops nfsd4_cld_tracking_ops = {
 	.grace_done	= nfsd4_cld_grace_done,
 };
 
+/* upcall via usermodehelper */
+static char cltrack_prog[PATH_MAX] = "/sbin/nfsdcltrack";
+module_param_string(cltrack_prog, cltrack_prog, sizeof(cltrack_prog), 0600);
+MODULE_PARM_DESC(cltrack_prog, "Path to the nfsdcltrack upcall program");
+
+static int
+nfsd4_cltrack_upcall(char *cmd, char *arg)
+{
+	static char *envp[] = { "HOME=/",
+		"TERM=linux",
+		"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
+		NULL
+	};
+	char *argv[4];
+	int ret;
+
+	if (unlikely(!cltrack_prog[0])) {
+		dprintk("%s: cltrack_prog is disabled\n", __func__);
+		return -EACCES;
+	}
+
+	dprintk("%s: cmd: %s\n", __func__, cmd);
+	dprintk("%s: arg: %s\n", __func__, arg ? arg : "(null)");
+
+	argv[0] = (char *)cltrack_prog;
+	argv[1] = cmd;
+	argv[2] = arg;
+	argv[3] = NULL;
+
+	ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
+	/*
+	 * Disable the upcall mechanism if we're getting an ENOENT
+	 * error. The admin can re-enable it on the fly by using sysfs
+	 * once the problem has been fixed.
+	 */
+	if (ret == -ENOENT || ret == -EACCES) {
+		printk(KERN_ERR "NFSD: %s was not found or isn't executable. "
+			"Please reset nfsd.cltrack_prog module parameter "
+			"once problem is resolved (%d)!\n",
+			cltrack_prog, ret);
+		cltrack_prog[0] = '\0';
+	}
+	dprintk("%s: %s return value: %d\n", __func__, cltrack_prog, ret);
+
+	return ret;
+}
+
+static char *
+bin_to_hex_dup(const unsigned char *src, int srclen)
+{
+	int i;
+	char *buf, *hex;
+
+	/* +1 for terminating NULL */
+	buf = kmalloc((srclen * 2) + 1, GFP_KERNEL);
+	if (!buf)
+		return buf;
+
+	hex = buf;
+	for (i = 0; i < srclen; i++) {
+		sprintf(hex, "%2.2x", *src++);
+		hex += 2;
+	}
+	return buf;
+}
+
+static int
+nfsd4_umh_cltrack_init(struct net __attribute__((unused)) *net)
+{
+	return nfsd4_cltrack_upcall("init", NULL);
+}
+
+static void
+nfsd4_umh_cltrack_create(struct nfs4_client *clp)
+{
+	char *hexid;
+
+	hexid = bin_to_hex_dup(clp->cl_name.data, clp->cl_name.len);
+	if (!hexid) {
+		dprintk("%s: can't allocate memory for upcall!\n", __func__);
+		return;
+	}
+	nfsd4_cltrack_upcall("create", hexid);
+	kfree(hexid);
+}
+
+static void
+nfsd4_umh_cltrack_remove(struct nfs4_client *clp)
+{
+	char *hexid;
+
+	hexid = bin_to_hex_dup(clp->cl_name.data, clp->cl_name.len);
+	if (!hexid) {
+		dprintk("%s: can't allocate memory for upcall!\n", __func__);
+		return;
+	}
+	nfsd4_cltrack_upcall("remove", hexid);
+	kfree(hexid);
+}
+
+static int
+nfsd4_umh_cltrack_check(struct nfs4_client *clp)
+{
+	int ret;
+	char *hexid;
+
+	hexid = bin_to_hex_dup(clp->cl_name.data, clp->cl_name.len);
+	if (!hexid) {
+		dprintk("%s: can't allocate memory for upcall!\n", __func__);
+		return -ENOMEM;
+	}
+	ret = nfsd4_cltrack_upcall("check", hexid);
+	kfree(hexid);
+	return ret;
+}
+
+static void
+nfsd4_umh_cltrack_grace_done(struct net __attribute__((unused)) *net,
+				time_t boot_time)
+{
+	char timestr[22]; /* FIXME: better way to determine max size? */
+
+	sprintf(timestr, "%ld", boot_time);
+	nfsd4_cltrack_upcall("gracedone", timestr);
+}
+
+static struct nfsd4_client_tracking_ops nfsd4_umh_tracking_ops = {
+	.init		= nfsd4_umh_cltrack_init,
+	.exit		= NULL,
+	.create		= nfsd4_umh_cltrack_create,
+	.remove		= nfsd4_umh_cltrack_remove,
+	.check		= nfsd4_umh_cltrack_check,
+	.grace_done	= nfsd4_umh_cltrack_grace_done,
+};
+
 int
 nfsd4_client_tracking_init(struct net *net)
 {
@@ -956,7 +1091,8 @@ void
 nfsd4_client_tracking_exit(struct net *net)
 {
 	if (client_tracking_ops) {
-		client_tracking_ops->exit(net);
+		if (client_tracking_ops->exit)
+			client_tracking_ops->exit(net);
 		client_tracking_ops = NULL;
 	}
 }
-- 
1.7.11.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] nfsd: change algorithm for selecting the client_tracking_ops
  2012-10-01 11:51 [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Jeff Layton
  2012-10-01 11:51 ` [PATCH 1/2] nfsd: add a usermodehelper upcall for NFSv4 client ID tracking Jeff Layton
@ 2012-10-01 11:51 ` Jeff Layton
  2012-10-01 14:05 ` [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Stanislav Kinsbursky
  2 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2012-10-01 11:51 UTC (permalink / raw)
  To: bfields; +Cc: linux-nfs, bharrosh, steved, skinsbursky

If the legacy tracking dir exists, use that. Next try the usermodehelper
upcall. It should succeed or fail quickly, so trying it first should be
harmless. If that fails, fall back to using nfsdcld.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfs4recover.c | 36 +++++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index d8e07d5..3584313 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -1067,17 +1067,35 @@ nfsd4_client_tracking_init(struct net *net)
 	int status;
 	struct path path;
 
-	if (!client_tracking_ops) {
-		client_tracking_ops = &nfsd4_cld_tracking_ops;
-		status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
-		if (!status) {
-			if (S_ISDIR(path.dentry->d_inode->i_mode))
-				client_tracking_ops =
-						&nfsd4_legacy_tracking_ops;
-			path_put(&path);
-		}
+	/* just run the init if it the method is already decided */
+	if (client_tracking_ops)
+		goto do_init;
+
+	/*
+	 * See if the recoverydir exists and is a directory. If it is,
+	 * then use the legacy ops.
+	 */
+	client_tracking_ops = &nfsd4_legacy_tracking_ops;
+	status = kern_path(nfs4_recoverydir(), LOOKUP_FOLLOW, &path);
+	if (!status) {
+		status = S_ISDIR(path.dentry->d_inode->i_mode);
+		path_put(&path);
+		if (status)
+			goto do_init;
 	}
 
+	/*
+	 * Next, try a UMH upcall. It should succeed or fail quickly, so
+	 * there's little harm in trying that first.
+	 */
+	client_tracking_ops = &nfsd4_umh_tracking_ops;
+	status = client_tracking_ops->init(net);
+	if (!status)
+		return status;
+
+	/* Finally, try to use nfsdcld */
+	client_tracking_ops = &nfsd4_cld_tracking_ops;
+do_init:
 	status = client_tracking_ops->init(net);
 	if (status) {
 		printk(KERN_WARNING "NFSD: Unable to initialize client "
-- 
1.7.11.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 11:51 [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Jeff Layton
  2012-10-01 11:51 ` [PATCH 1/2] nfsd: add a usermodehelper upcall for NFSv4 client ID tracking Jeff Layton
  2012-10-01 11:51 ` [PATCH 2/2] nfsd: change algorithm for selecting the client_tracking_ops Jeff Layton
@ 2012-10-01 14:05 ` Stanislav Kinsbursky
  2012-10-01 14:12   ` bfields
  2 siblings, 1 reply; 10+ messages in thread
From: Stanislav Kinsbursky @ 2012-10-01 14:05 UTC (permalink / raw)
  To: Jeff Layton
  Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org,
	bharrosh@panasas.com, steved@redhat.com

01.10.2012 15:51, Jeff Layton пишет:
> - Do we need to switch to a different mount namespace somehow based on
>    the net namespace? Eventually we want to allow nfsd to run in a
>    container. At some point we'll need a mechanism to ensure that the
>    upcall runs within the correct container. Stanislav, I'd appreciate
>    your input here...

Hello, Jeff.
As far as I can see - yes, we have to switch to desired mount namespace in 
kernel (because we can't do this in user namespace).
Moreover, for NFSd mount namespace looks even more important than net namespace. 
I.e. we can't share one NFSd between mount namespaces (at least, I don't 
understand how to do this safely).
So, if we are going to have NFSd per container, we have to tie NFSd and mount 
namespace somehow.
Having one NFSd for more than one network namespace looks feasible. But we have 
some races with creation and destruction of service per-net transports, and 
Bruce was suggesting to think about maybe we can just simplify all that stuff 
and create service per network namespace.

BTW, is it possible to split service (trasports, etc.) and it's "executors" 
(kernel threads)? If we could do it, then it would be really easy to create as 
many services (with it's unique namespaces combination) as required and teach 
"executors" to switch to desired namespaces while handling service requests.

-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 14:05 ` [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Stanislav Kinsbursky
@ 2012-10-01 14:12   ` bfields
  2012-10-01 14:30     ` Stanislav Kinsbursky
  0 siblings, 1 reply; 10+ messages in thread
From: bfields @ 2012-10-01 14:12 UTC (permalink / raw)
  To: Stanislav Kinsbursky
  Cc: Jeff Layton, linux-nfs@vger.kernel.org, bharrosh@panasas.com,
	steved@redhat.com

On Mon, Oct 01, 2012 at 06:05:57PM +0400, Stanislav Kinsbursky wrote:
> 01.10.2012 15:51, Jeff Layton пишет:
> >- Do we need to switch to a different mount namespace somehow based on
> >   the net namespace? Eventually we want to allow nfsd to run in a
> >   container. At some point we'll need a mechanism to ensure that the
> >   upcall runs within the correct container. Stanislav, I'd appreciate
> >   your input here...
> 
> Hello, Jeff.
> As far as I can see - yes, we have to switch to desired mount
> namespace in kernel (because we can't do this in user namespace).
> Moreover, for NFSd mount namespace looks even more important than
> net namespace. I.e. we can't share one NFSd between mount namespaces
> (at least, I don't understand how to do this safely).
> So, if we are going to have NFSd per container, we have to tie NFSd
> and mount namespace somehow.
> Having one NFSd for more than one network namespace looks feasible.
> But we have some races with creation and destruction of service
> per-net transports, and Bruce was suggesting to think about maybe we
> can just simplify all that stuff and create service per network
> namespace.
> 
> BTW, is it possible to split service (trasports, etc.) and it's
> "executors" (kernel threads)? If we could do it, then it would be
> really easy to create as many services (with it's unique namespaces
> combination) as required and teach "executors" to switch to desired
> namespaces while handling service requests.

I'm sure it is possible, and would be more efficient than requiring
(number of containers) * (number of threads per container) threads.

But I think it's also a little more complicated to do, and that to start
off we'd be better off just keeping the services entirely separate.

--b.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 14:12   ` bfields
@ 2012-10-01 14:30     ` Stanislav Kinsbursky
  2012-10-01 14:48       ` bfields
  2012-10-01 14:52       ` Jeff Layton
  0 siblings, 2 replies; 10+ messages in thread
From: Stanislav Kinsbursky @ 2012-10-01 14:30 UTC (permalink / raw)
  To: bfields@fieldses.org
  Cc: Jeff Layton, linux-nfs@vger.kernel.org, bharrosh@panasas.com,
	steved@redhat.com

01.10.2012 18:12, bfields@fieldses.org пишет:
> On Mon, Oct 01, 2012 at 06:05:57PM +0400, Stanislav Kinsbursky wrote:
>> 01.10.2012 15:51, Jeff Layton пишет:
>>> - Do we need to switch to a different mount namespace somehow based on
>>>    the net namespace? Eventually we want to allow nfsd to run in a
>>>    container. At some point we'll need a mechanism to ensure that the
>>>    upcall runs within the correct container. Stanislav, I'd appreciate
>>>    your input here...
>>
>> Hello, Jeff.
>> As far as I can see - yes, we have to switch to desired mount
>> namespace in kernel (because we can't do this in user namespace).
>> Moreover, for NFSd mount namespace looks even more important than
>> net namespace. I.e. we can't share one NFSd between mount namespaces
>> (at least, I don't understand how to do this safely).
>> So, if we are going to have NFSd per container, we have to tie NFSd
>> and mount namespace somehow.
>> Having one NFSd for more than one network namespace looks feasible.
>> But we have some races with creation and destruction of service
>> per-net transports, and Bruce was suggesting to think about maybe we
>> can just simplify all that stuff and create service per network
>> namespace.
>>
>> BTW, is it possible to split service (trasports, etc.) and it's
>> "executors" (kernel threads)? If we could do it, then it would be
>> really easy to create as many services (with it's unique namespaces
>> combination) as required and teach "executors" to switch to desired
>> namespaces while handling service requests.
>
> I'm sure it is possible, and would be more efficient than requiring
> (number of containers) * (number of threads per container) threads.
>
> But I think it's also a little more complicated to do, and that to start
> off we'd be better off just keeping the services entirely separate.
>

At first sight, this two solutions (split of service/threads and service per 
namespace) doesn't look like they can share a lot of code base.
I believe, that we have to think about mount namespaces during NFSd 
containerization. And, if I understand you right, when you say about "entirely 
separated services", you mean separated by networks namespace.
Separations per mount namespace as a second step will take actually the same 
time and amount of code, as it would be a first step.

So, as I can see it, the real question is: do we really want limited (only 
per-net) NFSd containerisation as soon as possible (and in this case it's much 
simpler to create NFSd/Lockd service and it's threads per network namespace) or 
we should skip this step and concentrate on proper implementation from the very 
beginning (split of service and it's threads and tie NFSd to mount namespace)?

> --b.
>


-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 14:30     ` Stanislav Kinsbursky
@ 2012-10-01 14:48       ` bfields
  2012-10-01 15:47         ` Stanislav Kinsbursky
  2012-10-01 14:52       ` Jeff Layton
  1 sibling, 1 reply; 10+ messages in thread
From: bfields @ 2012-10-01 14:48 UTC (permalink / raw)
  To: Stanislav Kinsbursky
  Cc: Jeff Layton, linux-nfs@vger.kernel.org, bharrosh@panasas.com,
	steved@redhat.com

On Mon, Oct 01, 2012 at 06:30:18PM +0400, Stanislav Kinsbursky wrote:
> 01.10.2012 18:12, bfields@fieldses.org пишет:
> >On Mon, Oct 01, 2012 at 06:05:57PM +0400, Stanislav Kinsbursky wrote:
> >>01.10.2012 15:51, Jeff Layton пишет:
> >>>- Do we need to switch to a different mount namespace somehow based on
> >>>   the net namespace? Eventually we want to allow nfsd to run in a
> >>>   container. At some point we'll need a mechanism to ensure that the
> >>>   upcall runs within the correct container. Stanislav, I'd appreciate
> >>>   your input here...
> >>
> >>Hello, Jeff.
> >>As far as I can see - yes, we have to switch to desired mount
> >>namespace in kernel (because we can't do this in user namespace).
> >>Moreover, for NFSd mount namespace looks even more important than
> >>net namespace. I.e. we can't share one NFSd between mount namespaces
> >>(at least, I don't understand how to do this safely).
> >>So, if we are going to have NFSd per container, we have to tie NFSd
> >>and mount namespace somehow.
> >>Having one NFSd for more than one network namespace looks feasible.
> >>But we have some races with creation and destruction of service
> >>per-net transports, and Bruce was suggesting to think about maybe we
> >>can just simplify all that stuff and create service per network
> >>namespace.
> >>
> >>BTW, is it possible to split service (trasports, etc.) and it's
> >>"executors" (kernel threads)? If we could do it, then it would be
> >>really easy to create as many services (with it's unique namespaces
> >>combination) as required and teach "executors" to switch to desired
> >>namespaces while handling service requests.
> >
> >I'm sure it is possible, and would be more efficient than requiring
> >(number of containers) * (number of threads per container) threads.
> >
> >But I think it's also a little more complicated to do, and that to start
> >off we'd be better off just keeping the services entirely separate.
> >
> 
> At first sight, this two solutions (split of service/threads and
> service per namespace) doesn't look like they can share a lot of
> code base.
> I believe, that we have to think about mount namespaces during NFSd
> containerization. And, if I understand you right, when you say about
> "entirely separated services", you mean separated by networks
> namespace.
> Separations per mount namespace as a second step will take actually
> the same time and amount of code, as it would be a first step.
> 
> So, as I can see it, the real question is: do we really want limited
> (only per-net) NFSd containerisation as soon as possible (and in
> this case it's much simpler to create NFSd/Lockd service and it's
> threads per network namespace) or we should skip this step and
> concentrate on proper implementation from the very beginning (split
> of service and it's threads and tie NFSd to mount namespace)?

Sorry, I'm not following you:

	- tying to network namespace vs tying to mount namespace is a
	  completely separate issue from splitting service and threads,
	  right?
	- in detail, what would be the difference between tying to
	  network namespace and mount namespace?  How do you see
	  tying to mount namespace working, and why is that better?

--b.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 14:30     ` Stanislav Kinsbursky
  2012-10-01 14:48       ` bfields
@ 2012-10-01 14:52       ` Jeff Layton
  2012-10-01 15:49         ` Stanislav Kinsbursky
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2012-10-01 14:52 UTC (permalink / raw)
  To: Stanislav Kinsbursky
  Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org,
	bharrosh@panasas.com, steved@redhat.com

On Mon, 1 Oct 2012 18:30:18 +0400
Stanislav Kinsbursky <skinsbursky@parallels.com> wrote:

> 01.10.2012 18:12, bfields@fieldses.org пишет:
> > On Mon, Oct 01, 2012 at 06:05:57PM +0400, Stanislav Kinsbursky wrote:
> >> 01.10.2012 15:51, Jeff Layton пишет:
> >>> - Do we need to switch to a different mount namespace somehow based on
> >>>    the net namespace? Eventually we want to allow nfsd to run in a
> >>>    container. At some point we'll need a mechanism to ensure that the
> >>>    upcall runs within the correct container. Stanislav, I'd appreciate
> >>>    your input here...
> >>
> >> Hello, Jeff.
> >> As far as I can see - yes, we have to switch to desired mount
> >> namespace in kernel (because we can't do this in user namespace).
> >> Moreover, for NFSd mount namespace looks even more important than
> >> net namespace. I.e. we can't share one NFSd between mount namespaces
> >> (at least, I don't understand how to do this safely).
> >> So, if we are going to have NFSd per container, we have to tie NFSd
> >> and mount namespace somehow.
> >> Having one NFSd for more than one network namespace looks feasible.
> >> But we have some races with creation and destruction of service
> >> per-net transports, and Bruce was suggesting to think about maybe we
> >> can just simplify all that stuff and create service per network
> >> namespace.
> >>
> >> BTW, is it possible to split service (trasports, etc.) and it's
> >> "executors" (kernel threads)? If we could do it, then it would be
> >> really easy to create as many services (with it's unique namespaces
> >> combination) as required and teach "executors" to switch to desired
> >> namespaces while handling service requests.
> >
> > I'm sure it is possible, and would be more efficient than requiring
> > (number of containers) * (number of threads per container) threads.
> >
> > But I think it's also a little more complicated to do, and that to start
> > off we'd be better off just keeping the services entirely separate.
> >
> 
> At first sight, this two solutions (split of service/threads and service per 
> namespace) doesn't look like they can share a lot of code base.
> I believe, that we have to think about mount namespaces during NFSd 
> containerization. And, if I understand you right, when you say about "entirely 
> separated services", you mean separated by networks namespace.
> Separations per mount namespace as a second step will take actually the same 
> time and amount of code, as it would be a first step.
> 
> So, as I can see it, the real question is: do we really want limited (only 
> per-net) NFSd containerisation as soon as possible (and in this case it's much 
> simpler to create NFSd/Lockd service and it's threads per network namespace) or 
> we should skip this step and concentrate on proper implementation from the very 
> beginning (split of service and it's threads and tie NFSd to mount namespace)?
> 

Ok, to bring the discussion back around, I'll note that the current
legacy clientid tracking code doesn't pay any attention to namespaces
either. So I don't think we're missing much by punting on that just
yet.

Once you've hashed out the mount namespace design, we should be able to
make it pay attention to that that without too much trouble.
-- 
Jeff Layton <jlayton@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 14:48       ` bfields
@ 2012-10-01 15:47         ` Stanislav Kinsbursky
  0 siblings, 0 replies; 10+ messages in thread
From: Stanislav Kinsbursky @ 2012-10-01 15:47 UTC (permalink / raw)
  To: bfields@fieldses.org
  Cc: Jeff Layton, linux-nfs@vger.kernel.org, bharrosh@panasas.com,
	steved@redhat.com

01.10.2012 18:48, bfields@fieldses.org пишет:
> On Mon, Oct 01, 2012 at 06:30:18PM +0400, Stanislav Kinsbursky wrote:
>> 01.10.2012 18:12, bfields@fieldses.org пишет:
>>> On Mon, Oct 01, 2012 at 06:05:57PM +0400, Stanislav Kinsbursky wrote:
>>>> 01.10.2012 15:51, Jeff Layton пишет:
>>>>> - Do we need to switch to a different mount namespace somehow based on
>>>>>    the net namespace? Eventually we want to allow nfsd to run in a
>>>>>    container. At some point we'll need a mechanism to ensure that the
>>>>>    upcall runs within the correct container. Stanislav, I'd appreciate
>>>>>    your input here...
>>>>
>>>> Hello, Jeff.
>>>> As far as I can see - yes, we have to switch to desired mount
>>>> namespace in kernel (because we can't do this in user namespace).
>>>> Moreover, for NFSd mount namespace looks even more important than
>>>> net namespace. I.e. we can't share one NFSd between mount namespaces
>>>> (at least, I don't understand how to do this safely).
>>>> So, if we are going to have NFSd per container, we have to tie NFSd
>>>> and mount namespace somehow.
>>>> Having one NFSd for more than one network namespace looks feasible.
>>>> But we have some races with creation and destruction of service
>>>> per-net transports, and Bruce was suggesting to think about maybe we
>>>> can just simplify all that stuff and create service per network
>>>> namespace.
>>>>
>>>> BTW, is it possible to split service (trasports, etc.) and it's
>>>> "executors" (kernel threads)? If we could do it, then it would be
>>>> really easy to create as many services (with it's unique namespaces
>>>> combination) as required and teach "executors" to switch to desired
>>>> namespaces while handling service requests.
>>>
>>> I'm sure it is possible, and would be more efficient than requiring
>>> (number of containers) * (number of threads per container) threads.
>>>
>>> But I think it's also a little more complicated to do, and that to start
>>> off we'd be better off just keeping the services entirely separate.
>>>
>>
>> At first sight, this two solutions (split of service/threads and
>> service per namespace) doesn't look like they can share a lot of
>> code base.
>> I believe, that we have to think about mount namespaces during NFSd
>> containerization. And, if I understand you right, when you say about
>> "entirely separated services", you mean separated by networks
>> namespace.
>> Separations per mount namespace as a second step will take actually
>> the same time and amount of code, as it would be a first step.
>>
>> So, as I can see it, the real question is: do we really want limited
>> (only per-net) NFSd containerisation as soon as possible (and in
>> this case it's much simpler to create NFSd/Lockd service and it's
>> threads per network namespace) or we should skip this step and
>> concentrate on proper implementation from the very beginning (split
>> of service and it's threads and tie NFSd to mount namespace)?
>
> Sorry, I'm not following you:
>
> 	- tying to network namespace vs tying to mount namespace is a
> 	  completely separate issue from splitting service and threads,
> 	  right?

Not really, I guess. I.e. we already tied service to network namespace. All we 
need to implement fully separated per network namespace services is to drop 
service per-net resources allocation and replace it by full service per-net 
creation network namespace (including staring new kthreads).

> 	- in detail, what would be the difference between tying to
> 	  network namespace and mount namespace?  How do you see
> 	  tying to mount namespace working, and why is that better?
>

If we want to tie NFSd also (!) to mount namespace, then we can't do that 
easily. And the simplest (and currently looks suitable) way I see so far is to:

1) separate service network part from service process part and
2) tie service network part to mount (!) namespace, leaving service process part 
in main pid namespace.

The issue is not in mount namespace itself,  but in second namespace to tie to. 
And it looks like for NFSd mount namespace is more important, because we can 
share it between service instances.
This means also, that one mount namespace could be used by more that one NFSd 
service in case their network namespaces are different.

> --b.
>


-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking
  2012-10-01 14:52       ` Jeff Layton
@ 2012-10-01 15:49         ` Stanislav Kinsbursky
  0 siblings, 0 replies; 10+ messages in thread
From: Stanislav Kinsbursky @ 2012-10-01 15:49 UTC (permalink / raw)
  To: Jeff Layton
  Cc: bfields@fieldses.org, linux-nfs@vger.kernel.org,
	bharrosh@panasas.com, steved@redhat.com

01.10.2012 18:52, Jeff Layton пишет:
> On Mon, 1 Oct 2012 18:30:18 +0400
> Stanislav Kinsbursky <skinsbursky@parallels.com> wrote:
>
>> 01.10.2012 18:12, bfields@fieldses.org пишет:
>>> On Mon, Oct 01, 2012 at 06:05:57PM +0400, Stanislav Kinsbursky wrote:
>>>> 01.10.2012 15:51, Jeff Layton пишет:
>>>>> - Do we need to switch to a different mount namespace somehow based on
>>>>>     the net namespace? Eventually we want to allow nfsd to run in a
>>>>>     container. At some point we'll need a mechanism to ensure that the
>>>>>     upcall runs within the correct container. Stanislav, I'd appreciate
>>>>>     your input here...
>>>>
>>>> Hello, Jeff.
>>>> As far as I can see - yes, we have to switch to desired mount
>>>> namespace in kernel (because we can't do this in user namespace).
>>>> Moreover, for NFSd mount namespace looks even more important than
>>>> net namespace. I.e. we can't share one NFSd between mount namespaces
>>>> (at least, I don't understand how to do this safely).
>>>> So, if we are going to have NFSd per container, we have to tie NFSd
>>>> and mount namespace somehow.
>>>> Having one NFSd for more than one network namespace looks feasible.
>>>> But we have some races with creation and destruction of service
>>>> per-net transports, and Bruce was suggesting to think about maybe we
>>>> can just simplify all that stuff and create service per network
>>>> namespace.
>>>>
>>>> BTW, is it possible to split service (trasports, etc.) and it's
>>>> "executors" (kernel threads)? If we could do it, then it would be
>>>> really easy to create as many services (with it's unique namespaces
>>>> combination) as required and teach "executors" to switch to desired
>>>> namespaces while handling service requests.
>>>
>>> I'm sure it is possible, and would be more efficient than requiring
>>> (number of containers) * (number of threads per container) threads.
>>>
>>> But I think it's also a little more complicated to do, and that to start
>>> off we'd be better off just keeping the services entirely separate.
>>>
>>
>> At first sight, this two solutions (split of service/threads and service per
>> namespace) doesn't look like they can share a lot of code base.
>> I believe, that we have to think about mount namespaces during NFSd
>> containerization. And, if I understand you right, when you say about "entirely
>> separated services", you mean separated by networks namespace.
>> Separations per mount namespace as a second step will take actually the same
>> time and amount of code, as it would be a first step.
>>
>> So, as I can see it, the real question is: do we really want limited (only
>> per-net) NFSd containerisation as soon as possible (and in this case it's much
>> simpler to create NFSd/Lockd service and it's threads per network namespace) or
>> we should skip this step and concentrate on proper implementation from the very
>> beginning (split of service and it's threads and tie NFSd to mount namespace)?
>>
>
> Ok, to bring the discussion back around, I'll note that the current
> legacy clientid tracking code doesn't pay any attention to namespaces
> either. So I don't think we're missing much by punting on that just
> yet.
>

Have nothing to object to.

> Once you've hashed out the mount namespace design, we should be able to
> make it pay attention to that that without too much trouble.
>

Thanks. Now I've realised, that I'm the one who has to do it. :)

-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-10-01 15:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-01 11:51 [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Jeff Layton
2012-10-01 11:51 ` [PATCH 1/2] nfsd: add a usermodehelper upcall for NFSv4 client ID tracking Jeff Layton
2012-10-01 11:51 ` [PATCH 2/2] nfsd: change algorithm for selecting the client_tracking_ops Jeff Layton
2012-10-01 14:05 ` [PATCH 0/2] nfsd: add a usermodehelper upcall for client id tracking Stanislav Kinsbursky
2012-10-01 14:12   ` bfields
2012-10-01 14:30     ` Stanislav Kinsbursky
2012-10-01 14:48       ` bfields
2012-10-01 15:47         ` Stanislav Kinsbursky
2012-10-01 14:52       ` Jeff Layton
2012-10-01 15:49         ` Stanislav Kinsbursky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).