All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Kinsbursky <skinsbursky@parallels.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-nfs@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<devel@openvz.org>
Subject: Re: [Devel] [PATCH 2/6] nfsd: swap fs root in NFSd kthreads
Date: Mon, 14 Jan 2013 10:17:59 +0400	[thread overview]
Message-ID: <50F3A317.9040604@parallels.com> (raw)
In-Reply-To: <20130111172015.GG17909@fieldses.org>

Thanks!

11.01.2013 21:20, J. Bruce Fields пишет:
> On Fri, Jan 11, 2013 at 12:03:12PM -0500, J. Bruce Fields wrote:
>> On Fri, Jan 11, 2013 at 06:56:58PM +0400, Stanislav Kinsbursky wrote:
>>> 11.12.2012 19:35, J. Bruce Fields пишет:
>>>> On Tue, Dec 11, 2012 at 10:20:36AM -0500, J. Bruce Fields wrote:
>>>>> On Tue, Dec 11, 2012 at 07:07:00PM +0400, Stanislav Kinsbursky wrote:
>>>>>> I don't really understand, how  mountd's root can be wrong. I.e.
>>>>>> its' always right as I see it. NFSd kthreads have to swap/use
>>>>>> relative path/whatever to communicate with proper mountd.
>>>>>> Or I'm missing something?
>>>>>
>>>>> Ugh, I see the problem: I thought svc_export_request was called at the
>>>>> time mountd does the read, but instead its done at the time nfsd does
>>>>> the upcall.
>>>>>
>>>>> I suspect that's wrong, and we really want this done in the context of
>>>>> the mountd process when it does the read call.  If d_path is called
>>>>> there then we have no problem.
>>>>
>>>> Right, so I'd be happier if we could modify sunrpc_cache_pipe_upcall to
>>>> skip calling cache_request and instead delay that until cache_read().  I
>>>> think that should be possible.
>>>>
>>>
>>> So, Bruce, what we going to do (or what you want me to do) with the rest of NFSd changes?
>>> I.e. how I should solve this d_path() problem?
>>> I.e. I don't understand what did you mean by "I'd be happier if we could modify sunrpc_cache_pipe_upcall to
>>> skip calling cache_request and instead delay that until cache_read()".
>>> Could you give me a hint?
>>
>> Definitely.  So normally the way these upcalls happen are:
>>
>> 	1. the kernel does a cache lookup, finds no matching item, and
>> 	   calls sunrpc_cache_pipe_upcall().
>> 	2. sunrpc_cache_pipe_upcall() formats the upcall: it allocates a
>> 	   struct cache_request crq and fills crq->buf with the upcall
>> 	   data by calling the cache's ->cache_request() method.
>> 	3. Then rpc.mountd realizes there's data available in
>> 	   /proc/net/rpc/nfsd.fh/content, so it does a read on that file.
>> 	4. cache_read copies the formatted upcall from crq->buf to
>> 	   to userspace.
>>
>> So all I'm suggesting is that instead of calling ->cache_request() at
>> step 2, we do it at step 4.
>>
>> Then cache_request will be called from rpc.mountd's read.  So we'll know
>> which container rpc.mountd is in.
>>
>> Does that make sense?
>
> The following is untested, ugly, and almost certainly insufficient and
> wrong, but maybe it's a starting point:
>
> --b.
>
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 9f84703..f15e4c1 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -744,6 +744,7 @@ struct cache_request {
>   	char			* buf;
>   	int			len;
>   	int			readers;
> +	void (*cache_request)(struct cache_detail *, struct cache_head *, char **, int *);
>   };
>   struct cache_reader {
>   	struct cache_queue	q;
> @@ -785,10 +786,19 @@ static ssize_t cache_read(struct file *filp, char __user *buf, size_t count,
>   	spin_unlock(&queue_lock);
>
>   	if (rp->offset == 0 && !test_bit(CACHE_PENDING, &rq->item->flags)) {
> +		char *bp;
> +		int len = PAGE_SIZE;
> +
>   		err = -EAGAIN;
>   		spin_lock(&queue_lock);
>   		list_move(&rp->q.list, &rq->q.list);
>   		spin_unlock(&queue_lock);
> +
> +		bp = rq->buf;
> +		rq->cache_request(cd, rq->item, &bp, &len);
> +		if (rq->len < 0)
> +			goto out;
> +		rq->len = PAGE_SIZE - len;
>   	} else {
>   		if (rp->offset + count > rq->len)
>   			count = rq->len - rp->offset;
> @@ -1149,8 +1159,6 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
>
>   	char *buf;
>   	struct cache_request *crq;
> -	char *bp;
> -	int len;
>
>   	if (!cache_listeners_exist(detail)) {
>   		warn_no_listener(detail);
> @@ -1167,19 +1175,10 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h,
>   		return -EAGAIN;
>   	}
>
> -	bp = buf; len = PAGE_SIZE;
> -
> -	cache_request(detail, h, &bp, &len);
> -
> -	if (len < 0) {
> -		kfree(buf);
> -		kfree(crq);
> -		return -EAGAIN;
> -	}
> +	crq->cache_request = cache_request;
>   	crq->q.reader = 0;
>   	crq->item = cache_get(h);
>   	crq->buf = buf;
> -	crq->len = PAGE_SIZE - len;
>   	crq->readers = 0;
>   	spin_lock(&queue_lock);
>   	list_add_tail(&crq->q.list, &detail->queue);
>


-- 
Best regards,
Stanislav Kinsbursky

  reply	other threads:[~2013-01-14  6:18 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-06 15:34 [PATCH 0/6] nfsd: make is works in a container Stanislav Kinsbursky
2012-12-06 15:34 ` [PATCH 1/6] nfsd: pass proper net to nfsd_destroy() from NFSd kthreads Stanislav Kinsbursky
2012-12-06 15:34 ` [PATCH 2/6] nfsd: swap fs root in " Stanislav Kinsbursky
2012-12-10 20:28   ` J. Bruce Fields
2012-12-11 14:00     ` Stanislav Kinsbursky
2012-12-11 14:12       ` [Devel] " Stanislav Kinsbursky
2012-12-11 14:51         ` Stanislav Kinsbursky
2012-12-11 14:56         ` J. Bruce Fields
2012-12-11 14:58           ` Al Viro
2012-12-11 15:07           ` Stanislav Kinsbursky
2012-12-11 15:20             ` J. Bruce Fields
2012-12-11 15:35               ` J. Bruce Fields
2012-12-12  7:45                 ` Stanislav Kinsbursky
2013-01-11 14:56                 ` Stanislav Kinsbursky
2013-01-11 17:03                   ` J. Bruce Fields
2013-01-11 17:20                     ` J. Bruce Fields
2013-01-14  6:17                       ` Stanislav Kinsbursky [this message]
2013-01-14  6:08                     ` Stanislav Kinsbursky
2012-12-11 14:54       ` Al Viro
2012-12-11 14:57         ` Stanislav Kinsbursky
2012-12-06 15:34 ` [PATCH 3/6] nfsd: make containerise NFSd filesystem Stanislav Kinsbursky
2012-12-06 15:34 ` [PATCH 4/6] nfsd: use proper net while reading "exports" file Stanislav Kinsbursky
2012-12-06 15:35 ` [PATCH 5/6] nfsd: disable usermode helper client tracker in container Stanislav Kinsbursky
2012-12-06 15:35 ` [PATCH 6/6] nfsd: enable NFSv4 state in containers Stanislav Kinsbursky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50F3A317.9040604@parallels.com \
    --to=skinsbursky@parallels.com \
    --cc=bfields@fieldses.org \
    --cc=devel@openvz.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.