linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Nick Bowler <nbowler@elliptictech.com>,
	LKML Kernel <linux-kernel@vger.kernel.org>,
	"J. Bruce Fields" <bfields@redhat.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: Regression, bisected: sqlite locking failure on nfs
Date: Mon, 1 Nov 2010 17:12:48 -0400	[thread overview]
Message-ID: <597A1A30-E61B-484B-AB79-31AB83AE7A50@oracle.com> (raw)
In-Reply-To: <1288643729.5009.25.camel@heimdal.trondhjem.org>


On Nov 1, 2010, at 4:35 PM, Trond Myklebust wrote:

> On Mon, 2010-11-01 at 15:55 -0400, Chuck Lever wrote:
>> On Nov 1, 2010, at 3:48 PM, Trond Myklebust wrote:
>> 
>>> On Mon, 2010-11-01 at 15:45 -0400, Chuck Lever wrote:
>>>> On Nov 1, 2010, at 3:42 PM, Trond Myklebust wrote:
>>>> 
>>>>> On Mon, 2010-11-01 at 15:22 -0400, Trond Myklebust wrote:
>>>>>> I suspect nlmclnt_lookup_host() is to blame. It appears to be the _only_
>>>>>> thing in the kernel that actually sets this 'srcaddr' field, and it sets
>>>>>> it to
>>>>>> 
>>>>>> const struct sockaddr source = {
>>>>>> 	.sa_family      = AF_UNSPEC,
>>>>>> };
>>>>>> 
>>>>>> You triggered the bug by removing the line
>>>>>> 
>>>>>> 	transport->srcaddr.ss_family = family;
>>>>>> 
>>>>>> from xs_create_sock().
>>>>>> 
>>>>>> Trond
>>>>> 
>>>>> Does this fix the regression?
>>>>> 
>>>>> Trond
>>>>> 
>>>>> ----------------------------------------------------------------------------------------------
>>>>> NLM: Fix a regression in lockd
>>>>> 
>>>>> From: Trond Myklebust <Trond.Myklebust@netapp.com>
>>>>> 
>>>>> Nick Bowler reports:
>>>>> There are no unusual messages on the client... but I just logged into
>>>>> the server and I see lots of messages of the following form:
>>>>> 
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> nfsd: request from insecure port (192.168.8.199:35766)!
>>>>> 
>>>>> Bisected to commit 9247685088398cf21bcb513bd2832b4cd42516c4 (SUNRPC:
>>>>> Properly initialize sock_xprt.srcaddr in all cases)
>>>>> 
>>>>> Apparently, removing the 'transport->srcaddr.ss_family = family' from
>>>>> xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly
>>>>> initialising the srcaddr family to AF_UNSPEC.
>>>>> 
>>>>> Reported-by: Nick Bowler <nbowler@elliptictech.com>
>>>>> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
>>>>> ---
>>>>> 
>>>>> fs/lockd/host.c |    5 -----
>>>>> 1 files changed, 0 insertions(+), 5 deletions(-)
>>>>> 
>>>>> 
>>>>> diff --git a/fs/lockd/host.c b/fs/lockd/host.c
>>>>> index 25e21e4..9ff0c0e 100644
>>>>> --- a/fs/lockd/host.c
>>>>> +++ b/fs/lockd/host.c
>>>>> @@ -238,9 +238,6 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
>>>>> 				     const char *hostname,
>>>>> 				     int noresvport)
>>>>> {
>>>>> -	const struct sockaddr source = {
>>>>> -		.sa_family	= AF_UNSPEC,
>>>>> -	};
>>>>> 	struct nlm_lookup_host_info ni = {
>>>>> 		.server		= 0,
>>>>> 		.sap		= sap,
>>>>> @@ -249,8 +246,6 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
>>>>> 		.version	= version,
>>>>> 		.hostname	= hostname,
>>>>> 		.hostname_len	= strlen(hostname),
>>>>> -		.src_sap	= &source,
>>>>> -		.src_len	= sizeof(source),
>>>>> 		.noresvport	= noresvport,
>>>>> 	};
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> What about that memcpy() in nlm_lookup_host()?  With this patch, wouldn't it be copying garbage into the host's srcaddr field?
>>>> 
>>> 
>>> It shouldn't. ni->src_len is now zero.
>> 
>> Yech.  All this still assumes that ANYADDR is all zeroes, and that the memory this is going into is already initialized to zeroes.  It's asking for trouble if we re-arrange all this someday.
>> 
>> I've got an untested one line patch that should fix this for any upper layer caller.  Posting now.
>> 
> 
> No! Upper layer callers should simply not be setting .saddress. Pretty
> much the only thing that _should_ be setting .saddress is the lockd
> callback, and possibly the nfsv4 server callback.

Then the take 2 patch description should be updated to reflect your actual intent.  I don't think it's especially clear from your patch description why "nlmclnt_lookup_host() ... initialising the srcaddr family to AF_UNSPEC" is incorrect from an architectural standpoint.  The specifics of "fixing bad behavior" are obvious already, but the architectural problem needs to be documented.

Otherwise I'm OK with the general idea of take 2, since it passes in a NULL pointer when no special bind address is desired, as other callers of rpc_create() do.  Of course, take 2 has to pass Nick's test case too.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





  reply	other threads:[~2010-11-01 21:13 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20101101175854.GA3550@elliptictech.com>
2010-11-01 18:07 ` Regression, bisected: sqlite locking failure on nfs Chuck Lever
2010-11-01 18:19   ` Nick Bowler
2010-11-01 18:30     ` Chuck Lever
2010-11-01 19:22       ` Trond Myklebust
2010-11-01 19:42         ` Trond Myklebust
2010-11-01 19:45           ` Chuck Lever
2010-11-01 19:48             ` Trond Myklebust
2010-11-01 19:55               ` Chuck Lever
2010-11-01 20:35                 ` Trond Myklebust
2010-11-01 21:12                   ` Chuck Lever [this message]
2010-11-01 20:09               ` Trond Myklebust
2010-11-01 20:33                 ` Nick Bowler
2010-11-02 13:14                   ` Trond Myklebust
2010-11-02 14:05                     ` Nick Bowler
     [not found]           ` <1288640536.5009.18.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2010-11-01 19:59             ` Nick Bowler
2010-11-01 19:43         ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=597A1A30-E61B-484B-AB79-31AB83AE7A50@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=bfields@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=nbowler@elliptictech.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).