linux-fsdevel.vger.kernel.org archive mirror
From: dai.ngo@oracle.com
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Bruce Fields <bfields@fieldses.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH RFC v5 0/2] nfsd: Initial implementation of NFSv4 Courteous Server
Date: Mon, 29 Nov 2021 13:01:22 -0800
Message-ID: <cd53324c-be0c-5fd2-e081-8012d4326713@oracle.com>
In-Reply-To: <e1093e42-2871-8810-de76-58d1ea357898@oracle.com>

[-- Attachment #1: Type: text/plain, Size: 6240 bytes --]


On 11/29/21 11:36 AM, dai.ngo@oracle.com wrote:
>
> On 11/29/21 11:03 AM, Chuck Lever III wrote:
>> Hello Dai!
>>
>>
>>> On Nov 29, 2021, at 1:32 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
>>>
>>>
>>> On 11/29/21 9:30 AM, J. Bruce Fields wrote:
>>>> On Mon, Nov 29, 2021 at 09:13:16AM -0800, dai.ngo@oracle.com wrote:
>>>>> Hi Bruce,
>>>>>
>>>>> On 11/21/21 7:04 PM, dai.ngo@oracle.com wrote:
>>>>>> On 11/17/21 4:34 PM, J. Bruce Fields wrote:
>>>>>>> On Wed, Nov 17, 2021 at 01:46:02PM -0800, dai.ngo@oracle.com wrote:
>>>>>>>> On 11/17/21 9:59 AM, dai.ngo@oracle.com wrote:
>>>>>>>>> On 11/17/21 6:14 AM, J. Bruce Fields wrote:
>>>>>>>>>> On Tue, Nov 16, 2021 at 03:06:32PM -0800, dai.ngo@oracle.com wrote:
>>>>>>>>>>> Just a reminder that this patch is still waiting for your review.
>>>>>>>>>> Yeah, I was procrastinating and hoping you'd figure out the pynfs
>>>>>>>>>> failure for me....
>>>>>>>>> Last time I ran 4.0 OPEN18 test by itself and it passed. I will run
>>>>>>>>> all OPEN tests together with 5.15-rc7 to see if the problem you've
>>>>>>>>> seen is still there.
>>>>>>>> I ran all tests in nfsv4.1 and nfsv4.0 with courteous and
>>>>>>>> non-courteous 5.15-rc7 server.
>>>>>>>>
>>>>>>>> Nfs4.1 results are the same for both courteous and
>>>>>>>> non-courteous server:
>>>>>>>>> Of those: 0 Skipped, 0 Failed, 0 Warned, 169 Passed
>>>>>>>> Results of nfs4.0 with non-courteous server:
>>>>>>>>> Of those: 8 Skipped, 1 Failed, 0 Warned, 577 Passed
>>>>>>>> test failed: LOCK24
>>>>>>>>
>>>>>>>> Results of nfs4.0 with courteous server:
>>>>>>>>> Of those: 8 Skipped, 3 Failed, 0 Warned, 575 Passed
>>>>>>>> tests failed: LOCK24, OPEN18, OPEN30
>>>>>>>>
>>>>>>>> OPEN18 and OPEN30 tests pass if each is run by itself.
>>>>>>> Could well be a bug in the tests, I don't know.
>>>>>> The reason OPEN18 failed was because the test timed out waiting for
>>>>>> the reply of an OPEN call. The RPC connection used for the test was
>>>>>> configured with 15 secs timeout. Note that OPEN18 only fails when
>>>>>> the tests were run with 'all' option, this test passes if it's run
>>>>>> by itself.
>>>>>>
>>>>>> With the courteous server, by the time OPEN18 runs, there are about 1026
>>>>>> courtesy 4.0 clients on the server and all of these clients have opened
>>>>>> the same file X with WRITE access. These clients were created by the
>>>>>> previous tests. After each test completed, since 4.0 does not have
>>>>>> sessions, the client states are not cleaned up immediately on the
>>>>>> server and are allowed to become courtesy clients.
>>>>>>
>>>>>> When OPEN18 runs (about 20 minutes after the 1st test started), it
>>>>>> sends OPEN of file X with OPEN4_SHARE_DENY_WRITE which causes the
>>>>>> server to check for conflicts with courtesy clients. The loop that
>>>>>> checks 1026 courtesy clients for share/access conflict took less
>>>>>> than 1 sec. But it took about 55 secs, on my VM, for the server
>>>>>> to expire all 1026 courtesy clients.
>>>>>>
>>>>>> I modified pynfs to configure the 4.0 RPC connection with a 60-second
>>>>>> timeout and OPEN18 now consistently passes. The 4.0 test results are
>>>>>> now the same for courteous and non-courteous server:
>>>>>>
>>>>>> 8 Skipped, 1 Failed, 0 Warned, 577 Passed
>>>>>>
>>>>>> Note that 4.1 tests do not suffer this timeout problem because the
>>>>>> 4.1 clients and sessions are destroyed after each test completes.
>>>>> Do you want me to send the patch to increase the timeout for pynfs?
>>>>> Or are there any other things you think we should do?
>>>> I don't know.
>>>>
>>>> 55 seconds to clean up 1026 clients is about 50ms per client, which is
>>>> pretty slow.  I wonder why.  I guess it's probably updating the stable
>>>> storage information.  Is /var/lib/nfs/ on your server backed by a hard
>>>> drive or an SSD or something else?
>>> My server is a virtualbox VM that has 1 CPU, 4GB RAM and 64GB of hard
>>> disk. I think a production system that supports this many clients should
>>> have faster CPUs, faster storage.
>>>
>>>> I wonder if that's an argument for limiting the number of courtesy
>>>> clients.
>>> I think we might want to treat 4.0 clients a bit different from 4.1
>>> clients. With 4.0, every client will become a courtesy client after
>>> the client is done with the export and unmounts it.
>> It should be safe for a server to purge a client's lease immediately
>> if there is no open or lock state associated with it.
>
> In this case, each client has opened files so there are open states
> associated with them.
>
>>
>> When an NFSv4.0 client unmounts, all files should be closed at that
>> point,
>
> I'm not sure pynfs does proper cleanup after each subtest, I will
> check. There must be state associated with the client in order for
> it to become a courtesy client.

The pynfs 4.0 tests use LOOKUP and OPEN with OPEN4_CREATE to create the
test file, and use PUTFH and REMOVE to remove the test file when
done. I don't see where the open state associated with the removed
file is freed by nfsd_remove. I guess for 4.0, the open state
remains valid on the server until the client lease expires.
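
To make the sequence concrete, here is a toy Python model of what I
think is happening. It is only a sketch, not nfsd code: ToyServer,
laundromat(), open_deny_write() and the 90-second lease are names and
values made up for illustration, and it ignores conflicts with active
(non-courtesy) clients.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    name: str
    open_files: set = field(default_factory=set)  # paths with live open state
    last_renew: float = 0.0
    courtesy: bool = False


class ToyServer:
    """Hypothetical model of a courteous server; not the nfsd implementation."""

    def __init__(self, lease_secs=90.0):  # lease length is an assumption
        self.lease_secs = lease_secs
        self.clients = {}

    def open(self, name, path, now):
        c = self.clients.setdefault(name, Client(name))
        c.open_files.add(path)
        c.last_renew = now
        c.courtesy = False

    def close(self, name, path):
        self.clients[name].open_files.discard(path)

    def remove(self, path):
        # Models the observation above: REMOVE unlinks the file but
        # leaves the client's open state in place.
        pass

    def laundromat(self, now):
        # Expired lease + remaining state -> courtesy client.
        # Expired lease + no state -> purged right away.
        for name, c in list(self.clients.items()):
            if now - c.last_renew <= self.lease_secs:
                continue
            if c.open_files:
                c.courtesy = True
            else:
                del self.clients[name]

    def open_deny_write(self, name, path, now):
        # A DENY_WRITE open expires every courtesy client that still
        # holds open state on the same file; returns how many were expired.
        conflicting = [n for n, c in self.clients.items()
                       if n != name and c.courtesy and path in c.open_files]
        for n in conflicting:
            del self.clients[n]
        self.open(name, path, now)
        return len(conflicting)


def simulate(n_clients=1026):
    srv = ToyServer()
    for i in range(n_clients):
        srv.open(f"c{i}", "X", now=0.0)  # OPEN with OPEN4_CREATE
        srv.remove("X")                  # REMOVE, but no CLOSE
    srv.laundromat(now=100.0)            # all leases have expired
    courtesy = sum(c.courtesy for c in srv.clients.values())
    expired = srv.open_deny_write("open18", "X", now=100.0)
    return courtesy, expired, len(srv.clients)
```

simulate() returns (1026, 1026, 1): every test client lingers as a
courtesy client until the DENY_WRITE open forces all of them to be
expired at once. A client that calls close() before its lease runs out
is purged by laundromat() instead.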

I attached the pcap of OPEN18 test for reference.

-Dai

>
>> so the server can wait for the lease to expire and purge it
>> normally. Or am I missing something?
>
> When a 4.0 client's lease expires and there are still states associated
> with the client, the server allows this client to become a courtesy
> client.
>
> -Dai
>
>>
>>
>>> Since there is
>>> no destroy session/client with 4.0, the courteous server allows the
>>> client to be around and become a courtesy client. So after a while,
>>> even with normal usage, there will be lots of 4.0 courtesy clients
>>> hanging around and these clients won't be destroyed until 24hrs
>>> later, or until they cause conflicts with other clients.
>>>
>>> We can reduce the courtesy_client_expiry time for 4.0 clients from
>>> 24hrs to 15/20 mins (enough for most network partitions to heal?),
>>> or limit the number of 4.0 courtesy clients. Or don't support 4.0
>>> clients at all, which is my preference since I think in general users
>>> should skip 4.0 and use 4.1 instead.
>>>
>>> -Dai
>> -- 
>> Chuck Lever
>>
>>
>>

[-- Attachment #2: pynfs_open18_40.pcap --]
[-- Type: application/octet-stream, Size: 16244 bytes --]
