From: dai.ngo@oracle.com
To: "bfields@fieldses.org" <bfields@fieldses.org>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
"chuck.lever@oracle.com" <chuck.lever@oracle.com>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH RFC v6 2/2] nfsd: Initial implementation of NFSv4 Courteous Server
Date: Wed, 8 Dec 2021 09:29:31 -0800 [thread overview]
Message-ID: <cdf6317b-aa42-539a-bc7f-3150e83cbc60@oracle.com> (raw)
In-Reply-To: <20211208163937.GA29555@fieldses.org>
On 12/8/21 8:39 AM, bfields@fieldses.org wrote:
> On Wed, Dec 08, 2021 at 08:25:28AM -0800, dai.ngo@oracle.com wrote:
>> On 12/8/21 8:16 AM, Trond Myklebust wrote:
>>> On Wed, 2021-12-08 at 07:54 -0800, dai.ngo@oracle.com wrote:
>>>> On 12/6/21 11:55 AM, Chuck Lever III wrote:
>>>>
>>>>>> +
>>>>>> +/*
>>>>>> + * Function to check if the nfserr_share_denied error for 'fp'
>>>>>> resulted
>>>>>> + * from conflict with courtesy clients then release their state to
>>>>>> resolve
>>>>>> + * the conflict.
>>>>>> + *
>>>>>> + * Function returns:
>>>>>> + * 0 - no conflict with courtesy clients
>>>>>> + * >0 - conflict with courtesy clients resolved, try
>>>>>> access/deny check again
>>>>>> + * -1 - conflict with courtesy clients being resolved in
>>>>>> background
>>>>>> + * return nfserr_jukebox to NFS client
>>>>>> + */
>>>>>> +static int
>>>>>> +nfs4_destroy_clnts_with_sresv_conflict(struct svc_rqst *rqstp,
>>>>>> + struct nfs4_file *fp, struct
>>>>>> nfs4_ol_stateid *stp,
>>>>>> + u32 access, bool share_access)
>>>>>> +{
>>>>>> + int cnt = 0;
>>>>>> + int async_cnt = 0;
>>>>>> + bool no_retry = false;
>>>>>> + struct nfs4_client *cl;
>>>>>> + struct list_head *pos, *next, reaplist;
>>>>>> + struct nfsd_net *nn = net_generic(SVC_NET(rqstp),
>>>>>> nfsd_net_id);
>>>>>> +
>>>>>> + INIT_LIST_HEAD(&reaplist);
>>>>>> + spin_lock(&nn->client_lock);
>>>>>> + list_for_each_safe(pos, next, &nn->client_lru) {
>>>>>> + cl = list_entry(pos, struct nfs4_client, cl_lru);
>>>>>> + /*
>>>>>> + * check all nfs4_ol_stateid of this client
>>>>>> + * for conflicts with 'access'mode.
>>>>>> + */
>>>>>> + if (nfs4_check_deny_bmap(cl, fp, stp, access,
>>>>>> share_access)) {
>>>>>> + if (!test_bit(NFSD4_COURTESY_CLIENT, &cl-
>>>>>>> cl_flags)) {
>>>>>> + /* conflict with non-courtesy
>>>>>> client */
>>>>>> + no_retry = true;
>>>>>> + cnt = 0;
>>>>>> + goto out;
>>>>>> + }
>>>>>> + /*
>>>>>> + * if too many to resolve synchronously
>>>>>> + * then do the rest in background
>>>>>> + */
>>>>>> + if (cnt > 100) {
>>>>>> + set_bit(NFSD4_DESTROY_COURTESY_CLIE
>>>>>> NT, &cl->cl_flags);
>>>>>> + async_cnt++;
>>>>>> + continue;
>>>>>> + }
>>>>>> + if (mark_client_expired_locked(cl))
>>>>>> + continue;
>>>>>> + cnt++;
>>>>>> + list_add(&cl->cl_lru, &reaplist);
>>>>>> + }
>>>>>> + }
>>>>> Bruce suggested simply returning NFS4ERR_DELAY for all cases.
>>>>> That would simplify this quite a bit for what is a rare edge
>>>>> case.
>>>> If we always do this asynchronously by returning NFS4ERR_DELAY
>>>> for all cases then the following pynfs tests need to be modified
>>>> to handle the error:
>>>>
>>>> RENEW3 st_renew.testExpired :
>>>> FAILURE
>>>> LKU10 st_locku.testTimedoutUnlock :
>>>> FAILURE
>>>> CLOSE9 st_close.testTimedoutClose2 :
>>>> FAILURE
>>>>
>>>> and any new tests that opens file have to be prepared to handle
>>>> NFS4ERR_DELAY due to the lack of destroy_clientid in 4.0.
>>>>
>>>> Do we still want to take this approach?
>>> NFS4ERR_DELAY is a valid error for both CLOSE and LOCKU (see RFC7530
>>> section 13.2 https://urldefense.com/v3/__https://datatracker.ietf.org/doc/html/rfc7530*section-13.2__;Iw!!ACWV5N9M2RV99hQ!f8vZHAJophxXdSSJvnxDCSBSRpWFxEOZBo2ZLvjPzXLVrvMYR8RKcc0_Jvjhng$
>>> ) so if pynfs complains, then it needs fixing regardless.
>>>
>>> RENEW, on the other hand, cannot return NFS4ERR_DELAY, but why would it
>>> need to? Either the lease is still valid, or else someone is already
>>> trying to tear it down due to an expiration event. I don't see why
>>> courtesy locks need to add any further complexity to that test.
>> RENEW fails in the 2nd open:
>>
>> c.create_confirm(t.word(), access=OPEN4_SHARE_ACCESS_BOTH,
>> deny=OPEN4_SHARE_DENY_BOTH) <<====== DENY_BOTH
>> sleeptime = c.getLeaseTime() * 2
>> env.sleep(sleeptime)
>> c2 = env.c2
>> c2.init_connection()
>> c2.open_confirm(t.word(), access=OPEN4_SHARE_ACCESS_READ, <<=== needs to handle NFS4ERR_DELAY
>> deny=OPEN4_SHARE_DENY_NONE)
>>
>> CLOSE and LOCKU also fail in the OPEN, similar to the RENEW test.
>> Any new pynfs 4.0 test that does open might get NFS4ERR_DELAY.
> So it's a RENEW test, not the RENEW operation.
>
> A general-purpose client always has to be prepared for DELAY on OPEN.
> But pynfs isn't a general-purpose client, and it assumes that it's the
> only one touching the files and directories it creates.
>
> Within pynfs we've got a problem that the tests don't necessarily clean
> up after themselves completely, so in theory a test could interfere with
> later results.
>
> But each test uses its own files--e.g. in the fragment above note that
> the file it's testing gets the name t.word(), which is by design unique
> to that test. So it shouldn't be hitting any conflicts with state held
> by previous tests. Am I missing something?
Both calls, c.create_confirm and c2.open_confirm, use the same file name
t.word().
However, it's strange that if I run RENEW3 by itself then it passes.
From the network trace, env.c2/c2.init_connection generates a SETCLIENTID
which might get rid off all the states of the previous client since
they're the same.
I will investigate why it fails when running 'all' with NFS4ERR_DELAY.
Do you know if there is an option to specify a list of tests to run,
instead of 'all'?
-Dai
next prev parent reply other threads:[~2021-12-08 17:29 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-06 17:59 [PATCH RFC v6 0/2] nfsd: Initial implementation of NFSv4 Courteous Server Dai Ngo
2021-12-06 17:59 ` [PATCH RFC v6 1/2] fs/lock: add new callback, lm_expire_lock, to lock_manager_operations Dai Ngo
2021-12-06 18:39 ` Chuck Lever III
2021-12-06 19:52 ` Trond Myklebust
2021-12-06 20:05 ` bfields
2021-12-06 20:36 ` dai.ngo
2021-12-06 22:05 ` Trond Myklebust
2021-12-06 23:07 ` dai.ngo
2021-12-06 17:59 ` [PATCH RFC v6 2/2] nfsd: Initial implementation of NFSv4 Courteous Server Dai Ngo
2021-12-06 19:55 ` Chuck Lever III
2021-12-06 21:44 ` dai.ngo
2021-12-06 22:30 ` Chuck Lever III
2021-12-06 22:52 ` Bruce Fields
2021-12-07 22:00 ` Chuck Lever III
2021-12-07 22:35 ` Bruce Fields
2021-12-08 15:17 ` Chuck Lever III
2021-12-08 15:54 ` dai.ngo
2021-12-08 15:58 ` Chuck Lever III
2021-12-08 16:16 ` Trond Myklebust
2021-12-08 16:25 ` dai.ngo
2021-12-08 16:39 ` bfields
2021-12-08 17:29 ` dai.ngo [this message]
2021-12-08 17:45 ` bfields
2021-12-10 17:51 ` dai.ngo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cdf6317b-aa42-539a-bc7f-3150e83cbc60@oracle.com \
--to=dai.ngo@oracle.com \
--cc=bfields@fieldses.org \
--cc=chuck.lever@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).