public inbox for linux-nfs@vger.kernel.org
From: dai.ngo@oracle.com
To: Chuck Lever III <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>
Cc: Neil Brown <neilb@suse.de>, Olga Kornievskaia <kolga@netapp.com>,
	Tom Talpey <tom@talpey.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] nfsd: don't hand out write delegations on O_WRONLY opens
Date: Wed, 2 Aug 2023 14:32:59 -0700
Message-ID: <5296d1a2-e410-c5bd-a8ca-66b8b42f158e@oracle.com>
In-Reply-To: <6801380f-75cb-49b2-4e89-49821193fe32@oracle.com>


On 8/2/23 2:22 PM, dai.ngo@oracle.com wrote:
>
> On 8/2/23 1:57 PM, Chuck Lever III wrote:
>>
>>> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>>
>>> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote:
>>>> On 8/2/23 11:15 AM, Jeff Layton wrote:
>>>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote:
>>>>>> On 8/1/23 6:33 AM, Jeff Layton wrote:
>>>>>>> I noticed that xfstests generic/001 was failing against linux-next
>>>>>>> nfsd.
>>>>>>>
>>>>>>> The client would request an OPEN4_SHARE_ACCESS_WRITE open, and the
>>>>>>> server would hand out a write delegation. The client would then try
>>>>>>> to use that write delegation as the source stateid in a COPY
>>>>>> not sure why the client opens the source file of a COPY operation
>>>>>> with OPEN4_SHARE_ACCESS_WRITE?
>>>>>>
>>>>> It doesn't. The original open is to write the data for the file being
>>>>> copied. It then opens the file again for READ, but since it has a
>>>>> write delegation, it doesn't need to talk to the server at all -- it
>>>>> can just use that stateid for later operations.
>>>>>
>>>>>>>    or CLONE operation, and
>>>>>>> the server would respond with NFS4ERR_STALE.
>>>>>> If the server does not allow the client to use a write delegation for
>>>>>> the READ, shouldn't the correct error to return be NFS4ERR_OPENMODE?
>>>>>>
>>>>> The server must allow the client to use a write delegation for read
>>>>> operations. It's required by the spec, AFAIU.
>>>>>
>>>>> The error in this case was just bogus. The vfs copy routine would
>>>>> return -EBADF since the file didn't have FMODE_READ, and the nfs
>>>>> server would translate that into NFS4ERR_STALE.
>>>>>
>>>>> Probably there is a better v4 error code that we could translate
>>>>> EBADF to, but with this patch it shouldn't be a problem any longer.
>>>>>
>>>>>>> The problem is that the struct file associated with the delegation
>>>>>>> does not necessarily have read permissions. It's handing out a write
>>>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 states:
>>>>>>>
>>>>>>>    "An OPEN_DELEGATE_WRITE delegation allows the client to handle,
>>>>>>>     on its own, all opens."
>>>>>>>
>>>>>>> Given that the client didn't request any read permissions, and that
>>>>>>> nfsd didn't check for any, it seems wrong to give out a write
>>>>>>> delegation.
>>>>>>>
>>>>>>> Only hand out a write delegation if we have an O_RDWR descriptor
>>>>>>> available. If it fails to find an appropriate write descriptor, go
>>>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was
>>>>>>> requested.
>>>>>>>
>>>>>>> This fixes xfstest generic/001.
>>>>>>>
>>>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412
>>>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>>>>> ---
>>>>>>> Changes in v2:
>>>>>>> - Rework the logic when finding struct file for the delegation. The
>>>>>>>     earlier patch might still have attached an O_WRONLY file to the
>>>>>>>     deleg in some cases, and could still have handed out a write
>>>>>>>     delegation on an O_WRONLY OPEN request in some cases.
>>>>>>> ---
>>>>>>>    fs/nfsd/nfs4state.c | 29 ++++++++++++++++++-----------
>>>>>>>    1 file changed, 18 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>>>>> index ef7118ebee00..e79d82fd05e7 100644
>>>>>>> --- a/fs/nfsd/nfs4state.c
>>>>>>> +++ b/fs/nfsd/nfs4state.c
>>>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>>>>>>>     struct nfs4_file *fp = stp->st_stid.sc_file;
>>>>>>>     struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate;
>>>>>>>     struct nfs4_delegation *dp;
>>>>>>> - struct nfsd_file *nf;
>>>>>>> + struct nfsd_file *nf = NULL;
>>>>>>>     struct file_lock *fl;
>>>>>>>     u32 dl_type;
>>>>>>>
>>>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>>>>>>>     if (fp->fi_had_conflict)
>>>>>>>     return ERR_PTR(-EAGAIN);
>>>>>>>
>>>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
>>>>>>> - nf = find_writeable_file(fp);
>>>>>>> + /*
>>>>>>> + * Try for a write delegation first. We need an O_RDWR file
>>>>>>> + * since a write delegation allows the client to perform any open
>>>>>>> + * from its cache.
>>>>>>> + */
>>>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) {
>>>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]);
>>>>>>>     dl_type = NFS4_OPEN_DELEGATE_WRITE;
>>>>>>> - } else {
>>>>>> Does this mean an OPEN4_SHARE_ACCESS_WRITE open does not get a write
>>>>>> delegation? That does not seem right.
>>>>>>
>>>>>> -Dai
>>>>>>
>>>>> Why? Per RFC 8881:
>>>>>
>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on
>>>>> its own, all opens."
>>>>>
>>>>> All opens. That includes read opens.
>>>>>
>>>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the
>>>>> user has no read permissions. Therefore, we can't grant a write
>>>>> delegation since we can't guarantee that the user is allowed to do that.
>>>> If the server grants the write delegation on an OPEN with
>>>> OPEN4_SHARE_ACCESS_WRITE on a file with WR-only access mode, then
>>>> why can't the server check and deny the subsequent READ?
>>>>
>>>> Per RFC 8881, section 9.1.2:
>>>>
>>>>      For delegation stateids, the access mode is based on the type of
>>>>      delegation.
>>>>
>>>>      When a READ, WRITE, or SETATTR (that specifies the size attribute)
>>>>      operation is done, the operation is subject to checking against the
>>>>      access mode to verify that the operation is appropriate given the
>>>>      stateid with which the operation is associated.
>>>>
>>>>      In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that
>>>>      set size), the server MUST verify that the access mode allows writing
>>>>      and MUST return an NFS4ERR_OPENMODE error if it does not. In the case
>>>>      of READ, the server may perform the corresponding check on the access
>>>>      mode, or it may choose to allow READ on OPENs for
>>>>      OPEN4_SHARE_ACCESS_WRITE, to accommodate clients whose WRITE
>>>>      implementation may unavoidably do reads (e.g., due to buffer cache
>>>>      constraints). However, even if READs are allowed in these
>>>>      circumstances, the server MUST still check for locks that conflict
>>>>      with the READ (e.g., another OPEN specified OPEN4_SHARE_DENY_READ or
>>>>      OPEN4_SHARE_DENY_BOTH). Note that a server that does enforce the
>>>>      access mode check on READs need not explicitly check for conflicting
>>>>      share reservations since the existence of OPEN for
>>>>      OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share
>>>>      reservation can exist.
>>>>
>>>> FWIW, the Solaris server grants a write delegation on an OPEN with
>>>> OPEN4_SHARE_ACCESS_WRITE on a file whose access mode is either RW or
>>>> WR-only. Maybe this is a bug, or the spec is not clear?
>>>>
>>> I don't think that's necessarily a bug.
>>>
>>> It's not that the spec demands that we only hand out delegations on
>>> BOTH opens.  This is more of a quirk of the Linux implementation.
>>> Linux's write delegations require an open O_RDWR file descriptor
>>> because we may be called upon to do a read on its behalf.
>>>
>>> Technically, we could probably just have it check for
>>> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set,
>>> then you're unlikely to get a delegation. Either the O_RDWR descriptor
>>> will be NULL, or there are other, conflicting opens already present.
>>>
>>> Solaris may have a completely different design that doesn't require
>>> this. I haven't looked at its code to know.
>> I'm comfortable for now with not handing out write delegations for
>> SHARE_ACCESS_WRITE opens. I prefer that to permission checking on
>> every READ operation.
>
> I'm fine with just handing out write delegations for SHARE_ACCESS_BOTH
> only.
>
> Just a concern about not checking for access at the time of the READ
> operation.
or not checking file permission at the time of the WRITE.
> If the file was opened with SHARE_ACCESS_WRITE (no write delegation
> granted) and the file access mode was changed to read-only, is it
> correct behavior for the server to allow the READ to go through?
I meant for the WRITE to go through.
>
> -Dai
>
>>
>> If we find that it's a significant performance issue, we can revisit.
>>
>>
>>>> It'd be interesting to know how the ONTAP server behaves in
>>>> this scenario.
>>>>
>>> Indeed. Most likely it behaves more like Solaris does, but it'd be
>>> nice to know.
>>>
>>>>>
>>>>>>> + }
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR
>>>>>>> + * file for some reason, then try for a read deleg instead.
>>>>>>> + */
>>>>>>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) {
>>>>>>>     nf = find_readable_file(fp);
>>>>>>>     dl_type = NFS4_OPEN_DELEGATE_READ;
>>>>>>>     }
>>>>>>> - if (!nf) {
>>>>>>> - /*
>>>>>>> - * We probably could attempt another open and get a read
>>>>>>> - * delegation, but for now, don't bother until the
>>>>>>> - * client actually sends us one.
>>>>>>> - */
>>>>>>> +
>>>>>>> + if (!nf)
>>>>>>>     return ERR_PTR(-EAGAIN);
>>>>>>> - }
>>>>>>> +
>>>>>>>     spin_lock(&state_lock);
>>>>>>>     spin_lock(&fp->fi_lock);
>>>>>>>     if (nfs4_delegation_exists(clp, fp))
>>>>>>>
>>>>>>> ---
>>>>>>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17
>>>>>>> change-id: 20230731-wdeleg-bbdb6b25a3c6
>>>>>>>
>>>>>>> Best regards,
>>> -- 
>>> Jeff Layton <jlayton@kernel.org>
>> -- 
>> Chuck Lever
>>
>>
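
For reference, the selection order in the quoted patch can be sketched in
userspace roughly as follows. This is only an illustration, not the nfsd
code: the SHARE_ACCESS_* macros and the deleg_type enum are local stand-ins
for the kernel's NFS4_SHARE_ACCESS_* and NFS4_OPEN_DELEGATE_* symbols, and
the two boolean parameters are hypothetical placeholders for
"fp->fi_fds[O_RDWR] is non-NULL" and "find_readable_file(fp) succeeded".

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the share-access bits named in the patch. */
#define SHARE_ACCESS_READ  0x1u
#define SHARE_ACCESS_WRITE 0x2u
#define SHARE_ACCESS_BOTH  (SHARE_ACCESS_READ | SHARE_ACCESS_WRITE)

/* Local stand-in for the delegation types; DELEG_NONE models -EAGAIN. */
enum deleg_type { DELEG_NONE, DELEG_READ, DELEG_WRITE };

/*
 * have_rdwr_file:     assumed flag, models an O_RDWR file being available
 * have_readable_file: assumed flag, models a readable file being available
 */
static enum deleg_type pick_delegation(uint32_t share_access,
                                       bool have_rdwr_file,
                                       bool have_readable_file)
{
    /* Try for a write delegation first: needs BOTH bits and an O_RDWR file. */
    if ((share_access & SHARE_ACCESS_BOTH) == SHARE_ACCESS_BOTH &&
        have_rdwr_file)
        return DELEG_WRITE;

    /* Otherwise fall back to a read delegation if READ was requested. */
    if ((share_access & SHARE_ACCESS_READ) && have_readable_file)
        return DELEG_READ;

    /* No usable file: no delegation is handed out. */
    return DELEG_NONE;
}
```

Under these assumptions, pick_delegation(SHARE_ACCESS_WRITE, true, true)
yields DELEG_NONE, which is the behavior change being discussed above: a
WRITE-only open no longer gets a write delegation.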

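The access-mode rule from the RFC 8881 section 9.1.2 text quoted in this
thread can be sketched the same way. Again this is a hedged illustration
only: all names here are hypothetical, CHECK_OPENMODE merely stands for
returning NFS4ERR_OPENMODE, and strict_read is an assumed knob for the
server's choice between checking READs and allowing them on write-only
opens. The conflicting-lock check that the RFC still requires in the
lenient case is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the access-mode bits of a stateid. */
#define ACCESS_READ  0x1u
#define ACCESS_WRITE 0x2u

/* OP_WRITE also covers a SETATTR that sets the size attribute. */
enum op_kind { OP_READ, OP_WRITE };

/* CHECK_OPENMODE stands for replying with NFS4ERR_OPENMODE. */
enum check_result { CHECK_OK, CHECK_OPENMODE };

static enum check_result check_access_mode(uint32_t access_mode,
                                           enum op_kind op,
                                           bool strict_read)
{
    /* WRITE-type operations: the mode MUST allow writing. */
    if (op == OP_WRITE)
        return (access_mode & ACCESS_WRITE) ? CHECK_OK : CHECK_OPENMODE;

    /* READ with read access in the mode is always fine. */
    if (access_mode & ACCESS_READ)
        return CHECK_OK;

    /*
     * READ on a write-only mode: the server may enforce the check, or
     * may allow it (share-deny conflict checking omitted in this sketch).
     */
    return strict_read ? CHECK_OPENMODE : CHECK_OK;
}
```

With strict_read set, a READ against a write-only access mode gets
CHECK_OPENMODE; with it clear, the READ is let through, which is the
leniency the RFC text permits.
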