public inbox for linux-nfs@vger.kernel.org
From: dai.ngo@oracle.com
To: Chuck Lever III <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>
Cc: Neil Brown <neilb@suse.de>, Olga Kornievskaia <kolga@netapp.com>,
	Tom Talpey <tom@talpey.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] nfsd: don't hand out write delegations on O_WRONLY opens
Date: Wed, 2 Aug 2023 14:22:31 -0700
Message-ID: <6801380f-75cb-49b2-4e89-49821193fe32@oracle.com>
In-Reply-To: <26761CA2-923C-43FC-BDC6-14012115EAA0@oracle.com>


On 8/2/23 1:57 PM, Chuck Lever III wrote:
>
>> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>
>> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote:
>>> On 8/2/23 11:15 AM, Jeff Layton wrote:
>>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote:
>>>>> On 8/1/23 6:33 AM, Jeff Layton wrote:
>>>>>> I noticed that xfstests generic/001 was failing against linux-next nfsd.
>>>>>>
>>>>>> The client would request an OPEN4_SHARE_ACCESS_WRITE open, and the server
>>>>>> would hand out a write delegation. The client would then try to use that
>>>>>> write delegation as the source stateid in a COPY
>>>>> not sure why the client opens the source file of a COPY operation with
>>>>> OPEN4_SHARE_ACCESS_WRITE?
>>>>>
>>>> It doesn't. The original open is to write the data for the file being
>>>> copied. It then opens the file again for READ, but since it has a write
>>>> delegation, it doesn't need to talk to the server at all -- it can just
>>>> use that stateid for later operations.
>>>>
>>>>>>    or CLONE operation, and
>>>>>> the server would respond with NFS4ERR_STALE.
>>>>> If the server does not allow the client to use a write delegation for
>>>>> the READ, shouldn't the correct error return be NFS4ERR_OPENMODE?
>>>>>
>>>> The server must allow the client to use a write delegation for read
>>>> operations. It's required by the spec, AFAIU.
>>>>
>>>> The error in this case was just bogus. The vfs copy routine would return
>>>> -EBADF since the file didn't have FMODE_READ, and the nfs server would
>>>> translate that into NFS4ERR_STALE.
>>>>
>>>> Probably there is a better v4 error code that we could translate EBADF
>>>> to, but with this patch it shouldn't be a problem any longer.
>>>>
>>>>>> The problem is that the struct file associated with the delegation does
>>>>>> not necessarily have read permissions. It's handing out a write
>>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 states:
>>>>>>
>>>>>>    "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its
>>>>>>     own, all opens."
>>>>>>
>>>>>> Given that the client didn't request any read permissions, and that nfsd
>>>>>> didn't check for any, it seems wrong to give out a write delegation.
>>>>>>
>>>>>> Only hand out a write delegation if we have a O_RDWR descriptor
>>>>>> available. If it fails to find an appropriate write descriptor, go
>>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was
>>>>>> requested.
>>>>>>
>>>>>> This fixes xfstest generic/001.
>>>>>>
>>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412
>>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> - Rework the logic when finding struct file for the delegation. The
>>>>>>     earlier patch might still have attached a O_WRONLY file to the deleg
>>>>>>     in some cases, and could still have handed out a write delegation on
>>>>>>     an O_WRONLY OPEN request in some cases.
>>>>>> ---
>>>>>>    fs/nfsd/nfs4state.c | 29 ++++++++++++++++++-----------
>>>>>>    1 file changed, 18 insertions(+), 11 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>>>> index ef7118ebee00..e79d82fd05e7 100644
>>>>>> --- a/fs/nfsd/nfs4state.c
>>>>>> +++ b/fs/nfsd/nfs4state.c
>>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>>>>>>     struct nfs4_file *fp = stp->st_stid.sc_file;
>>>>>>     struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate;
>>>>>>     struct nfs4_delegation *dp;
>>>>>> - struct nfsd_file *nf;
>>>>>> + struct nfsd_file *nf = NULL;
>>>>>>     struct file_lock *fl;
>>>>>>     u32 dl_type;
>>>>>>
>>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>>>>>>     if (fp->fi_had_conflict)
>>>>>>     return ERR_PTR(-EAGAIN);
>>>>>>
>>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
>>>>>> - nf = find_writeable_file(fp);
>>>>>> + /*
>>>>>> + * Try for a write delegation first. We need an O_RDWR file
>>>>>> + * since a write delegation allows the client to perform any open
>>>>>> + * from its cache.
>>>>>> + */
>>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) {
>>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]);
>>>>>>     dl_type = NFS4_OPEN_DELEGATE_WRITE;
>>>>>> - } else {
>>>>> Does this mean an OPEN4_SHARE_ACCESS_WRITE open does not get a write delegation?
>>>>> It does not seem right.
>>>>>
>>>>> -Dai
>>>>>
>>>> Why? Per RFC 8881:
>>>>
>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its
>>>> own, all opens."
>>>>
>>>> All opens. That includes read opens.
>>>>
>>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the
>>>> user has no read permissions. Therefore, we can't grant a write
>>>> delegation since we can't guarantee that the user is allowed to do that.
>>> If the server grants the write delegation on an OPEN with
>>> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then
>>> why can't the server check and deny the subsequent READ?
>>>
>>> Per RFC 8881, section 9.1.2:
>>>
>>>      For delegation stateids, the access mode is based on the type of
>>>      delegation.
>>>
>>>      When a READ, WRITE, or SETATTR (that specifies the size attribute)
>>>      operation is done, the operation is subject to checking against the
>>>      access mode to verify that the operation is appropriate given the
>>>      stateid with which the operation is associated.
>>>
>>>      In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that
>>>      set size), the server MUST verify that the access mode allows writing
>>>      and MUST return an NFS4ERR_OPENMODE error if it does not. In the case
>>>      of READ, the server may perform the corresponding check on the access
>>>      mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE,
>>>      to accommodate clients whose WRITE implementation may unavoidably do
>>>      reads (e.g., due to buffer cache constraints). However, even if READs
>>>      are allowed in these circumstances, the server MUST still check for
>>>      locks that conflict with the READ (e.g., another OPEN specified
>>>      OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server
>>>      that does enforce the access mode check on READs need not explicitly
>>>      check for conflicting share reservations since the existence of OPEN
>>>      for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share
>>>      reservation can exist.
>>>
>>> FWIW, the Solaris server grants a write delegation on an OPEN with
>>> OPEN4_SHARE_ACCESS_WRITE on a file whose access mode is either RW or
>>> WR-only. Maybe this is a bug, or the spec is not clear?
>>>
>> I don't think that's necessarily a bug.
>>
>> It's not that the spec demands that we only hand out delegations on BOTH
>> opens.  This is more of a quirk of the Linux implementation. Linux'
>> write delegations require an open O_RDWR file descriptor because we may
>> be called upon to do a read on its behalf.
>>
>> Technically, we could probably just have it check for
>> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set,
>> then you're unlikely to get a delegation. Either the O_RDWR descriptor
>> will be NULL, or there are other, conflicting opens already present.
>>
>> Solaris may have a completely different design that doesn't require
>> this. I haven't looked at its code to know.
> I'm comfortable for now with not handing out write delegations for
> SHARE_ACCESS_WRITE opens. I prefer that to permission checking on
> every READ operation.

I'm fine with handing out write delegations for SHARE_ACCESS_BOTH
opens only.
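
As a rough illustration of the behaviour discussed above (a write
delegation only for a SHARE_ACCESS_BOTH open, otherwise a fallback to a
read delegation), here is a minimal userspace sketch. It is not nfsd
code: pick_delegation(), enum deleg_choice and the two boolean
parameters are hypothetical names used only for illustration, and the
real nfs4_set_delegation() does further checks (conflicts, existing
delegations) that are left out here.

```c
#include <stdbool.h>

/* Share-access bits as defined by NFSv4 (RFC 8881). */
#define SHARE_ACCESS_READ   0x1u
#define SHARE_ACCESS_WRITE  0x2u
#define SHARE_ACCESS_BOTH   0x3u

/* Hypothetical result type; these are not the nfsd dl_type values. */
enum deleg_choice { DELEG_NONE, DELEG_READ, DELEG_WRITE };

/*
 * Hypothetical model of the v2 selection logic. have_rdwr_file stands
 * in for fp->fi_fds[O_RDWR] being non-NULL, and have_readable_file for
 * find_readable_file() returning a file.
 */
static enum deleg_choice
pick_delegation(unsigned int share_access, bool have_rdwr_file,
		bool have_readable_file)
{
	/* Write delegation only for a BOTH open backed by an O_RDWR file. */
	if ((share_access & SHARE_ACCESS_BOTH) == SHARE_ACCESS_BOTH &&
	    have_rdwr_file)
		return DELEG_WRITE;

	/* Otherwise try a read delegation if READ access was requested. */
	if ((share_access & SHARE_ACCESS_READ) && have_readable_file)
		return DELEG_READ;

	/* Corresponds to returning ERR_PTR(-EAGAIN) in the patch. */
	return DELEG_NONE;
}
```

With this model, pick_delegation(SHARE_ACCESS_WRITE, true, true)
yields DELEG_NONE, which is the case the thread is about: a write-only
open no longer gets a write delegation.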

My one concern is about not checking access at the time of the READ
operation. If the file was opened with SHARE_ACCESS_WRITE (no write
delegation granted) and the file's access mode was then changed to
read-only, is it correct behavior for the server to allow the READ to
go through?
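
For reference, the stateid access-mode rule from RFC 8881 section 9.1.2
quoted earlier can be sketched as below. This is only a userspace
model under stated assumptions: check_openmode(), enum io_op and the
strict_read flag are hypothetical names, and the sketch covers only
the open-mode check on the stateid, not any file permission check.

```c
#include <stdbool.h>

/* Share-access bits and status codes as defined by NFSv4. */
#define SHARE_ACCESS_READ   0x1u
#define SHARE_ACCESS_WRITE  0x2u
#define NFS4_OK             0
#define NFS4ERR_OPENMODE    10038

/* Hypothetical operation selector for this sketch. */
enum io_op { OP_READ, OP_WRITE };

/*
 * Hypothetical helper: access_mode is the mode associated with the
 * stateid; strict_read selects whether the server enforces the
 * optional access-mode check on READ.
 */
static int
check_openmode(enum io_op op, unsigned int access_mode, bool strict_read)
{
	/* WRITE-type operations MUST be refused without write access. */
	if (op == OP_WRITE)
		return (access_mode & SHARE_ACCESS_WRITE) ?
			NFS4_OK : NFS4ERR_OPENMODE;

	/* READ with read access is always acceptable at this level. */
	if (access_mode & SHARE_ACCESS_READ)
		return NFS4_OK;

	/* READ on a write-only open: the server may check, or may allow it. */
	return strict_read ? NFS4ERR_OPENMODE : NFS4_OK;
}
```

In this model a lenient server (strict_read false) lets a READ through
on a write-only stateid, and a strict one returns NFS4ERR_OPENMODE.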

-Dai

>
> If we find that it's a significant performance issue, we can revisit.
>
>
>>> It would be interesting to know how the ONTAP server behaves in
>>> this scenario.
>>>
>> Indeed. Most likely it behaves more like Solaris does, but it'd be nice
>> to know.
>>
>>>>
>>>>>> + }
>>>>>> +
>>>>>> + /*
>>>>>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR
>>>>>> + * file for some reason, then try for a read deleg instead.
>>>>>> + */
>>>>>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) {
>>>>>>     nf = find_readable_file(fp);
>>>>>>     dl_type = NFS4_OPEN_DELEGATE_READ;
>>>>>>     }
>>>>>> - if (!nf) {
>>>>>> - /*
>>>>>> - * We probably could attempt another open and get a read
>>>>>> - * delegation, but for now, don't bother until the
>>>>>> - * client actually sends us one.
>>>>>> - */
>>>>>> +
>>>>>> + if (!nf)
>>>>>>     return ERR_PTR(-EAGAIN);
>>>>>> - }
>>>>>> +
>>>>>>     spin_lock(&state_lock);
>>>>>>     spin_lock(&fp->fi_lock);
>>>>>>     if (nfs4_delegation_exists(clp, fp))
>>>>>>
>>>>>> ---
>>>>>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17
>>>>>> change-id: 20230731-wdeleg-bbdb6b25a3c6
>>>>>>
>>>>>> Best regards,
>> -- 
>> Jeff Layton <jlayton@kernel.org>
> --
> Chuck Lever
>
>
