Re: how to properly handle failures during delegation recall process

linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Olga Kornievskaia <aglo@umich.edu>
To: Jeff Layton <jeff.layton@primarydata.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	Thomas Haynes <thomas.haynes@primarydata.com>
Subject: Re: how to properly handle failures during delegation recall process
Date: Mon, 25 Apr 2016 16:03:45 -0400	[thread overview]
Message-ID: <CAN-5tyHttF-aPi6SWcFbDtbfD_RGwe87TYCiLsezqU2BsFFJjw@mail.gmail.com> (raw)
In-Reply-To: <20141105065719.23e912b1@tlielax.poochiereds.net>

Resurrecting an old thread... To handle ADMIN_REVOKED for DELEGRETURN
when we are processing a CB_RECALL:

On Wed, Nov 5, 2014 at 6:57 AM, Jeff Layton <jeff.layton@primarydata.com> wrote:
> On Mon, 13 Oct 2014 18:24:21 -0400
> Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>
>> On Mon, Oct 13, 2014 at 6:13 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> > On Mon, Oct 13, 2014 at 5:29 PM, Trond Myklebust
>> > <trond.myklebust@primarydata.com> wrote:
>> >> On Mon, Oct 13, 2014 at 5:12 PM, Trond Myklebust
>> >> <trond.myklebust@primarydata.com> wrote:
>> >>> On Mon, Oct 13, 2014 at 4:23 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> >>>> On Mon, Oct 13, 2014 at 3:53 PM, Trond Myklebust
>> >>>> <trond.myklebust@primarydata.com> wrote:
>> >>>>> On Mon, Oct 13, 2014 at 2:51 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> >>>>>> +               }
>> >>>>>> +       }
>> >>>>>>         return nfs4_handle_delegation_recall_error(server, state, stateid, err);
>> >>>>>>  }
>> >>>>>>
>> >>>>>> @@ -5957,6 +5973,10 @@ int nfs4_lock_delegation_recall(struct
>> >>>>>> file_lock *fl, struct nfs4_state *state,
>> >>>>>>         err = nfs4_set_lock_state(state, fl);
>> >>>>>>         if (err != 0)
>> >>>>>>                 return err;
>> >>>>>> +       if (test_bit(NFS_LOCK_LOST, &fl->fl_u.nfs4_fl.owner->ls_flags)) {
>> >>>>>> +               pr_warn_ratelimited("NFS: %s: Lock reclaim failed!\n",
>> >>>>>> __func__);
>> >>>>>> +               return -EIO;
>> >>>>>> +       }
>> >>>>>>         err = _nfs4_do_setlk(state, F_SETLK, fl, NFS_LOCK_NEW);
>> >>>>>
>> >>>>> Note that there is a potential race here if the server is playing with
>> >>>>> NFS4ERR_DELEG_REVOKED or NFS4ERR_ADMIN_REVOKED. Since we do not
>> >>>>> present the delegation as part of the LOCK request, we have no way of
>> >>>>> knowing if the delegation is still in effect. I guess we can check the
>> >>>>> return value of DELEGRETURN, but if it returns NFS4ERR_DELEG_REVOKED
>> >>>>> or NFS4ERR_ADMIN_REVOKED, then we still do not know whether or not the
>> >>>>> LOCK is safe.
>> >>>>
>> >>>> I'm not following you. We send LOCK before we send DELEGRETURN? Also,
>> >>>> we normally send in LOCK the open_stateid returned by the open with
>> >>>> cur so do we know that delegation is still in effect.
>> >>>
>> >>> How so? The open stateid doesn't tell you that the delegation is still
>> >>> in effect. If the DELEGRETURN returns NFS4ERR_DELEG_REVOKED, how can
>> >>> you tell if the delegation was revoked before or after the LOCK
>> >>> request was handled?
>> >>
>> >> Actually, let me answer that myself. You can sort of figure things out
>> >> in NFSv4.1 if you look at the SEQ4_STATUS_RECALLABLE_STATE_REVOKED
>> >> flag. If it is set, you should probably distrust the lock stateid that
>> >> you just attempted to recover, since you now know that at least one
>> >> delegation was just revoked.
>> >>
>> >> In that case, we probably also have to do a TEST_STATEID+FREE_STATEID
>> >> on the delegation stateid.
>> >
>> > I think we are mis-communicating here by talking about different
>> > nuances. I agree with you that when an operation is sent there is no
>> > way of knowing if in the mean while the server has decided to revoke
>> > the delegation. However, this is not what I'm confused about regarding
>> > your comment. I'm noticing that in the flow of operations, we send (1)
>> > open with cur, then (2) lock, then (3) delegreturn. So I was confused
>> > about how can we check return of delegreturn, step 3, if we are in
>> > step 2.
>> >
>> > I think the LOCK is safe if the reply to the LOCK is successful.
>>
>> It needs to be concurrent with the delegation as well, otherwise
>> general lock atomicity is broken.
>>
>> > Let me just step back from this to note that your solution to "lost
>> > locks during delegation" is to recognize the open with cure failure
>> > and skip locking and delegreturn. I can work on a patch for that.
>> >
>> > Do you agree that the state recovery should not be initiated in case
>> > we get those errors?
>>
>> State recovery _should_ be initiated so that we do reclaim opens (I
>> don't care about share lock atomicity on Linux). However
>> nfs_delegation_claim_locks() _should_not_ be called.
>>
>> I'll give some more thought as to how we can ensure the missing lock atomicity.
>>
>
>
> (cc'ing Tom here since we may want to consider providing guidance in
>  the spec for this situation)
>
> Ok, I think both of you are right here :). Here's my interpretation:
>
> Olga is correct that the LOCK operation itself is safe since LOCK
> doesn't actually modify the contents of the file. What it's not safe to
> do is to trust that LOCK unless and until the DELEGRETURN is also
> successful.
>
> First, let's clarify the potential race that Trond pointed out:
>
> Suppose we have a delegation and delegated locks. That delegation is
> recalled and we do something like this:
>
> OPEN with DELEGATE_CUR: NFS4_OK
> LOCK:                   NFS4_OK
> LOCK:                   NFS4_OK
> ...(maybe more successful locks here)...
> DELEGRETURN:            NFS4ERR_ADMIN_REVOKED
>
> ...at that point, we're screwed.
>
> The delegation was obviously revoked after we did the OPEN but before
> the DELEGRETURN. None of those LOCK requests can be trusted since
> another client may have opened the file at any point in there, acquired
> any one of those locks and then released it.
>
> For v4.1+ the client can do what Trond suggests. Check for
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED in each LOCK response. If it's set
> then we can do the TEST_STATEID/FREE_STATEID dance. If the TEST_STATEID
> fails, then we must consider the most recently acquired lock lost.
> LOCKU it and give up trying to reclaim the rest of them.
>
> For v4.0, I'm not sure what the client can do other than wait until the
> DELEGRETURN. If that fails with NFS4ERR_ADMIN_REVOKED, then we'll just
> have to try to unwind the whole mess. Send LOCKUs for all of them and
> consider them all to be lost.

Would it then be an acceptable solution to:
-- make deleg_return synchronous so that we wait for the reply to
handle the error
-- if we get an error in reply, don't ignore the error and propogate
it back to the delegation code.
-- then mark all the locks lost for the inode.

I don't have anything to send LOCKUs as wouldn't the application/vfs
take care of that when we fail the IO.

Also another problem is that ADMIN_REVOKED gets mapped into EIO by
nfs4_map_errors() before it gets returned into the delegation code so
the patch will mark locks lost for anything else that was translated
into the EIO.

Here's a patch that I've tried that seems to work:

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 5166adc..e559407 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -430,6 +430,25 @@ static int nfs_end_delegation_return(struct inode
*inode, struct nfs_delegation
         goto out;

     err = nfs_do_return_delegation(inode, delegation, issync);
+    if (err == -EIO) {
+        struct file_lock *fl;
+        struct file_lock_context *flctx = inode->i_flctx;
+        struct list_head *list;
+
+        if (flctx == NULL)
+            goto out;
+
+        /* mark all locks lost */
+        list = &flctx->flc_posix;
+        down_write(&nfsi->rwsem);
+        spin_lock(&flctx->flc_lock);
+        list_for_each_entry(fl, list, fl_list) {
+            set_bit(NFS_LOCK_LOST,
+                &fl->fl_u.nfs4_fl.owner->ls_flags);
+        }
+        spin_unlock(&flctx->flc_lock);
+        up_write(&nfsi->rwsem);
+    }
 out:
     return err;
 }
@@ -490,7 +509,7 @@ restart:
             delegation = nfs_start_delegation_return_locked(NFS_I(inode));
             rcu_read_unlock();

-            err = nfs_end_delegation_return(inode, delegation, 0);
+            err = nfs_end_delegation_return(inode, delegation, 1);
             iput(inode);
             nfs_sb_deactive(server->super);
             if (!err)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 327b8c3..297a466 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5309,13 +5309,13 @@ static void nfs4_delegreturn_done(struct
rpc_task *task, void *calldata)
     switch (task->tk_status) {
     case 0:
         renew_lease(data->res.server, data->timestamp);
-    case -NFS4ERR_ADMIN_REVOKED:
-    case -NFS4ERR_DELEG_REVOKED:
     case -NFS4ERR_BAD_STATEID:
     case -NFS4ERR_OLD_STATEID:
     case -NFS4ERR_STALE_STATEID:
     case -NFS4ERR_EXPIRED:
         task->tk_status = 0;
+    case -NFS4ERR_ADMIN_REVOKED:
+    case -NFS4ERR_DELEG_REVOKED:
         if (data->roc)
             pnfs_roc_set_barrier(data->inode, data->roc_barrier);
         break;


>
> Actually, it may be reasonable to just do the same thing for v4.1. The
> client tracks NFS_LOCK_LOST on a per-lockstateid basis, so once you have
> any unreclaimable lock, any I/O done with that stateid is going to fail
> anyway. You might as well just release any locks you do hold at that
> point.
>
> The other question is whether the server ought to have any role to play
> here. In principle it could track whether an open/lock stateid is
> descended from a still outstanding delegation, and revoke those
> stateids if the delegation is revoked. That would probably not be
> trivial to do with the current Linux server implementation, however.
>
> --
> Jeff Layton <jlayton@primarydata.com>

     prev parent reply	other threads:[~2016-04-25 20:03 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-13 18:51 how to properly handle failures during delegation recall process Olga Kornievskaia
2014-10-13 19:53 ` Trond Myklebust
2014-10-13 20:23   ` Olga Kornievskaia
2014-10-13 21:12     ` Trond Myklebust
2014-10-13 21:29       ` Trond Myklebust
2014-10-13 22:13         ` Olga Kornievskaia
2014-10-13 22:24           ` Trond Myklebust
2014-10-14 15:48             ` Olga Kornievskaia
2014-10-16 18:43               ` Olga Kornievskaia
2014-10-21 18:22                 ` Olga Kornievskaia
2014-11-05 11:57             ` Jeff Layton
2014-11-05 12:41               ` Trond Myklebust
2014-11-05 18:31                 ` J. Bruce Fields
2014-11-05 19:42                   ` Jeff Layton
2014-11-05 19:54                     ` J. Bruce Fields
2014-11-05 20:06                       ` Jeff Layton
2014-11-05 20:13                         ` J. Bruce Fields
2014-11-05 20:06                     ` Trond Myklebust
2016-04-25 20:03               ` Olga Kornievskaia [this message]

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:5166adc dfblob:e559407 dfblob:327b8c3 dfblob:297a466 )
 OR (
bs:"Re: how to properly handle failures during delegation recall process" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAN-5tyHttF-aPi6SWcFbDtbfD_RGwe87TYCiLsezqU2BsFFJjw@mail.gmail.com \
    --to=aglo@umich.edu \
    --cc=jeff.layton@primarydata.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=thomas.haynes@primarydata.com \
    --cc=trond.myklebust@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).