Re: [PATCH 1/3] pnfs: avoid incorrect use of layout stateid

linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Benny Halevy <bhalevy@panasas.com>
To: Fred Isaman <iisaman@netapp.com>
Cc: linux-nfs@vger.kernel.org, Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: Re: [PATCH 1/3] pnfs: avoid incorrect use of layout stateid
Date: Wed, 02 Feb 2011 06:19:37 +0200	[thread overview]
Message-ID: <4D48DB59.9010102@panasas.com> (raw)
In-Reply-To: <AF70A3A4-2A06-4BD1-AC69-270658CF6365@netapp.com>

On 2011-02-01 18:31, Fred Isaman wrote:
> 
> On Feb 1, 2011, at 10:38 AM, Benny Halevy wrote:
> 
>> On 2011-01-31 17:27, Fred Isaman wrote:
>>> The code could violate the following from RFC5661, section 12.5.3:
>>> "Once a client has no more layouts on a file, the layout stateid is no
>>> longer valid and MUST NOT be used."
>>
>> NACK.
>>
>> Fred, this is your interpretation of the forgetful model and it is taken
>> out of context.
>>
>> Until the spec is changed only the server invalidates the stateid by returning
>> lrs_present=false on LAYOUTRETURN. Merely forgetting the layout state without
>> LAYOUTRETURN does not determine that.
>>
>>
>> Also from 12.5.3:
>>   Once a layout stateid is established, the "other"
>>   field will stay constant unless the stateid is revoked or the client
>>   returns all layouts on the file and the server disposes of the stateid.
>>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
> 
> 
> OK, so the question is, does sending a LAYOUTGET with openstateid "return all layouts".
> Making the answer to this "yes" is perfectly consistent with the spec, given its complete absence
> of direction here.

I disagree, and so are other server implementations, including linux-pnfs!
(It will return BAD_STATEID if the client forgets the layout
in any case but LAYOUTRETURN or CB_LAYOUTRECALL/NOMATCHING_LAYOUT)

>  The alternative is a forgetful model where the client can forget layouts, but
> not layout stateid.

Right, and this is the direction we should go until the errata is in place.

> 
> The question is, is there any objection to the above interpretation of LAYOUTGET with openstateid?
> Because if not, then we will shortly get an appropriate errata,
> and there is no good reason to delay the patch until the errata is finalized.
> I know previously you had objected on the basis that this prevents parallel use of openstateid with LAYOUTGET.
> But given that you had indicated that such parallel use can not be done for other reasons,
> I have been working on the assumption that the interpretation above will be accepted without controversy at the upcoming IETF.
> 
> On the other hand, if there is objection to the above interpretation, that should be made known.

I simulated this with the layout-sim and given the LAYOUTGET sent with the initial stateid
is always fully serialized with other layout stateid-changing operations, this model works.

I object the timing, before the IETF discussion is over and before the upcoming Connectathon
where other server vendors have no chance to implement this erratum.

Benny

> 
> Fred
> 
> 
>> and
>>
>>   Once the client receives a layout stateid, it MUST use the correct
>>   "seqid" for subsequent LAYOUTGET or LAYOUTRETURN operations.  The
>>   correct "seqid" is defined as the highest "seqid" value from
>>   responses of fully processed LAYOUTGET or LAYOUTRETURN operations or
>>   arguments of a fully processed CB_LAYOUTRECALL operation.
>>
>> and 18.43.3
>>
>>   The loga_stateid field specifies a valid stateid.  If a layout is not
>>   currently held by the client, the loga_stateid field represents a
>>   stateid reflecting the correspondingly valid open, byte-range lock,
>>   or delegation stateid.  Once a layout is held on the file by the
>>   client, the loga_stateid field MUST be a stateid as returned from a
>>   previous LAYOUTGET or LAYOUTRETURN operation or provided by a
>>   CB_LAYOUTRECALL operation (see Section 12.5.3).
>>
>> Only when we agree that the client can throw away the layout state and
>> send a singular future LAYOUTGET with a non-layout stateid - something it MUST not
>> do at the moment - only then we can do what this patch suggests.
>>
>> Benny
>>
>>>
>>> This can occur when a layout already has a lseg, starts another
>>> non-everlapping LAYOUTGET, and a CB_LAYOUTRECALL for the existing lseg
>>> is processed before we hit pnfs_layout_process().
>>>
>>> Solve by setting, each time the client has no more lsegs for a file, a
>>> flag which blocks further use of the layout and triggers its removal.
>>>
>>> This also fixes a second bug which occurs in the same instance as
>>> above.  If we actually use pnfs_layout_process, we add the new lseg to
>>> the layout, but the layout has been removed from the nfs_client list
>>> by the intervening CB_LAYOUTRECALL and will not be added back.  Thus
>>> the newly acquired lseg will not be properly returned in the event of
>>> a subsequent CB_LAYOUTRECALL.
>>>
>>> Signed-off-by: Fred Isaman <iisaman@netapp.com>
>>> ---
>>> fs/nfs/pnfs.c |   12 +++++++++---
>>> 1 files changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>> index 1b1bc1a..dcd5d54 100644
>>> --- a/fs/nfs/pnfs.c
>>> +++ b/fs/nfs/pnfs.c
>>> @@ -255,6 +255,9 @@ put_lseg_locked(struct pnfs_layout_segment *lseg,
>>> 			list_del_init(&lseg->pls_layout->plh_layouts);
>>> 			spin_unlock(&clp->cl_lock);
>>> 			clear_bit(NFS_LAYOUT_BULK_RECALL, &lseg->pls_layout->plh_flags);
>>> +			set_bit(NFS_LAYOUT_DESTROYED, &lseg->pls_layout->plh_flags);
>>> +			/* Matched by initial refcount set in alloc_init_layout_hdr */
>>> +			put_layout_hdr_locked(lseg->pls_layout);
>>> 		}
>>> 		rpc_wake_up(&NFS_SERVER(ino)->roc_rpcwaitq);
>>> 		list_add(&lseg->pls_list, tmp_list);
>>> @@ -299,6 +302,11 @@ mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
>>>
>>> 	dprintk("%s:Begin lo %p\n", __func__, lo);
>>>
>>> +	if (list_empty(&lo->plh_segs)) {
>>> +		if (!test_and_set_bit(NFS_LAYOUT_DESTROYED, &lo->plh_flags))
>>> +			put_layout_hdr_locked(lo);
>>> +		return 0;
>>> +	}
>>> 	list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list)
>>> 		if (should_free_lseg(lseg->pls_range.iomode, iomode)) {
>>> 			dprintk("%s: freeing lseg %p iomode %d "
>>> @@ -332,10 +340,7 @@ pnfs_destroy_layout(struct nfs_inode *nfsi)
>>> 	spin_lock(&nfsi->vfs_inode.i_lock);
>>> 	lo = nfsi->layout;
>>> 	if (lo) {
>>> -		set_bit(NFS_LAYOUT_DESTROYED, &nfsi->layout->plh_flags);
>>> 		mark_matching_lsegs_invalid(lo, &tmp_list, IOMODE_ANY);
>>> -		/* Matched by refcount set to 1 in alloc_init_layout_hdr */
>>> -		put_layout_hdr_locked(lo);
>>> 	}
>>> 	spin_unlock(&nfsi->vfs_inode.i_lock);
>>> 	pnfs_free_lseg_list(&tmp_list);
>>> @@ -403,6 +408,7 @@ pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid,
>>> 	    (int)(lo->plh_barrier - be32_to_cpu(stateid->stateid.seqid)) >= 0)
>>> 		return true;
>>> 	return lo->plh_block_lgets ||
>>> +		test_bit(NFS_LAYOUT_DESTROYED, &lo->plh_flags) ||
>>> 		test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
>>> 		(list_empty(&lo->plh_segs) &&
>>> 		 (atomic_read(&lo->plh_outstanding) > lget));
>

next prev parent reply	other threads:[~2011-02-02  4:19 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-31 15:27 [PATCH 0/3]: pnfs: fix pnfs lock inversion Fred Isaman
2011-01-31 15:27 ` [PATCH 1/3] pnfs: avoid incorrect use of layout stateid Fred Isaman
2011-02-01 15:38   ` Benny Halevy
2011-02-01 16:31     ` Fred Isaman
2011-02-02  4:19       ` Benny Halevy [this message]
2011-02-02  5:56         ` Trond Myklebust
2011-02-03  8:15           ` Benny Halevy
2011-02-02 15:45         ` Fred Isaman
2011-02-03  8:17           ` Benny Halevy
2011-01-31 15:27 ` [PATCH 2/3] pnfs: do not need to clear NFS_LAYOUT_BULK_RECALL flag Fred Isaman
2011-01-31 15:27 ` [PATCH 3/3] pnfs: fix pnfs lock inversion of i_lock and cl_lock Fred Isaman
2011-02-01 15:41   ` Benny Halevy
2011-02-01 15:54     ` Fred Isaman
2011-02-01 16:03       ` Benny Halevy
2011-02-03  8:58   ` Benny Halevy
  -- strict thread matches above, loose matches on Subject: below --
2011-02-03 18:28 [PATCH 0/3] pnfs: fix pnfs lock inversion, try 2 Fred Isaman
2011-02-03 18:28 ` [PATCH 1/3] pnfs: avoid incorrect use of layout stateid Fred Isaman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D48DB59.9010102@panasas.com \
    --to=bhalevy@panasas.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=iisaman@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).