linux-nfs.vger.kernel.org archive mirror
From: Benny Halevy <bhalevy@tonian.com>
To: tao.peng@emc.com
Cc: bergwolf@gmail.com, Trond.Myklebust@netapp.com,
	gusev.vitaliy@nexenta.com, gusev.vitaliy@gmail.com,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
Date: Tue, 13 Sep 2011 00:54:19 -0700
Message-ID: <4E6F0C2B.3040505@tonian.com>
In-Reply-To: <F19688880B763E40B28B2B462677FBF805C3299F6D@MX09A.corp.emc.com>

On 2011-09-13 00:02, tao.peng@emc.com wrote:
> 
>> -----Original Message-----
>> From: Benny Halevy [mailto:bhalevy@tonian.com]
>> Sent: Tuesday, September 13, 2011 4:32 AM
>> To: Peng Tao
>> Cc: Trond Myklebust; Peng, Tao; gusev.vitaliy@nexenta.com;
>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org
>> Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
>>
>> On 2011-09-12 07:56, Peng Tao wrote:
>>> On Sat, Sep 10, 2011 at 3:14 PM, Benny Halevy <bhalevy@tonian.com> wrote:
>>>> On 2011-09-09 11:20, Trond Myklebust wrote:
>>>>> On Thu, 2011-09-08 at 23:11 -0400, tao.peng@emc.com wrote:
>>>>>> HI, Trond,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Myklebust, Trond [mailto:Trond.Myklebust@netapp.com]
>>>>>>> Sent: Friday, September 09, 2011 1:05 AM
>>>>>>> To: Peng Tao
>>>>>>> Cc: Peng, Tao; gusev.vitaliy@nexenta.com; gusev.vitaliy@gmail.com;
>>>>>>> linux-nfs@vger.kernel.org
>>>>>>> Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Peng Tao [mailto:bergwolf@gmail.com]
>>>>>>>> Sent: Thursday, September 08, 2011 11:00 AM
>>>>>>>> To: Myklebust, Trond
>>>>>>>> Cc: tao.peng@emc.com; gusev.vitaliy@nexenta.com;
>>>>>>>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org
>>>>>>>> Subject: Re: [PATCH] nfs: fix inifinite loop at
>>>>>>>> nfs4_layoutcommit_release
>>>>>>>>
>>>>>>>> On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond
>>>>>>>> <Trond.Myklebust@netapp.com> wrote:
>>>>>>>>>> -----Original Message-----
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, but as far as I can see, even in the blocks case there can be
>>>>>>>>> multiple extents per layout segment. What if I write to one
>>>>>>>>> uninitialised extent, layoutcommit, then write to another uninitialized
>>>>>>>>> extent in the same layout segment and layoutcommit? In my reading of
>>>>>>>>> the code, there is a chance that the second layoutcommit will fail to
>>>>>>>>> pick up the layout segment, and so will fail to notify the MDS that the
>>>>>>>>> second extent now contains data.
>>>>>>>>
>>>>>>>> blocklayout does not decide what to layoutcommit according to the lseg
>>>>>>>> list. Instead, it keeps track of each extent's state at the
>>>>>>>> granularity of blocksize, and encodes whatever needs to be committed in
>>>>>>>> the layoutcommit call. So in your above case, as long as the second
>>>>>>>> layoutcommit is issued, blocklayout will encode the newly written
>>>>>>>> extent in the second layoutcommit call, even if the lseg is not
>>>>>>>> attached to the second layoutcommit.
>>>>>>>>
>>>>>>>> But that leads to two more questions:
>>>>>>>> 1. How do we protect against layoutrecall if the lseg is not linked to
>>>>>>>> the layoutcommit? For this one, can we just reject layoutrecall if there
>>>>>>>> is an inflight layoutcommit? It will be less parallel but can guarantee
>>>>>>>> the correctness of the current implementation.
>>>>>>>> 2. blocklayout ONLY: bl_committing may be overloaded by several
>>>>>>>> layoutcommit calls and we don't have information in
>>>>>>>> cleanup_layoutcommit() on how many entries should be removed from
>>>>>>>> bl_committing. Maybe we can add a (void*) to struct
>>>>>>>> nfs4_layoutcommit_data, so that the LD can pass some private information
>>>>>>>> between encode_layoutcommit() and cleanup_layoutcommit()?
>>>>>>>
>>>>>>> 3. What is the purpose of pinning the layout segment at all if neither
>>>>>>> blocks, nor objects nor files cares?
>>>>>> I believe it is for protecting against layoutrecall. But since we are
>>>>>> separating lseg and LD specific layout information management, it is
>>>>>> actually not working as expected.
>>>>>>
>>>>
>>>> The layout segments are not really in use while in LAYOUTCOMMIT.
>>>> We only need to get the stateid right with respect to concurrent layout recalls.
>>> LAYOUTCOMMIT takes lseg reference to mark them as in use so that
>>> layoutrecall cannot free them.
>>>
>>
>> And if layoutrecall would have freed layout segments during layoutcommit,
>> what is your specific concern?
> In layoutcommit_release, blocklayout needs to access the corresponding
> extents to convert their states. If the layout segments are freed by
> layoutrecall, it can cause problems.
> 

See my response to Trond on his previous message.  I think the best thing to do is
to return NFS4ERR_DELAY if the client gets a conflicting CB_LAYOUTRECALL while the
LAYOUTCOMMIT is in progress.  The server may need to reject the LAYOUTCOMMIT
in this case to prevent a distributed deadlock, so the client should be prepared
to retry.
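
To make the rule concrete, here is a minimal userspace sketch of it, not
kernel code.  struct layout_state, its fields, and the three helpers are
hypothetical names invented for illustration; only the status values
(NFS4_OK = 0, NFS4ERR_DELAY = 10008) come from RFC 5661.  It is
single-threaded on purpose; a real client would need locking around the
counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status values as numbered in RFC 5661. */
#define NFS4_OK        0
#define NFS4ERR_DELAY  10008

/* Hypothetical per-layout bookkeeping; NOT the kernel's layout header. */
struct layout_state {
	unsigned int commits_in_flight; /* LAYOUTCOMMITs sent, reply not yet handled */
	bool recalled;                  /* set once a recall has been accepted */
};

/* Called when a LAYOUTCOMMIT is sent. */
static void layoutcommit_begin(struct layout_state *lo)
{
	lo->commits_in_flight++;
}

/* Called from the LAYOUTCOMMIT release path, after LD cleanup is done. */
static void layoutcommit_done(struct layout_state *lo)
{
	if (lo->commits_in_flight > 0)
		lo->commits_in_flight--;
}

/*
 * CB_LAYOUTRECALL handler: refuse with NFS4ERR_DELAY while any
 * LAYOUTCOMMIT is outstanding, so the segments it still needs are not
 * freed underneath it.  The server is expected to retry the recall.
 */
static uint32_t handle_layoutrecall(struct layout_state *lo)
{
	if (lo->commits_in_flight > 0)
		return NFS4ERR_DELAY;
	lo->recalled = true;
	return NFS4_OK;
}
```

The sketch only covers the callback side.  If the server rejects the
LAYOUTCOMMIT itself to break the deadlock, the client would still have to
resend it, which is not modelled here.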

Benny

>>
>> Benny
