From: Benny Halevy <bhalevy@tonian.com>
To: tao.peng@emc.com
Cc: bergwolf@gmail.com, Trond.Myklebust@netapp.com,
gusev.vitaliy@nexenta.com, gusev.vitaliy@gmail.com,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
Date: Tue, 13 Sep 2011 00:54:19 -0700 [thread overview]
Message-ID: <4E6F0C2B.3040505@tonian.com> (raw)
In-Reply-To: <F19688880B763E40B28B2B462677FBF805C3299F6D@MX09A.corp.emc.com>
On 2011-09-13 00:02, tao.peng@emc.com wrote:
>
>> -----Original Message-----
>> From: Benny Halevy [mailto:bhalevy@tonian.com]
>> Sent: Tuesday, September 13, 2011 4:32 AM
>> To: Peng Tao
>> Cc: Trond Myklebust; Peng, Tao; gusev.vitaliy@nexenta.com;
>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org
>> Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
>>
>> On 2011-09-12 07:56, Peng Tao wrote:
>>> On Sat, Sep 10, 2011 at 3:14 PM, Benny Halevy <bhalevy@tonian.com> wrote:
>>>> On 2011-09-09 11:20, Trond Myklebust wrote:
>>>>> On Thu, 2011-09-08 at 23:11 -0400, tao.peng@emc.com wrote:
>>>>>> HI, Trond,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Myklebust, Trond [mailto:Trond.Myklebust@netapp.com]
>>>>>>> Sent: Friday, September 09, 2011 1:05 AM
>>>>>>> To: Peng Tao
>>>>>>> Cc: Peng, Tao; gusev.vitaliy@nexenta.com; gusev.vitaliy@gmail.com;
>>>>>>> linux-nfs@vger.kernel.org
>>>>>>> Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Peng Tao [mailto:bergwolf@gmail.com]
>>>>>>>> Sent: Thursday, September 08, 2011 11:00 AM
>>>>>>>> To: Myklebust, Trond
>>>>>>>> Cc: tao.peng@emc.com; gusev.vitaliy@nexenta.com;
>>>>>>>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org
>>>>>>>> Subject: Re: [PATCH] nfs: fix inifinite loop at
>>>>>>>> nfs4_layoutcommit_release
>>>>>>>>
>>>>>>>> On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond
>>>>>>>> <Trond.Myklebust@netapp.com> wrote:
>>>>>>>>>> -----Original Message-----
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, but as far as I can see, even in the blocks case there can be
>>>>>>>> multiple extents per layout segment. What if I write to one
>>>>>>>> uninitialised extent, layoutcommit, then write to another uninitialized
>>>>>>>> extent in the same layout segment and layoutcommit? In my reading of
>>>>>>>> the code, there is a chance that the second layoutcommit will fail to
>>>>>>>> pick up the layout segment, and so will fail to notify the MDS that the
>>>>>>>> second extent now contains data.
>>>>>>>>
>>>>>>>> blocklayout does not decide what to layoutcommit according to the lseg
>>>>>>>> list. Instead, it keeps track of each extent's state at the
>>>>>>>> granularity of blocksize, and encode whatever needs layoutcommitted in
>>>>>>>> the layoutcommit call. So in your above case, as long as the second
>>>>>>>> layoutcommit is issued, blocklayout will encode the newly written
>>>>>>>> extent in the second layoutcommit call, even if the lseg is not
>>>>>>>> attached to the second layoutcommit.
>>>>>>>>
>>>>>>>> But that leads to another two question:
>>>>>>>> 1. How do we protect against layoutrecall if lseg is not linked to
>>>>>>>> layoutcommit? For this one, can we just reject layoutrecall if there
>>>>>>>> is inflight layoutcommit? It will be less parallel but can guarantee
>>>>>>>> current implementation correctness.
>>>>>>>> 2. blocklayout ONLY: bl_committing may be overloaded by several
>>>>>>>> layoutcommit calls and we don't have information in
>>>>>>>> cleanup_layoutcommit() on how many entry should be removed from
>>>>>>>> bl_committing. Maybe we can add a (void*) to struct
>>>>>>>> nfs4_layoutcommit_data, so that LD can pass some private information
>>>>>>>> between encode_layoutcommit() and cleanup_layoutcommit()?
>>>>>>>
>>>>>>> 3. What is the purpose of pinning the layout segment at all if neither blocks,
>> nor
>>>>>>> objects nor files cares?
>>>>>> I believe it is for protecting against layoutrecall. But since we are seperating
>> lseg and LD specific layout information management, it is actually not working as
>> expected.
>>>>>>
>>>>
>>>> The layout segments are not really in use while in LAYOUTCOMMIT.
>>>> We only need to get the stateid right with respect to concurrent layout recalls.
>>> LAYOUTCOMMIT takes lseg reference to mark them as in use so that
>>> layoutrecall cannot free them.
>>>
>>
>> And if layoutrecall would have freed layout segments during layoutcommit,
>> what is your specific concern?
> In layoutcommit_release, blocklayout need to access the corresponding extents to convert their states. If the layout segments are freed by layoutrecall, it can cause problems.
>
See my response to Trond on his previous message. I think the best thing to do is
to return NFS4ERR_DELAY if the gets a conflicting CB_LAYOUTRECALL while the
LAYOUTCOMMIT is in progress. The server may need to reject the LAYOUTCOMMIT
in this case to prevent a distributed deadlock so the client should be prepared
to retry.
Benny
>>
>> Benny
next prev parent reply other threads:[~2011-09-13 7:59 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-28 6:22 [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release Vitaliy Gusev
2011-09-06 19:29 ` Trond Myklebust
2011-09-06 22:13 ` Vitaliy Gusev
2011-09-06 22:32 ` Trond Myklebust
2011-09-08 10:21 ` tao.peng
2011-09-08 12:01 ` Myklebust, Trond
2011-09-08 15:00 ` Peng Tao
2011-09-08 17:05 ` Myklebust, Trond
2011-09-09 3:11 ` tao.peng
2011-09-09 18:20 ` Trond Myklebust
2011-09-10 7:14 ` Benny Halevy
2011-09-12 14:56 ` Peng Tao
2011-09-12 20:31 ` Benny Halevy
2011-09-12 21:10 ` Trond Myklebust
2011-09-13 7:50 ` Benny Halevy
2011-09-13 8:32 ` tao.peng
2011-09-14 6:43 ` Benny Halevy
2011-09-14 7:53 ` tao.peng
[not found] ` <F19688880B763E40B28B2B462677FBF805C3F7A911-AYrsSIZi/B2B3McK65YKY9BPR1lH4CV8@public.gmane.org>
2011-09-14 13:20 ` Benny Halevy
2011-09-13 8:09 ` tao.peng
2011-09-14 6:46 ` Benny Halevy
2011-09-13 7:02 ` tao.peng
2011-09-13 7:54 ` Benny Halevy [this message]
2011-09-12 14:48 ` Peng Tao
2011-09-08 10:00 ` tao.peng
2011-09-08 13:02 ` Vitaliy Gusev
2011-09-08 15:09 ` Peng Tao
2011-09-09 17:46 ` Vitaliy Gusev
2011-09-12 14:42 ` Peng Tao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E6F0C2B.3040505@tonian.com \
--to=bhalevy@tonian.com \
--cc=Trond.Myklebust@netapp.com \
--cc=bergwolf@gmail.com \
--cc=gusev.vitaliy@gmail.com \
--cc=gusev.vitaliy@nexenta.com \
--cc=linux-nfs@vger.kernel.org \
--cc=tao.peng@emc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).