From: Benny Halevy <bhalevy@tonian.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: tao.peng@emc.com, gusev.vitaliy@nexenta.com,
gusev.vitaliy@gmail.com, linux-nfs@vger.kernel.org,
bergwolf@gmail.com
Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
Date: Sat, 10 Sep 2011 00:14:16 -0700 [thread overview]
Message-ID: <4E6B0E48.7050208@tonian.com> (raw)
In-Reply-To: <1315592430.17611.15.camel@lade.trondhjem.org>
On 2011-09-09 11:20, Trond Myklebust wrote:
> On Thu, 2011-09-08 at 23:11 -0400, tao.peng@emc.com wrote:
>> HI, Trond,
>>
>>> -----Original Message-----
>>> From: Myklebust, Trond [mailto:Trond.Myklebust@netapp.com]
>>> Sent: Friday, September 09, 2011 1:05 AM
>>> To: Peng Tao
>>> Cc: Peng, Tao; gusev.vitaliy@nexenta.com; gusev.vitaliy@gmail.com;
>>> linux-nfs@vger.kernel.org
>>> Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
>>>
>>>> -----Original Message-----
>>>> From: Peng Tao [mailto:bergwolf@gmail.com]
>>>> Sent: Thursday, September 08, 2011 11:00 AM
>>>> To: Myklebust, Trond
>>>> Cc: tao.peng@emc.com; gusev.vitaliy@nexenta.com;
>>>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org
>>>> Subject: Re: [PATCH] nfs: fix inifinite loop at
>>>> nfs4_layoutcommit_release
>>>>
>>>> On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond
>>>> <Trond.Myklebust@netapp.com> wrote:
>>>>>> -----Original Message-----
>>>>
>>>>>
>>>>> Yes, but as far as I can see, even in the blocks case there can be
>>>> multiple extents per layout segment. What if I write to one
>>>> uninitialised extent, layoutcommit, then write to another uninitialized
>>>> extent in the same layout segment and layoutcommit? In my reading of
>>>> the code, there is a chance that the second layoutcommit will fail to
>>>> pick up the layout segment, and so will fail to notify the MDS that the
>>>> second extent now contains data.
>>>>
>>>> blocklayout does not decide what to layoutcommit according to the lseg
>>>> list. Instead, it keeps track of each extent's state at the
>>>> granularity of blocksize, and encode whatever needs layoutcommitted in
>>>> the layoutcommit call. So in your above case, as long as the second
>>>> layoutcommit is issued, blocklayout will encode the newly written
>>>> extent in the second layoutcommit call, even if the lseg is not
>>>> attached to the second layoutcommit.
>>>>
>>>> But that leads to another two question:
>>>> 1. How do we protect against layoutrecall if lseg is not linked to
>>>> layoutcommit? For this one, can we just reject layoutrecall if there
>>>> is inflight layoutcommit? It will be less parallel but can guarantee
>>>> current implementation correctness.
>>>> 2. blocklayout ONLY: bl_committing may be overloaded by several
>>>> layoutcommit calls and we don't have information in
>>>> cleanup_layoutcommit() on how many entry should be removed from
>>>> bl_committing. Maybe we can add a (void*) to struct
>>>> nfs4_layoutcommit_data, so that LD can pass some private information
>>>> between encode_layoutcommit() and cleanup_layoutcommit()?
>>>
>>> 3. What is the purpose of pinning the layout segment at all if neither blocks, nor
>>> objects nor files cares?
>> I believe it is for protecting against layoutrecall. But since we are seperating lseg and LD specific layout information management, it is actually not working as expected.
>>
The layout segments are not really in use while in LAYOUTCOMMIT.
We only need to get the stateid right with respect to concurrent layout recalls.
>>> IOW: if both objects and blocks track the information that they need for
>>> layoutcommit outside the layout segments, then why do we need the
>>> NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list at all?
>> If we remove NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list, we must find other method to protect agains freeing lseg while layoutcommit is needed or is going on. e.g., check for NFS_INO_LAYOUTCOMMIT bit and inflight layoutcommit in initiate_file_draining().
>
> The easiest solution is to ensure we have only _one_ LAYOUTCOMMIT on the
> wire at a time. Otherwise, you are looking at a many-to-many mapping
> between layoutcommits and lsegs.
>
> We should not expect to need multiple layoutcommits on the wire if pNFS
> is working smoothly (i.e. no layout recalls), so optimising for that
> case is wrong.
> I'd also think that we want to order layoutcommit and ordinary commits
> (for those NFS files servers that require both) so that we don't end up
> writing a file size change to disk before the actual data is committed.
>
> So why not just protect the layoutcommit call using the existing
> nfs_commit_set_lock/nfs_commit_clear_lock?
>
Sounds good to me,
Benny
next prev parent reply other threads:[~2011-09-10 7:14 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-28 6:22 [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release Vitaliy Gusev
2011-09-06 19:29 ` Trond Myklebust
2011-09-06 22:13 ` Vitaliy Gusev
2011-09-06 22:32 ` Trond Myklebust
2011-09-08 10:21 ` tao.peng
2011-09-08 12:01 ` Myklebust, Trond
2011-09-08 15:00 ` Peng Tao
2011-09-08 17:05 ` Myklebust, Trond
2011-09-09 3:11 ` tao.peng
2011-09-09 18:20 ` Trond Myklebust
2011-09-10 7:14 ` Benny Halevy [this message]
2011-09-12 14:56 ` Peng Tao
2011-09-12 20:31 ` Benny Halevy
2011-09-12 21:10 ` Trond Myklebust
2011-09-13 7:50 ` Benny Halevy
2011-09-13 8:32 ` tao.peng
2011-09-14 6:43 ` Benny Halevy
2011-09-14 7:53 ` tao.peng
[not found] ` <F19688880B763E40B28B2B462677FBF805C3F7A911-AYrsSIZi/B2B3McK65YKY9BPR1lH4CV8@public.gmane.org>
2011-09-14 13:20 ` Benny Halevy
2011-09-13 8:09 ` tao.peng
2011-09-14 6:46 ` Benny Halevy
2011-09-13 7:02 ` tao.peng
2011-09-13 7:54 ` Benny Halevy
2011-09-12 14:48 ` Peng Tao
2011-09-08 10:00 ` tao.peng
2011-09-08 13:02 ` Vitaliy Gusev
2011-09-08 15:09 ` Peng Tao
2011-09-09 17:46 ` Vitaliy Gusev
2011-09-12 14:42 ` Peng Tao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E6B0E48.7050208@tonian.com \
--to=bhalevy@tonian.com \
--cc=Trond.Myklebust@netapp.com \
--cc=bergwolf@gmail.com \
--cc=gusev.vitaliy@gmail.com \
--cc=gusev.vitaliy@nexenta.com \
--cc=linux-nfs@vger.kernel.org \
--cc=tao.peng@emc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).