From: Benny Halevy <bhalevy@tonian.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: tao.peng@emc.com, gusev.vitaliy@nexenta.com,
gusev.vitaliy@gmail.com, linux-nfs@vger.kernel.org,
bergwolf@gmail.com
Subject: Re: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
Date: Sat, 10 Sep 2011 00:14:16 -0700 [thread overview]
Message-ID: <4E6B0E48.7050208@tonian.com> (raw)
In-Reply-To: <1315592430.17611.15.camel@lade.trondhjem.org>
On 2011-09-09 11:20, Trond Myklebust wrote:
> On Thu, 2011-09-08 at 23:11 -0400, tao.peng@emc.com wrote:
>> HI, Trond,
>>
>>> -----Original Message-----
>>> From: Myklebust, Trond [mailto:Trond.Myklebust@netapp.com]
>>> Sent: Friday, September 09, 2011 1:05 AM
>>> To: Peng Tao
>>> Cc: Peng, Tao; gusev.vitaliy@nexenta.com; gusev.vitaliy@gmail.com;
>>> linux-nfs@vger.kernel.org
>>> Subject: RE: [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release
>>>
>>>> -----Original Message-----
>>>> From: Peng Tao [mailto:bergwolf@gmail.com]
>>>> Sent: Thursday, September 08, 2011 11:00 AM
>>>> To: Myklebust, Trond
>>>> Cc: tao.peng@emc.com; gusev.vitaliy@nexenta.com;
>>>> gusev.vitaliy@gmail.com; linux-nfs@vger.kernel.org
>>>> Subject: Re: [PATCH] nfs: fix inifinite loop at
>>>> nfs4_layoutcommit_release
>>>>
>>>> On Thu, Sep 8, 2011 at 8:01 PM, Myklebust, Trond
>>>> <Trond.Myklebust@netapp.com> wrote:
>>>>>> -----Original Message-----
>>>>
>>>>>
>>>>> Yes, but as far as I can see, even in the blocks case there can be
>>>> multiple extents per layout segment. What if I write to one
>>>> uninitialised extent, layoutcommit, then write to another uninitialized
>>>> extent in the same layout segment and layoutcommit? In my reading of
>>>> the code, there is a chance that the second layoutcommit will fail to
>>>> pick up the layout segment, and so will fail to notify the MDS that the
>>>> second extent now contains data.
>>>>
>>>> blocklayout does not decide what to layoutcommit according to the lseg
>>>> list. Instead, it keeps track of each extent's state at the
>>>> granularity of blocksize, and encode whatever needs layoutcommitted in
>>>> the layoutcommit call. So in your above case, as long as the second
>>>> layoutcommit is issued, blocklayout will encode the newly written
>>>> extent in the second layoutcommit call, even if the lseg is not
>>>> attached to the second layoutcommit.
>>>>
>>>> But that leads to another two question:
>>>> 1. How do we protect against layoutrecall if lseg is not linked to
>>>> layoutcommit? For this one, can we just reject layoutrecall if there
>>>> is inflight layoutcommit? It will be less parallel but can guarantee
>>>> current implementation correctness.
>>>> 2. blocklayout ONLY: bl_committing may be overloaded by several
>>>> layoutcommit calls and we don't have information in
>>>> cleanup_layoutcommit() on how many entry should be removed from
>>>> bl_committing. Maybe we can add a (void*) to struct
>>>> nfs4_layoutcommit_data, so that LD can pass some private information
>>>> between encode_layoutcommit() and cleanup_layoutcommit()?
>>>
>>> 3. What is the purpose of pinning the layout segment at all if neither blocks, nor
>>> objects nor files cares?
>> I believe it is for protecting against layoutrecall. But since we are seperating lseg and LD specific layout information management, it is actually not working as expected.
>>
The layout segments are not really in use while in LAYOUTCOMMIT.
We only need to get the stateid right with respect to concurrent layout recalls.
>>> IOW: if both objects and blocks track the information that they need for
>>> layoutcommit outside the layout segments, then why do we need the
>>> NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list at all?
>> If we remove NFS_LSEG_LAYOUTCOMMIT bit and pls_lc_list, we must find other method to protect agains freeing lseg while layoutcommit is needed or is going on. e.g., check for NFS_INO_LAYOUTCOMMIT bit and inflight layoutcommit in initiate_file_draining().
>
> The easiest solution is to ensure we have only _one_ LAYOUTCOMMIT on the
> wire at a time. Otherwise, you are looking at a many-to-many mapping
> between layoutcommits and lsegs.
>
> We should not expect to need multiple layoutcommits on the wire if pNFS
> is working smoothly (i.e. no layout recalls), so optimising for that
> case is wrong.
> I'd also think that we want to order layoutcommit and ordinary commits
> (for those NFS files servers that require both) so that we don't end up
> writing a file size change to disk before the actual data is committed.
>
> So why not just protect the layoutcommit call using the existing
> nfs_commit_set_lock/nfs_commit_clear_lock?
>
Sounds good to me,
Benny
next prev parent reply other threads:[~2011-09-10 7:14 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-08-28 6:22 [PATCH] nfs: fix inifinite loop at nfs4_layoutcommit_release Vitaliy Gusev
2011-09-06 19:29 ` Trond Myklebust
2011-09-06 22:13 ` Vitaliy Gusev
2011-09-06 22:32 ` Trond Myklebust
2011-09-08 10:21 ` tao.peng
2011-09-08 12:01 ` Myklebust, Trond
2011-09-08 15:00 ` Peng Tao
2011-09-08 17:05 ` Myklebust, Trond
2011-09-09 3:11 ` tao.peng
2011-09-09 18:20 ` Trond Myklebust
2011-09-10 7:14 ` Benny Halevy [this message]
2011-09-12 14:56 ` Peng Tao
2011-09-12 20:31 ` Benny Halevy
2011-09-12 21:10 ` Trond Myklebust
2011-09-13 7:50 ` Benny Halevy
2011-09-13 8:32 ` tao.peng
2011-09-14 6:43 ` Benny Halevy
2011-09-14 7:53 ` tao.peng
[not found] ` <F19688880B763E40B28B2B462677FBF805C3F7A911-AYrsSIZi/B2B3McK65YKY9BPR1lH4CV8@public.gmane.org>
2011-09-14 13:20 ` Benny Halevy
2011-09-13 8:09 ` tao.peng
2011-09-14 6:46 ` Benny Halevy
2011-09-13 7:02 ` tao.peng
2011-09-13 7:54 ` Benny Halevy
2011-09-12 14:48 ` Peng Tao
2011-09-08 10:00 ` tao.peng
2011-09-08 13:02 ` Vitaliy Gusev
2011-09-08 15:09 ` Peng Tao
2011-09-09 17:46 ` Vitaliy Gusev
2011-09-12 14:42 ` Peng Tao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E6B0E48.7050208@tonian.com \
--to=bhalevy@tonian.com \
--cc=Trond.Myklebust@netapp.com \
--cc=bergwolf@gmail.com \
--cc=gusev.vitaliy@gmail.com \
--cc=gusev.vitaliy@nexenta.com \
--cc=linux-nfs@vger.kernel.org \
--cc=tao.peng@emc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.