linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Benny Halevy <bhalevy.lists@gmail.com>
To: tao.peng@emc.com
Cc: iisaman@netapp.com, rees@umich.edu, linux-nfs@vger.kernel.org,
	honey@citi.umich.edu
Subject: Re: [PATCH 03/34] pnfs: let layoutcommit code handle multiple segments
Date: Tue, 14 Jun 2011 10:28:59 -0400	[thread overview]
Message-ID: <4DF7702B.5090609@gmail.com> (raw)
In-Reply-To: <F19688880B763E40B28B2B462677FBF805BED9D553@MX09A.corp.emc.com>

On 2011-06-14 06:40, tao.peng@emc.com wrote:
> Hi, Fred,
> 
>> -----Original Message-----
>> From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-owner@vger.kernel.org]
>> On Behalf Of Fred Isaman
>> Sent: Monday, June 13, 2011 10:37 PM
>> To: Jim Rees
>> Cc: linux-nfs@vger.kernel.org; peter honeyman
>> Subject: Re: [PATCH 03/34] pnfs: let layoutcommit code handle multiple segments
>>
>> On Sun, Jun 12, 2011 at 7:43 PM, Jim Rees <rees@umich.edu> wrote:
>>> From: Peng Tao <bergwolf@gmail.com>
>>>
>>> Some layout driver like block will have multiple segments.
>>> Generic code should be able to handle it.
>>>
>>> Signed-off-by: Peng Tao <peng_tao@emc.com>
>>> ---
>>>  fs/nfs/pnfs.c |   17 +++++++++++++----
>>>  fs/nfs/pnfs.h |    1 +
>>>  2 files changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>> index e3d618b..f03a5e0 100644
>>> --- a/fs/nfs/pnfs.c
>>> +++ b/fs/nfs/pnfs.c
>>> @@ -892,7 +892,7 @@ pnfs_find_lseg(struct pnfs_layout_hdr *lo,
>>>        dprintk("%s:Begin\n", __func__);
>>>
>>>        assert_spin_locked(&lo->plh_inode->i_lock);
>>> -       list_for_each_entry(lseg, &lo->plh_segs, pls_list) {
>>> +       list_for_each_entry_reverse(lseg, &lo->plh_segs, pls_list) {
>>
>> This is a sortred list, and the order of search matters.  You can't
>> just reverse it here.
> The layout segment list is in offset increasing order. But the lookup code here assumes it's a decreasing ordered list.
> To fix it, we should either reverse lookup the list, or change the break condition test. Otherwise lookup always fails if not matching the first one.
> 

We shouldn't scan the list in reverse.
I'll send a fix upstream to fix the break condition.
This got broken when I last changed cmp_layout.
Basically we want to break out of the loop once we can't find a layout covering the first
byte of the range we're looking up.

Benny

>>
>>>                if (test_bit(NFS_LSEG_VALID, &lseg->pls_flags) &&
>>>                    is_matching_lseg(&lseg->pls_range, range)) {
>>>                        ret = get_lseg(lseg);
>>> @@ -1193,10 +1193,18 @@ pnfs_try_to_read_data(struct nfs_read_data *rdata,
>>>  static struct pnfs_layout_segment *pnfs_list_write_lseg(struct inode *inode)
>>>  {
>>>        struct pnfs_layout_segment *lseg, *rv = NULL;
>>> +       loff_t max_pos = 0;
>>> +
>>> +       list_for_each_entry(lseg, &NFS_I(inode)->layout->plh_segs, pls_list) {
>>> +               if (lseg->pls_range.iomode == IOMODE_RW) {
>>> +                       if (max_pos < lseg->pls_end_pos)
>>> +                               max_pos = lseg->pls_end_pos;
>>> +                       if (test_and_clear_bit(NFS_LSEG_LAYOUTCOMMIT,
>> &lseg->pls_flags))
>>> +                               rv = lseg;
>>> +               }
>>> +       }
>>> +       rv->pls_end_pos = max_pos;
>>>
>>
>> The idea here was that it could be extended to use segment by
>> returning a list of affected lsegs,
>> not so,e random one.  Because otherwise you have problems with the
>> fact that relevant but not
>> returned lsegs are going to get there refcounts messed up.
> The above code relies on NFS_INO_LAYOUTCOMMIT bit to ensure that only one inode lseg has NFS_LSEG_LAYOUTCOMMIT set. But, you are right. The layoutcommit code needs a second thought.
> How about making it return a list of affected lsegs and pass them around layoutcommit_procs?
> 
> Thanks,
> Tao
> 
>>
>> Fred
>>
>>> -       list_for_each_entry(lseg, &NFS_I(inode)->layout->plh_segs, pls_list)
>>> -               if (lseg->pls_range.iomode == IOMODE_RW)
>>> -                       rv = lseg;
>>>        return rv;
>>>  }
>>>
>>> @@ -1211,6 +1219,7 @@ pnfs_set_layoutcommit(struct nfs_write_data *wdata)
>>>        if (!test_and_set_bit(NFS_INO_LAYOUTCOMMIT, &nfsi->flags)) {
>>>                /* references matched in nfs4_layoutcommit_release */
>>>                get_lseg(wdata->lseg);
>>> +               set_bit(NFS_LSEG_LAYOUTCOMMIT,
>> &wdata->lseg->pls_flags);
>>>                wdata->lseg->pls_lc_cred =
>>>                        get_rpccred(wdata->args.context->state->owner-
>>> so_cred);
>>>                mark_as_dirty = true;
>>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>>> index b071b56..a3fc0f2 100644
>>> --- a/fs/nfs/pnfs.h
>>> +++ b/fs/nfs/pnfs.h
>>> @@ -36,6 +36,7 @@
>>>  enum {
>>>        NFS_LSEG_VALID = 0,     /* cleared when lseg is recalled/returned */
>>>        NFS_LSEG_ROC,           /* roc bit received from server */
>>> +       NFS_LSEG_LAYOUTCOMMIT,  /* layoutcommit bit set for
>> layoutcommit */
>>>  };
>>>
>>>  struct pnfs_layout_segment {
>>> --
>>> 1.7.4.1
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2011-06-14 14:29 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-12 23:43 [PATCH 00/34] pnfs block layout driver based on v3.0-rc2 Jim Rees
2011-06-12 23:43 ` [PATCH 01/34] pnfs: GETDEVICELIST Jim Rees
2011-06-12 23:43 ` [PATCH 02/34] pnfs: add set-clear layoutdriver interface Jim Rees
2011-06-12 23:43 ` [PATCH 03/34] pnfs: let layoutcommit code handle multiple segments Jim Rees
2011-06-13 14:36   ` Fred Isaman
2011-06-14 10:40     ` tao.peng
2011-06-14 13:58       ` Fred Isaman
2011-06-14 14:28       ` Benny Halevy [this message]
2011-06-12 23:43 ` [PATCH 04/34] pnfs: hook nfs_write_begin/end to allow layout driver manipulation Jim Rees
2011-06-13 14:44   ` Fred Isaman
2011-06-14 11:01     ` tao.peng
2011-06-14 14:05       ` Fred Isaman
2011-06-14 15:53         ` Peng Tao
2011-06-14 16:02           ` Fred Isaman
2011-06-12 23:43 ` [PATCH 05/34] pnfs: ask for layout_blksize and save it in nfs_server Jim Rees
2011-06-14 15:01   ` Benny Halevy
2011-06-14 15:08     ` Peng Tao
2011-06-12 23:44 ` [PATCH 06/34] pnfs: cleanup_layoutcommit Jim Rees
2011-06-13 21:19   ` Benny Halevy
2011-06-14 15:16     ` Peng Tao
2011-06-14 15:10   ` Benny Halevy
2011-06-14 15:21     ` Peng Tao
2011-06-14 15:19   ` Benny Halevy
2011-06-12 23:44 ` [PATCH 07/34] pnfsblock: define PNFS_BLOCK Kconfig option Jim Rees
2011-06-14 15:13   ` Benny Halevy
2011-06-12 23:44 ` [PATCH 08/34] pnfsblock: blocklayout stub Jim Rees
2011-06-12 23:44 ` [PATCH 09/34] pnfsblock: layout alloc and free Jim Rees
2011-06-12 23:44 ` [PATCH 10/34] Add support for simple rpc pipefs Jim Rees
2011-06-12 23:44 ` [PATCH 11/34] pnfs-block: Add block device discovery pipe Jim Rees
2011-06-12 23:44 ` [PATCH 12/34] pnfsblock: basic extent code Jim Rees
2011-06-12 23:44 ` [PATCH 13/34] pnfsblock: add device operations Jim Rees
2011-06-12 23:44 ` [PATCH 14/34] pnfsblock: remove " Jim Rees
2011-06-12 23:44 ` [PATCH 15/34] pnfsblock: lseg alloc and free Jim Rees
2011-06-12 23:44 ` [PATCH 16/34] pnfsblock: merge extents Jim Rees
2011-06-12 23:44 ` [PATCH 17/34] pnfsblock: call and parse getdevicelist Jim Rees
2011-06-14 15:36   ` Benny Halevy
2011-06-12 23:44 ` [PATCH 18/34] pnfsblock: allow use of PG_owner_priv_1 flag Jim Rees
2011-06-13 15:56   ` Fred Isaman
2011-06-12 23:44 ` [PATCH 19/34] pnfsblock: xdr decode pnfs_block_layout4 Jim Rees
2011-06-12 23:44 ` [PATCH 20/34] pnfsblock: find_get_extent Jim Rees
2011-06-12 23:44 ` [PATCH 21/34] pnfsblock: SPLITME: add extent manipulation functions Jim Rees
2011-06-14 15:40   ` Benny Halevy
2011-06-12 23:44 ` [PATCH 22/34] pnfsblock: merge rw extents Jim Rees
2011-06-12 23:44 ` [PATCH 23/34] pnfsblock: encode_layoutcommit Jim Rees
2011-06-14 15:44   ` Benny Halevy
2011-06-12 23:44 ` [PATCH 24/34] pnfsblock: cleanup_layoutcommit Jim Rees
2011-06-12 23:44 ` [PATCH 25/34] pnfsblock: bl_read_pagelist Jim Rees
2011-06-12 23:44 ` [PATCH 26/34] pnfsblock: write_begin Jim Rees
2011-06-12 23:44 ` [PATCH 27/34] pnfsblock: write_end Jim Rees
2011-06-12 23:44 ` [PATCH 28/34] pnfsblock: write_end_cleanup Jim Rees
2011-06-12 23:45 ` [PATCH 29/34] pnfsblock: bl_write_pagelist support functions Jim Rees
2011-06-12 23:45 ` [PATCH 30/34] pnfsblock: bl_write_pagelist Jim Rees
2011-06-12 23:45 ` [PATCH 31/34] pnfsblock: note written INVAL areas for layoutcommit Jim Rees
2011-06-12 23:45 ` [PATCH 32/34] pnfsblock: Implement release_inval_marks Jim Rees
2011-06-12 23:45 ` [PATCH 33/34] Add configurable prefetch size for layoutget Jim Rees
2011-06-12 23:45 ` [PATCH 34/34] NFS41: do not update isize if inode needs layoutcommit Jim Rees
2011-06-14 16:15   ` Benny Halevy
2011-06-14 16:22     ` Fred Isaman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DF7702B.5090609@gmail.com \
    --to=bhalevy.lists@gmail.com \
    --cc=honey@citi.umich.edu \
    --cc=iisaman@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=rees@umich.edu \
    --cc=tao.peng@emc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).