linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peng Tao <bergwolf@gmail.com>
To: Benny Halevy <bhalevy@tonian.com>, Boaz Harrosh <bharrosh@panasas.com>
Cc: linux-nfs@vger.kernel.org, Peng Tao <peng_tao@emc.com>
Subject: Re: [PATCH] pnfsblock: init pg_bsize properly
Date: Wed, 17 Aug 2011 17:35:20 +0800	[thread overview]
Message-ID: <CA+a=Yy6BSmEWurZuFN-8axTS0gfeXaguw5tcayeVjJmcFAztOQ@mail.gmail.com> (raw)
In-Reply-To: <4E4B6A81.2010204@tonian.com>

Hi, Benny and Boaz,

On Wed, Aug 17, 2011 at 3:15 PM, Benny Halevy <bhalevy@tonian.com> wrote:
>
> On 2011-08-17 00:05, Boaz Harrosh wrote:
>> On 08/12/2011 06:04 PM, Peng Tao wrote:
>>> pg_bsize is server->wsize/rsize by default. We would want to use the lseg length.
>>>
>>
>> Hi
>>
>> What is the problem you are trying to solve with this patch?
>>
>> From what I understand the only place that actually cares about
>> pg_bsize is nfs_generic_pg_test() which is only used in MDS
>> read/write. In the pNFS RW, the LD and pnfs has it's own .pg_test()
>> check that should not concern with pg_bsize (Unless for pnfs-files
>> which does). So the idea is that pg_bsize is the maximum set by
>> MDS server in regard to IO through MDS. And it should not be changed
>> by client.
>>
>> If it is not what you see then we should fix it. But should never
>> override MDS wsize/rsize.
In pnfs_do_multiple_reads/pnfs_do_multiple_writes, data->mds_ops will
be set as desc->pg_rpc_callops, which is determined in
nfs_generic_flush/nfs_generic_pagein according to desc->pg_bsize. For
blocklayout, we wouldn't want to set data->mds_ops to
partial_read/write ops, so I write the patch to use lseg length as
pg_bsize.

LD can override pg_bsize in pg_init because
nfs_pageio_reset_read_mds/nfs_pageio_reset_write_mds will reset it to
server rsize/wsize if pnfs is not tried.

Sorry that I didn't explain it clearly in the commit log...


>
> I second that.
>
> Benny
>
>>
>>> Signed-off-by: Peng Tao <peng_tao@emc.com>
>>> ---
>>>  fs/nfs/blocklayout/blocklayout.c |   20 ++++++++++++++++++--
>>>  1 files changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
>>> index 36648e1..9143e61 100644
>>> --- a/fs/nfs/blocklayout/blocklayout.c
>>> +++ b/fs/nfs/blocklayout/blocklayout.c
>>> @@ -919,14 +919,30 @@ bl_clear_layoutdriver(struct nfs_server *server)
>>>      return 0;
>>>  }
>>>
>>> +static void bl_pg_init_read(struct nfs_pageio_descriptor *pgio,
>>> +                        struct nfs_page *req)
>>> +{
>>> +    pnfs_generic_pg_init_read(pgio, req);
>>> +    if (pgio->pg_lseg)
>>> +            pgio->pg_bsize = pgio->pg_lseg->pls_range.length;
>>> +}
>>> +
>>> +static void bl_pg_init_write(struct nfs_pageio_descriptor *pgio,
>>> +                         struct nfs_page *req)
>>> +{
>>> +    pnfs_generic_pg_init_write(pgio, req);
>>> +    if (pgio->pg_lseg)
>>> +            pgio->pg_bsize = pgio->pg_lseg->pls_range.length;
>>> +}
>>> +
>>>  static const struct nfs_pageio_ops bl_pg_read_ops = {
>>> -    .pg_init = pnfs_generic_pg_init_read,
>>> +    .pg_init = bl_pg_init_read,
>>>      .pg_test = pnfs_generic_pg_test,
>>
>> I see here that you do not override .pg_test. This is your problem
>> look at objio_osd::objio_pg_test() it checks for similar boundaries
>> at the objects side. This is where you need to do these checks
>> for blocks as well.
For blocklayout, we don't need to force each IO under a certain size.
Currently (w/ and w/o this patch) the lseg coverage is the only
constraint for pagelist length. So pnfs_generic_pg_test is enough for
blocklayout.

Thanks,
Tao

>>
>>>      .pg_doio = pnfs_generic_pg_readpages,
>>>  };
>>>
>>>  static const struct nfs_pageio_ops bl_pg_write_ops = {
>>> -    .pg_init = pnfs_generic_pg_init_write,
>>> +    .pg_init = bl_pg_init_write,
>>>      .pg_test = pnfs_generic_pg_test,
>>
>> Same here
>>
>>>      .pg_doio = pnfs_generic_pg_writepages,
>>>  };
>>
>> Thanks
>> Boaz
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

  reply	other threads:[~2011-08-17  9:35 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-13  1:04 [PATCH] pnfsblock: init pg_bsize properly Peng Tao
2011-08-16 21:05 ` Boaz Harrosh
2011-08-17  7:15   ` Benny Halevy
2011-08-17  9:35     ` Peng Tao [this message]
2011-08-17 16:27       ` Benny Halevy
2011-08-18 14:34         ` Peng Tao
2011-08-22 23:52       ` Boaz Harrosh
2011-08-23  0:00         ` Myklebust, Trond
2011-08-23 15:01           ` Peng Tao
2011-08-23 21:19             ` Boaz Harrosh
2011-08-25 20:15               ` Jim Rees
2011-08-26  0:16                 ` Boaz Harrosh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+a=Yy6BSmEWurZuFN-8axTS0gfeXaguw5tcayeVjJmcFAztOQ@mail.gmail.com' \
    --to=bergwolf@gmail.com \
    --cc=bhalevy@tonian.com \
    --cc=bharrosh@panasas.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=peng_tao@emc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).