From: Benny Halevy <bhalevy@panasas.com>
To: Fred Isaman <iisaman@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
andros@netapp.com, linux-nfs@vger.kernel.org,
Andy Adamon <andros@citi.umich.edu>,
Dean Hildebrand <dhildeb@us.ibm.com>,
Boaz Harrosh <bharrosh@panasas.com>,
Oleg Drokin <green@linuxhacker.ru>, Tao Guo <guotao@nrchpc.ac.cn>
Subject: Re: [PATCH 09/16] pnfs: wave 3: shift pnfs_update_layout locations
Date: Tue, 15 Feb 2011 22:11:40 -0500
Message-ID: <4D5B406C.4080801@panasas.com>
In-Reply-To: <AANLkTinxqbmWvWXeaPnw7oNtL_zfCYOTB=KjZFJQaKFn@mail.gmail.com>
On 2011-02-15 09:41, Fred Isaman wrote:
> On Mon, Feb 14, 2011 at 6:14 PM, Trond Myklebust
> <Trond.Myklebust@netapp.com> wrote:
>> On Mon, 2011-02-14 at 14:18 -0500, andros@netapp.com wrote:
>>> From: Fred Isaman <iisaman@netapp.com>
>>>
>>> Move the pnfs_update_layout call location to nfs_pageio_do_add_request().
>>> Grab the lseg sent in the doio function to nfs_read_rpcsetup and attach
>>> it to each nfs_read_data so it can be sent to the layout driver.
>>>
>>> Signed-off-by: Andy Adamon <andros@netapp.com>
>>> Signed-off-by: Andy Adamon <andros@citi.umich.edu>
>>> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
>>> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
>>> Signed-off-by: Fred Isaman <iisaman@netapp.com>
>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
>>> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
>>> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
>>> Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
>>> ---
>>> fs/nfs/file.c | 4 ----
>>> fs/nfs/pagelist.c | 15 ++++++++++++---
>>> fs/nfs/pnfs.c | 4 ++--
>>> fs/nfs/pnfs.h | 1 +
>>> fs/nfs/read.c | 28 ++++++++++++++++------------
>>> fs/nfs/write.c | 4 ++--
>>> include/linux/nfs_page.h | 5 +++--
>>> include/linux/nfs_xdr.h | 1 +
>>> 8 files changed, 37 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/fs/nfs/file.c b/fs/nfs/file.c
>>> index 7bf029e..d85a534 100644
>>> --- a/fs/nfs/file.c
>>> +++ b/fs/nfs/file.c
>>> @@ -387,10 +387,6 @@ static int nfs_write_begin(struct file *file, struct address_space *mapping,
>>> file->f_path.dentry->d_name.name,
>>> mapping->host->i_ino, len, (long long) pos);
>>>
>>> - pnfs_update_layout(mapping->host,
>>> - nfs_file_open_context(file),
>>> - IOMODE_RW);
>>> -
>>> start:
>>> /*
>>> * Prevent starvation issues if someone is doing a consistency
>>> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
>>> index e1164e3..e0a0cb4 100644
>>> --- a/fs/nfs/pagelist.c
>>> +++ b/fs/nfs/pagelist.c
>>> @@ -20,6 +20,7 @@
>>> #include <linux/nfs_mount.h>
>>>
>>> #include "internal.h"
>>> +#include "pnfs.h"
>>>
>>> static struct kmem_cache *nfs_page_cachep;
>>>
>>> @@ -213,7 +214,7 @@ nfs_wait_on_request(struct nfs_page *req)
>>> */
>>> void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
>>> struct inode *inode,
>>> - int (*doio)(struct inode *, struct list_head *, unsigned int, size_t, int),
>>> + int (*doio)(struct inode *, struct list_head *, unsigned int, size_t, int, struct pnfs_layout_segment *),
>>> size_t bsize,
>>> int io_flags)
>>> {
>>> @@ -226,6 +227,7 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
>>> desc->pg_doio = doio;
>>> desc->pg_ioflags = io_flags;
>>> desc->pg_error = 0;
>>> + desc->pg_lseg = NULL;
>>> }
>>>
>>> /**
>>> @@ -288,8 +290,13 @@ static int nfs_pageio_do_add_request(struct nfs_pageio_descriptor *desc,
>>> prev = nfs_list_entry(desc->pg_list.prev);
>>> if (!nfs_can_coalesce_requests(prev, req))
>>> return 0;
>>> - } else
>>> + } else {
>>> + put_lseg(desc->pg_lseg);
>>> desc->pg_base = req->wb_pgbase;
>>> + desc->pg_lseg = pnfs_update_layout(desc->pg_inode,
>>> + req->wb_context,
>>> + IOMODE_READ);
>>
>> Looking at this afresh after a week of vacation. Isn't it more natural
>> to do this as part of the pg_doio() callback?
>>
>> Your only reason for introducing the ->pg_lseg pointer is to be able to
>> pass it to the ->pg_doio() in the first place. Why not do that by simply
>> passing the 'desc' pointer to ->pg_doio(), and then having it call
>> pnfs_update_layout() instead of 'get_layout()'?
>>
>
> The problem is that it is not the only reason. Passing the lseg into
> nfs_can_coalesce_requests is another. Calling pnfs_update_layout
> in ->pg_doio would eliminate the opportunity to have a say in
> coalescing based on the layout.
>
>
As long as you correctly deal with short I/Os in the doio path (like we did
many moons ago) you should be fine if the layout you got does not cover
the whole coalesced range.
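
To illustrate the short I/O handling mentioned above, here is a minimal
userspace sketch, not kernel code. The toy_* names and the byte-range
segment struct are assumptions made up for this example and do not come
from the patch. Each I/O is clamped to the segment covering its start,
and the caller loops over whatever is left; once no segment covers the
current position, the remainder goes to a fallback path (think: plain
I/O through the MDS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a layout segment: just a byte range. */
struct toy_lseg {
	uint64_t offset;
	uint64_t length;
};

/* Hypothetical summary of how a coalesced range was issued. */
struct toy_result {
	unsigned int nr_ios;     /* I/Os issued, including the fallback one */
	uint64_t layout_bytes;   /* bytes sent through a layout segment */
	uint64_t fallback_bytes; /* bytes sent through the fallback path */
};

/* Return the segment covering pos, or NULL if none does. */
static const struct toy_lseg *
toy_find_lseg(const struct toy_lseg *segs, size_t nsegs, uint64_t pos)
{
	for (size_t i = 0; i < nsegs; i++) {
		if (pos >= segs[i].offset &&
		    pos - segs[i].offset < segs[i].length)
			return &segs[i];
	}
	return NULL;
}

/* How many of count bytes starting at pos this segment can serve. */
static uint64_t
toy_clamp_to_lseg(const struct toy_lseg *lseg, uint64_t pos, uint64_t count)
{
	uint64_t avail;

	if (lseg == NULL)
		return 0;
	avail = lseg->length - (pos - lseg->offset);
	return count < avail ? count : avail;
}

/*
 * Issue [pos, pos + count) as a series of short I/Os, each limited to
 * one segment. If no segment covers the current position, the rest of
 * the range is handed to the fallback path in a single I/O.
 */
static struct toy_result
toy_issue_range(const struct toy_lseg *segs, size_t nsegs,
		uint64_t pos, uint64_t count)
{
	struct toy_result res = { 0, 0, 0 };

	while (count > 0) {
		const struct toy_lseg *lseg = toy_find_lseg(segs, nsegs, pos);
		uint64_t done = toy_clamp_to_lseg(lseg, pos, count);

		res.nr_ios++;
		if (done == 0) {
			res.fallback_bytes = count;
			break;
		}
		res.layout_bytes += done;
		pos += done;
		count -= done;
	}
	return res;
}
```

With this shape the coalescing code does not have to know where the
layout ends: a coalesced range that is larger than the segment simply
turns into more than one I/O.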
>>> + }
>>> nfs_list_remove_request(req);
>>> nfs_list_add_request(req, &desc->pg_list);
>>> desc->pg_count = newlen;
>>> @@ -307,7 +314,8 @@ static void nfs_pageio_doio(struct nfs_pageio_descriptor *desc)
>>> nfs_page_array_len(desc->pg_base,
>>> desc->pg_count),
>>> desc->pg_count,
>>> - desc->pg_ioflags);
>>> + desc->pg_ioflags,
>>> + desc->pg_lseg);
>>> if (error < 0)
>>> desc->pg_error = error;
>>> else
>>> @@ -345,6 +353,7 @@ int nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
>>> void nfs_pageio_complete(struct nfs_pageio_descriptor *desc)
>>> {
>>> nfs_pageio_doio(desc);
>>> + put_lseg(desc->pg_lseg);
>>> }
>>>
>>> /**
>>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>>> index f0a9578..dcd4356 100644
>>> --- a/fs/nfs/pnfs.c
>>> +++ b/fs/nfs/pnfs.c
>>> @@ -264,7 +264,7 @@ put_lseg_locked(struct pnfs_layout_segment *lseg,
>>> return 0;
>>> }
>>>
>>> -static void
>>> +void
>>> put_lseg(struct pnfs_layout_segment *lseg)
>>> {
>>> struct inode *ino;
>>> @@ -285,6 +285,7 @@ put_lseg(struct pnfs_layout_segment *lseg)
>>> pnfs_free_lseg_list(&free_me);
>>> }
>>> }
>>> +EXPORT_SYMBOL_GPL(put_lseg);
>>
>> Why is this needed here?
>>
>
> That looks like an artifact left over from older code. It is not needed.
>
>>
>>> static bool
>>> should_free_lseg(u32 lseg_iomode, u32 recall_iomode)
>>> @@ -797,7 +798,6 @@ pnfs_update_layout(struct inode *ino,
>>> out:
>>> dprintk("%s end, state 0x%lx lseg %p\n", __func__,
>>> nfsi->layout ? nfsi->layout->plh_flags : -1, lseg);
>>> - put_lseg(lseg); /* STUB - callers currently ignore return value */
>>> return lseg;
>>> out_unlock:
>>> spin_unlock(&ino->i_lock);
>>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>>> index 9a994bc..121d6a3 100644
>>> --- a/fs/nfs/pnfs.h
>>> +++ b/fs/nfs/pnfs.h
>>> @@ -146,6 +146,7 @@ extern int nfs4_proc_layoutget(struct nfs4_layoutget *lgp);
>>>
>>> /* pnfs.c */
>>> void get_layout_hdr(struct pnfs_layout_hdr *lo);
>>> +void put_lseg(struct pnfs_layout_segment *lseg);
>>> struct pnfs_layout_segment *
>>> pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>>> enum pnfs_iomode access_type);
>>> diff --git a/fs/nfs/read.c b/fs/nfs/read.c
>>> index aedcaa7..c453164 100644
>>> --- a/fs/nfs/read.c
>>> +++ b/fs/nfs/read.c
>>> @@ -20,17 +20,17 @@
>>> #include <linux/nfs_page.h>
>>>
>>> #include <asm/system.h>
>>> +#include "pnfs.h"
>>>
>>> #include "nfs4_fs.h"
>>> #include "internal.h"
>>> #include "iostat.h"
>>> #include "fscache.h"
>>> -#include "pnfs.h"
>>>
>>> #define NFSDBG_FACILITY NFSDBG_PAGECACHE
>>>
>>> -static int nfs_pagein_multi(struct inode *, struct list_head *, unsigned int, size_t, int);
>>> -static int nfs_pagein_one(struct inode *, struct list_head *, unsigned int, size_t, int);
>>> +static int nfs_pagein_multi(struct inode *, struct list_head *, unsigned int, size_t, int, struct pnfs_layout_segment *);
>>> +static int nfs_pagein_one(struct inode *, struct list_head *, unsigned int, size_t, int, struct pnfs_layout_segment *);
>>> static const struct rpc_call_ops nfs_read_partial_ops;
>>> static const struct rpc_call_ops nfs_read_full_ops;
>>>
>>> @@ -70,6 +70,7 @@ void nfs_readdata_free(struct nfs_read_data *p)
>>> static void nfs_readdata_release(struct nfs_read_data *rdata)
>>> {
>>> put_nfs_open_context(rdata->args.context);
>>> + put_lseg(rdata->lseg);
>>
>> Shouldn't you be calling put_lseg() _before_ put_nfs_open_context()? You
>> are not guaranteed that the inode still exists after that call.
>>
Good catch. If we need the layout to outlive the open context then
we should get a reference on the inode using iget and iput the inode
in put_layout_hdr_locked.
Benny
>
> Yes.
>
> Fred
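
The ordering problem discussed above for nfs_readdata_release() can be
modelled in userspace as follows. This is only a sketch under stated
assumptions: every demo_* name is hypothetical, the "freed" flag stands
in for the inode memory actually going away, and the real put_lseg()
and put_nfs_open_context() do considerably more than this. It shows why
the lseg has to be put while something still pins the inode, and how an
extra inode reference held on behalf of the layout makes the order
irrelevant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode model: a refcount plus a flag marking it freed. */
struct demo_inode {
	int refcount;
	bool freed;
	int nr_lsegs;	/* segments still attached to this inode */
};

/* Hypothetical open context: holds one reference on the inode. */
struct demo_ctx {
	struct demo_inode *inode;
};

/* Hypothetical segment: points at the inode without pinning it. */
struct demo_lseg {
	struct demo_inode *inode;
};

/* Take an extra reference on an inode that is known to be alive. */
static void demo_inode_get(struct demo_inode *ino)
{
	assert(!ino->freed);
	ino->refcount++;
}

/* Drop a reference; the last put "frees" the inode. */
static void demo_inode_put(struct demo_inode *ino)
{
	assert(!ino->freed);
	if (--ino->refcount == 0)
		ino->freed = true;
}

/* Dropping the open context drops its inode reference. */
static void demo_put_ctx(struct demo_ctx *ctx)
{
	demo_inode_put(ctx->inode);
}

/*
 * Putting a segment has to touch the inode. Returns false where real
 * code would have dereferenced a freed inode.
 */
static bool demo_put_lseg(struct demo_lseg *lseg)
{
	if (lseg == NULL)
		return true;
	if (lseg->inode->freed)
		return false;
	lseg->inode->nr_lsegs--;
	return true;
}

/* Safe order: put the segment while the context still pins the inode. */
static bool demo_release(struct demo_ctx *ctx, struct demo_lseg *lseg)
{
	bool ok = demo_put_lseg(lseg);

	demo_put_ctx(ctx);
	return ok;
}

/* Order used in the patch: only safe if something else pins the inode. */
static bool demo_release_ctx_first(struct demo_ctx *ctx,
				   struct demo_lseg *lseg)
{
	demo_put_ctx(ctx);
	return demo_put_lseg(lseg);
}
```

In this model demo_release() is always fine, while
demo_release_ctx_first() only works when the layout side has taken its
own reference with demo_inode_get() and drops it later, which is the
role suggested above for put_layout_hdr_locked.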