From: Boaz Harrosh <bharrosh@panasas.com>
To: Jim Rees <rees@umich.edu>
Cc: Peng Tao <bergwolf@gmail.com>,
"Isaman, Fred" <Fred.Isaman@netapp.com>,
Andy Adamson <andros@netapp.com>,
"Myklebust, Trond" <Trond.Myklebust@netapp.com>,
Benny Halevy <bhalevy@tonian.com>, <linux-nfs@vger.kernel.org>,
Peng Tao <peng_tao@emc.com>
Subject: Re: [PATCH] pnfsblock: init pg_bsize properly
Date: Thu, 25 Aug 2011 17:16:21 -0700
Message-ID: <4E56E5D5.2070701@panasas.com>
In-Reply-To: <20110825201523.GA6901@merit.edu>
On 08/25/2011 01:15 PM, Jim Rees wrote:
>
> We discussed this on the call today. Boaz is going to write a brief
> description of how to fix this in the generic layer, then I'm going to
> implement it.
For blocks and objects we need something like [1] below.
(Done only for reads.)
But I suspect I have now broken files-LD. What files-LD actually
needs (as I understood from Trond) is the same code as today,
plus a patch similar to Peng's, but for files-LD, that sets
pg_bsize to the minimum of w/rsize and stripe_unit.
This is mainly because it needs nfs_readpage_result/release_partial,
which waits for all RPCs before it actually calls nfs_end_page_writeback
(PageUptodate for reads) on a page that was shared between multiple
requests.
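
To make the "wait for all RPCs" point concrete, here is a minimal user-space sketch of the idea. It is only a model, not the kernel code: struct model_page, model_page_split() and model_subreq_done() are made-up names, and a plain counter stands in for whatever the real partial ops use. One page is split into several sub-page requests, and only the last completion may mark the page up to date.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page shared by several sub-page requests. */
struct model_page {
	int  pending;   /* sub-page RPCs still in flight */
	bool error;     /* set if any sub-request failed */
	bool uptodate;  /* models the page's Uptodate state */
};

/* Split one page into ceil(page_size / bsize) sub-requests. */
static void model_page_split(struct model_page *pg, size_t page_size,
			     size_t bsize)
{
	assert(bsize > 0 && bsize < page_size);
	pg->pending = (int)((page_size + bsize - 1) / bsize);
	pg->error = false;
	pg->uptodate = false;
}

/*
 * Called once per completed sub-request. Returns true only for the
 * last completion, which is the only one allowed to finish the page.
 */
static bool model_subreq_done(struct model_page *pg, bool failed)
{
	assert(pg->pending > 0);
	if (failed)
		pg->error = true;
	if (--pg->pending != 0)
		return false;
	pg->uptodate = !pg->error;
	return true;
}
```

With a 4096-byte page and a 1024-byte pg_bsize the model issues four sub-requests, and the page only becomes up to date after the fourth one completes without error.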
So I guess we need to go with option [2] below (done only for writes),
+ with added code to set this flag bit in objects and blocks
(just like PNFS_LAYOUTRET_ON_SETATTR),
+ files-LD code to override pnfs_generic_pg_init_read/write and
set pg_bsize to min(pg_bsize, stripe_unit) (can be its own patch),
+ an empty pnfs_ld_ignore_rwsize() defined for !CONFIG_NFS_V4_1.
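
A rough sketch of the last two items, in plain user-space C so it can be compiled on its own. Everything here is hypothetical: struct model_pgio_desc carries only the one field under discussion, model_files_pg_init() is an assumed shape for the files-LD override (not existing code), and MODEL_NFS_V4_1 is a made-up macro standing in for CONFIG_NFS_V4_1. It also assumes stripe_unit is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the one descriptor field discussed here. */
struct model_pgio_desc {
	size_t pg_bsize;
};

/* Clamp pg_bsize to the stripe unit: pg_bsize = min(pg_bsize, stripe_unit). */
static void model_files_pg_init(struct model_pgio_desc *desc,
				size_t stripe_unit)
{
	assert(stripe_unit > 0);	/* assumption, see lead-in */
	if (stripe_unit < desc->pg_bsize)
		desc->pg_bsize = stripe_unit;
}

/* MODEL_NFS_V4_1 is a made-up macro standing in for CONFIG_NFS_V4_1. */
#ifdef MODEL_NFS_V4_1
bool model_ld_ignore_rwsize(unsigned int ld_flags);
#else
/* Empty variant: without pNFS no layout driver can ask to ignore w/rsize. */
static inline bool model_ld_ignore_rwsize(unsigned int ld_flags)
{
	(void)ld_flags;
	return false;
}
#endif
```

The clamp only ever lowers pg_bsize, so a w/rsize that is already smaller than the stripe unit is left alone.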
---------------------------------------------------------------
[1] (only reads)
Do not use nfs_pagein_multi() for the pNFS case ...
-----
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index ab12913..a4d0191 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -296,7 +296,7 @@ extern int nfs_access_cache_shrinker(struct shrinker *shrink,
extern int nfs_initiate_read(struct nfs_read_data *data, struct rpc_clnt *clnt,
const struct rpc_call_ops *call_ops);
extern void nfs_read_prepare(struct rpc_task *task, void *calldata);
-extern int nfs_generic_pagein(struct nfs_pageio_descriptor *desc,
+extern int nfs_pagein_one(struct nfs_pageio_descriptor *desc,
struct list_head *head);
extern void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index e550e88..b7e3e41 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1356,7 +1356,7 @@ struct pnfs_layout_segment *
LIST_HEAD(head);
int ret;
- ret = nfs_generic_pagein(desc, &head);
+ ret = nfs_pagein_one(desc, &head);
if (ret != 0) {
put_lseg(desc->pg_lseg);
desc->pg_lseg = NULL;
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index 2171c04..ce5982a 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -336,7 +336,7 @@ static int nfs_pagein_multi(struct nfs_pageio_descriptor *desc, struct list_head
return -ENOMEM;
}
-static int nfs_pagein_one(struct nfs_pageio_descriptor *desc, struct list_head *res)
+int nfs_pagein_one(struct nfs_pageio_descriptor *desc, struct list_head *res)
{
struct nfs_page *req;
struct page **pages;
@@ -369,19 +369,15 @@ static int nfs_pagein_one(struct nfs_pageio_descriptor *desc, struct list_head *
return ret;
}
-int nfs_generic_pagein(struct nfs_pageio_descriptor *desc, struct list_head *head)
-{
- if (desc->pg_bsize < PAGE_CACHE_SIZE)
- return nfs_pagein_multi(desc, head);
- return nfs_pagein_one(desc, head);
-}
-
static int nfs_generic_pg_readpages(struct nfs_pageio_descriptor *desc)
{
LIST_HEAD(head);
int ret;
- ret = nfs_generic_pagein(desc, &head);
+ if (desc->pg_bsize < PAGE_CACHE_SIZE)
+ ret = nfs_pagein_multi(desc, &head);
+ else
+ ret = nfs_pagein_one(desc, &head);
if (ret == 0)
ret = nfs_do_multiple_reads(&head, desc->pg_rpc_callops);
return ret;
---------------------------------------------------------------
[2] (only writes)
Do not use nfs_flush_multi() for layout drivers that
must not use it (objects and blocks) ...
-------
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 01cbfd5..d32538a 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -68,6 +68,8 @@ enum {
enum layoutdriver_policy_flags {
/* Should the pNFS client commit and return the layout upon a setattr */
PNFS_LAYOUTRET_ON_SETATTR = 1 << 0,
+ /* Do not use nfs_xxx_partial_ops */
+ PNFS_IGNOR_RWSIZE = 1 << 1,
};
struct nfs4_deviceid_node;
@@ -315,6 +317,15 @@ static inline void pnfs_clear_request_commit(struct nfs_page *req)
PNFS_LAYOUTRET_ON_SETATTR;
}
+static inline bool
+pnfs_ld_ignore_rwsize(struct inode *inode)
+{
+ if (!pnfs_enabled_sb(NFS_SERVER(inode)))
+ return false;
+ return NFS_SERVER(inode)->pnfs_curr_ld->flags &
+ PNFS_IGNOR_RWSIZE;
+}
+
static inline int pnfs_return_layout(struct inode *ino)
{
struct nfs_inode *nfsi = NFS_I(ino);
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index b39b37f..6b25073 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1029,7 +1029,8 @@ static int nfs_flush_one(struct nfs_pageio_descriptor *desc, struct list_head *r
int nfs_generic_flush(struct nfs_pageio_descriptor *desc, struct list_head *head)
{
- if (desc->pg_bsize < PAGE_CACHE_SIZE)
+ if (!pnfs_ld_ignore_rwsize(desc->pg_inode) &&
+ desc->pg_bsize < PAGE_CACHE_SIZE)
return nfs_flush_multi(desc, head);
return nfs_flush_one(desc, head);
}
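
The decision that [2] adds to nfs_generic_flush() can be modelled in user space as below. This is a sketch under stated assumptions, not the patch itself: all model_* and MODEL_* names are invented, the page size is assumed to be 4 KiB, and the pnfs_enabled argument stands in for the pnfs_enabled_sb() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical mirror of the layout driver policy flags. */
enum model_ld_flags {
	MODEL_LAYOUTRET_ON_SETATTR = 1 << 0,
	MODEL_IGNORE_RWSIZE        = 1 << 1,	/* second flag bit */
};

enum model_flush_path { MODEL_FLUSH_ONE, MODEL_FLUSH_MULTI };

/*
 * Pick the flush path: the multi (sub-page) path is used only when the
 * layout driver has not opted out and pg_bsize is smaller than a page.
 */
static enum model_flush_path
model_generic_flush(bool pnfs_enabled, unsigned int ld_flags, size_t pg_bsize)
{
	bool ignore = pnfs_enabled && (ld_flags & MODEL_IGNORE_RWSIZE) != 0;

	if (!ignore && pg_bsize < MODEL_PAGE_SIZE)
		return MODEL_FLUSH_MULTI;
	return MODEL_FLUSH_ONE;
}
```

In this model a driver that sets the flag always takes the one-request path, while a driver without it (files-LD in the discussion above) still falls back to the multi path when pg_bsize is below the page size.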