linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFS: Adapt readdirplus to application usage patterns
@ 2012-05-01 22:06 Trond Myklebust
  2012-05-01 22:39 ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2012-05-01 22:06 UTC (permalink / raw)
  To: linux-nfs; +Cc: Neil Brown

While the use of READDIRPLUS is significantly more efficient than
READDIR followed by many LOOKUP calls, it is still less efficient
than just READDIR if the attributes are not required.

This patch tracks when lookups are attempted on the directory,
and uses that information to selectively disable READDIRPLUS
on that directory.
The first 'readdir' call is always served using READDIRPLUS.
Subsequent calls only use READDIRPLUS if there was a successful
lookup or revalidation on a child in the mean time.

Credit for the original idea should go to Neil Brown. See:
      http://www.spinics.net/lists/linux-nfs/msg19996.html
However, the implementation in this patch differs from Neil's
in that it focuses on tracking lookups rather than calls to
stat().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Neil Brown <neilb@suse.de>
---
 fs/nfs/dir.c           |   35 +++++++++++++++++++++++++++++++++--
 fs/nfs/inode.c         |    2 --
 include/linux/nfs_fs.h |    5 -----
 3 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 82b42e2..03a0220 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -475,6 +475,29 @@ different:
 }
 
 static
+bool nfs_use_readdirplus(struct inode *dir, struct file *filp)
+{
+	if (!nfs_server_capable(dir, NFS_CAP_READDIRPLUS))
+		return false;
+	if (test_and_clear_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(dir)->flags))
+		return true;
+	if (filp->f_pos == 0)
+		return true;
+	return false;
+}
+
+/*
+ * This function is called by the lookup code to request the use of
+ * readdirplus to accelerate any future lookups in the same
+ * directory.
+ */
+static
+void nfs_advise_use_readdirplus(struct inode *dir)
+{
+	set_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(dir)->flags);
+}
+
+static
 void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
 {
 	struct qstr filename = {
@@ -874,7 +897,9 @@ static int nfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
 	desc->file = filp;
 	desc->dir_cookie = &dir_ctx->dir_cookie;
 	desc->decode = NFS_PROTO(inode)->decode_dirent;
-	desc->plus = NFS_USE_READDIRPLUS(inode);
+	desc->plus = 0;
+	if (nfs_use_readdirplus(inode, filp))
+		desc->plus = 1;
 
 	nfs_block_sillyrename(dentry);
 	res = nfs_revalidate_mapping(inode, filp->f_mapping);
@@ -1114,7 +1139,7 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (!inode) {
 		if (nfs_neg_need_reval(dir, dentry, nd))
 			goto out_bad;
-		goto out_valid;
+		goto out_valid_noent;
 	}
 
 	if (is_bad_inode(inode)) {
@@ -1156,6 +1181,9 @@ static int nfs_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 out_set_verifier:
 	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
  out_valid:
+	/* Success: notify readdir to use READDIRPLUS */
+	nfs_advise_use_readdirplus(dir);
+ out_valid_noent:
 	dput(parent);
 	dfprintk(LOOKUPCACHE, "NFS: %s(%s/%s) is valid\n",
 			__func__, dentry->d_parent->d_name.name,
@@ -1311,6 +1339,9 @@ static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, stru
 	if (IS_ERR(res))
 		goto out_unblock_sillyrename;
 
+	/* Success: notify readdir to use READDIRPLUS */
+	nfs_advise_use_readdirplus(dir);
+
 no_entry:
 	res = d_materialise_unique(dentry, inode);
 	if (res != NULL) {
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 0d53113..9f17cd1 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -298,8 +298,6 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr)
 			inode->i_op = NFS_SB(sb)->nfs_client->rpc_ops->dir_inode_ops;
 			inode->i_fop = &nfs_dir_operations;
 			inode->i_data.a_ops = &nfs_dir_aops;
-			if (nfs_server_capable(inode, NFS_CAP_READDIRPLUS))
-				set_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(inode)->flags);
 			/* Deal with crossing mountpoints */
 			if (fattr->valid & NFS_ATTR_FATTR_MOUNTPOINT ||
 					fattr->valid & NFS_ATTR_FATTR_V4_REFERRAL) {
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 8a88c16..6cc7dba 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -314,11 +314,6 @@ static inline int nfs_server_capable(struct inode *inode, int cap)
 	return NFS_SERVER(inode)->caps & cap;
 }
 
-static inline int NFS_USE_READDIRPLUS(struct inode *inode)
-{
-	return test_bit(NFS_INO_ADVISE_RDPLUS, &NFS_I(inode)->flags);
-}
-
 static inline void nfs_set_verifier(struct dentry * dentry, unsigned long verf)
 {
 	dentry->d_time = verf;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFS: Adapt readdirplus to application usage patterns
  2012-05-01 22:06 [PATCH] NFS: Adapt readdirplus to application usage patterns Trond Myklebust
@ 2012-05-01 22:39 ` NeilBrown
  2012-05-01 23:04   ` Myklebust, Trond
  0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2012-05-01 22:39 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 2172 bytes --]

On Tue,  1 May 2012 18:06:23 -0400 Trond Myklebust
<Trond.Myklebust@netapp.com> wrote:

> While the use of READDIRPLUS is significantly more efficient than
> READDIR followed by many LOOKUP calls, it is still less efficient
> than just READDIR if the attributes are not required.
> 
> This patch tracks when lookups are attempted on the directory,
> and uses that information to selectively disable READDIRPLUS
> on that directory.
> The first 'readdir' call is always served using READDIRPLUS.
> Subsequent calls only use READDIRPLUS if there was a successful
> lookup or revalidation on a child in the mean time.
> 
> Credit for the original idea should go to Neil Brown. See:
>       http://www.spinics.net/lists/linux-nfs/msg19996.html
> However, the implementation in this patch differs from Neil's
> in that it focuses on tracking lookups rather than calls to
> stat().
> 
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> Cc: Neil Brown <neilb@suse.de>

Thanks!  Looks good.
Tracking lookups rather than getattr does make sense - more general.

> @@ -874,7 +897,9 @@ static int nfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
>  	desc->file = filp;
>  	desc->dir_cookie = &dir_ctx->dir_cookie;
>  	desc->decode = NFS_PROTO(inode)->decode_dirent;
> -	desc->plus = NFS_USE_READDIRPLUS(inode);
> +	desc->plus = 0;
> +	if (nfs_use_readdirplus(inode, filp))
> +		desc->plus = 1;
>  

This looks a bit odd to me.

        desc->plus = nfs_use_readdirplus(inode, filp):
??
Or if you think there is a type conflict:
        desc->plus = nfs_use_readdirplus(inode, file) ? 1 : 0;

But maybe you prefer the original - your choice of course.

Also, the current code will clear NFS_INO_ADVISE_RDPLUS if it gets -ETOOSMALL,
and it will stay cleared until the inode falls out of cache and is reloaded.
The new code will set it again a lot sooner.  Is that likely to be a problem
(I don't remember what conditions lead to ETOOSMALL)?
In fact I think it will ignore  the ETOOSMALL and always retry with
readdirplus for the first read in a directory.  That doesn't seem good ?

Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFS: Adapt readdirplus to application usage patterns
  2012-05-01 22:39 ` NeilBrown
@ 2012-05-01 23:04   ` Myklebust, Trond
  2012-05-01 23:22     ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Myklebust, Trond @ 2012-05-01 23:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-nfs@vger.kernel.org

T24gV2VkLCAyMDEyLTA1LTAyIGF0IDA4OjM5ICsxMDAwLCBOZWlsQnJvd24gd3JvdGU6DQo+IE9u
IFR1ZSwgIDEgTWF5IDIwMTIgMTg6MDY6MjMgLTA0MDAgVHJvbmQgTXlrbGVidXN0DQo+IDxUcm9u
ZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbT4gd3JvdGU6DQo+IA0KPiA+IFdoaWxlIHRoZSB1c2Ugb2Yg
UkVBRERJUlBMVVMgaXMgc2lnbmlmaWNhbnRseSBtb3JlIGVmZmljaWVudCB0aGFuDQo+ID4gUkVB
RERJUiBmb2xsb3dlZCBieSBtYW55IExPT0tVUCBjYWxscywgaXQgaXMgc3RpbGwgbGVzcyBlZmZp
Y2llbnQNCj4gPiB0aGFuIGp1c3QgUkVBRERJUiBpZiB0aGUgYXR0cmlidXRlcyBhcmUgbm90IHJl
cXVpcmVkLg0KPiA+IA0KPiA+IFRoaXMgcGF0Y2ggdHJhY2tzIHdoZW4gbG9va3VwcyBhcmUgYXR0
ZW1wdGVkIG9uIHRoZSBkaXJlY3RvcnksDQo+ID4gYW5kIHVzZXMgdGhhdCBpbmZvcm1hdGlvbiB0
byBzZWxlY3RpdmVseSBkaXNhYmxlIFJFQURESVJQTFVTDQo+ID4gb24gdGhhdCBkaXJlY3Rvcnku
DQo+ID4gVGhlIGZpcnN0ICdyZWFkZGlyJyBjYWxsIGlzIGFsd2F5cyBzZXJ2ZWQgdXNpbmcgUkVB
RERJUlBMVVMuDQo+ID4gU3Vic2VxdWVudCBjYWxscyBvbmx5IHVzZSBSRUFERElSUExVUyBpZiB0
aGVyZSB3YXMgYSBzdWNjZXNzZnVsDQo+ID4gbG9va3VwIG9yIHJldmFsaWRhdGlvbiBvbiBhIGNo
aWxkIGluIHRoZSBtZWFuIHRpbWUuDQo+ID4gDQo+ID4gQ3JlZGl0IGZvciB0aGUgb3JpZ2luYWwg
aWRlYSBzaG91bGQgZ28gdG8gTmVpbCBCcm93bi4gU2VlOg0KPiA+ICAgICAgIGh0dHA6Ly93d3cu
c3Bpbmljcy5uZXQvbGlzdHMvbGludXgtbmZzL21zZzE5OTk2Lmh0bWwNCj4gPiBIb3dldmVyLCB0
aGUgaW1wbGVtZW50YXRpb24gaW4gdGhpcyBwYXRjaCBkaWZmZXJzIGZyb20gTmVpbCdzDQo+ID4g
aW4gdGhhdCBpdCBmb2N1c2VzIG9uIHRyYWNraW5nIGxvb2t1cHMgcmF0aGVyIHRoYW4gY2FsbHMg
dG8NCj4gPiBzdGF0KCkuDQo+ID4gDQo+ID4gU2lnbmVkLW9mZi1ieTogVHJvbmQgTXlrbGVidXN0
IDxUcm9uZC5NeWtsZWJ1c3RAbmV0YXBwLmNvbT4NCj4gPiBDYzogTmVpbCBCcm93biA8bmVpbGJA
c3VzZS5kZT4NCj4gDQo+IFRoYW5rcyEgIExvb2tzIGdvb2QuDQo+IFRyYWNraW5nIGxvb2t1cHMg
cmF0aGVyIHRoYW4gZ2V0YXR0ciBkb2VzIG1ha2Ugc2Vuc2UgLSBtb3JlIGdlbmVyYWwuDQoNClJp
Z2h0LiBJdCBpcyBtb3JlIGluIGxpbmUgd2l0aCB0aGUgb3JpZ2luYWwgaW50ZW50aW9uIG9mIFJF
QURESVJQTFVTLA0Kd2hpY2ggaXMgdG8gc2VydmUgYXMgYSBHQU5HX0xPT0tVUCByYXRoZXIgdGhh
biBhICdscyAtbCcgc3Vic3RpdHV0ZS4uLg0KDQo+ID4gQEAgLTg3NCw3ICs4OTcsOSBAQCBzdGF0
aWMgaW50IG5mc19yZWFkZGlyKHN0cnVjdCBmaWxlICpmaWxwLCB2b2lkICpkaXJlbnQsIGZpbGxk
aXJfdCBmaWxsZGlyKQ0KPiA+ICAJZGVzYy0+ZmlsZSA9IGZpbHA7DQo+ID4gIAlkZXNjLT5kaXJf
Y29va2llID0gJmRpcl9jdHgtPmRpcl9jb29raWU7DQo+ID4gIAlkZXNjLT5kZWNvZGUgPSBORlNf
UFJPVE8oaW5vZGUpLT5kZWNvZGVfZGlyZW50Ow0KPiA+IC0JZGVzYy0+cGx1cyA9IE5GU19VU0Vf
UkVBRERJUlBMVVMoaW5vZGUpOw0KPiA+ICsJZGVzYy0+cGx1cyA9IDA7DQo+ID4gKwlpZiAobmZz
X3VzZV9yZWFkZGlycGx1cyhpbm9kZSwgZmlscCkpDQo+ID4gKwkJZGVzYy0+cGx1cyA9IDE7DQo+
ID4gIA0KPiANCj4gVGhpcyBsb29rcyBhIGJpdCBvZGQgdG8gbWUuDQo+IA0KPiAgICAgICAgIGRl
c2MtPnBsdXMgPSBuZnNfdXNlX3JlYWRkaXJwbHVzKGlub2RlLCBmaWxwKToNCj4gPz8NCj4gT3Ig
aWYgeW91IHRoaW5rIHRoZXJlIGlzIGEgdHlwZSBjb25mbGljdDoNCj4gICAgICAgICBkZXNjLT5w
bHVzID0gbmZzX3VzZV9yZWFkZGlycGx1cyhpbm9kZSwgZmlsZSkgPyAxIDogMDsNCg0KZGVzYy0+
cGx1cyBpcyBhIGJpdCBmaWVsZCwgc28gdGhlIGNvbXBpbGVyIHdpbGwgYmUgdHJhbnNmb3JtaW5n
IHRoZQ0KYWJvdmUgYW55d2F5LiBJIHRoZXJlZm9yZSBmaWd1cmUgd2UganVzdCB3YW50IHdoYXRl
dmVyIGlzIGVhc2llc3Qgb24gdGhlDQpleWUuDQoNCj4gQnV0IG1heWJlIHlvdSBwcmVmZXIgdGhl
IG9yaWdpbmFsIC0geW91ciBjaG9pY2Ugb2YgY291cnNlLg0KPiANCj4gQWxzbywgdGhlIGN1cnJl
bnQgY29kZSB3aWxsIGNsZWFyIE5GU19JTk9fQURWSVNFX1JEUExVUyBpZiBpdCBnZXRzIC1FVE9P
U01BTEwsDQo+IGFuZCBpdCB3aWxsIHN0YXkgY2xlYXJlZCB1bnRpbCB0aGUgaW5vZGUgZmFsbHMg
b3V0IG9mIGNhY2hlIGFuZCBpcyByZWxvYWRlZC4NCj4gVGhlIG5ldyBjb2RlIHdpbGwgc2V0IGl0
IGFnYWluIGEgbG90IHNvb25lci4gIElzIHRoYXQgbGlrZWx5IHRvIGJlIGEgcHJvYmxlbQ0KPiAo
SSBkb24ndCByZW1lbWJlciB3aGF0IGNvbmRpdGlvbnMgbGVhZCB0byBFVE9PU01BTEwpPw0KPiBJ
biBmYWN0IEkgdGhpbmsgaXQgd2lsbCBpZ25vcmUgIHRoZSBFVE9PU01BTEwgYW5kIGFsd2F5cyBy
ZXRyeSB3aXRoDQo+IHJlYWRkaXJwbHVzIGZvciB0aGUgZmlyc3QgcmVhZCBpbiBhIGRpcmVjdG9y
eS4gIFRoYXQgZG9lc24ndCBzZWVtIGdvb2QgPw0KDQpFVE9PU01BTEwgbWVhbnMgdGhhdCB0aGVy
ZSBpc24ndCBlbm91Z2ggYnVmZmVyIHNwYWNlIHRvIGZpbGwgYSBmdWxsDQpyZWNvcmQuIEdpdmVu
IHRoYXQgd2Ugbm93IHByb3ZpZGUgPiAzMmsgYnVmZmVyIHNwYWNlLCB0aGF0IHNlZW1zIGxpa2Ug
aXQNCndvdWxkIGJlIGV4dHJlbWVseSBkaWZmaWN1bHQgdG8gcHJvZHVjZS4gRXZlbiBhIExpbnV4
IHNlcnZlciwgd2l0aCBpdHMNCjRrIHRvdGFsIGxpbWl0IHNob3VsZCBiZSBhYmxlIHRvIHByb3Zp
ZGUgYXQgbGVhc3QgMSByZWFkZGlycGx1cyByZWNvcmQuDQoNCklmIHdlIGRvIGZpbmQgYSBkaXJl
Y3RvcnkgdGhhdCBrZWVwcyB0cmlnZ2VyaW5nIHRoZSBFVE9PU01BTEwsIHRoZW4gSQ0Kc3VwcG9z
ZSB3ZSBjYW4gYWRkIGEgTkZTX0lOT19GT1JCSURfUkRQTFVTIGZsYWcgdG8gdHVybiByZWFkZGly
cGx1cyBvZmYNCnBlcm1hbmVudGx5IGZvciB0aGF0IGRpcmVjdG9yeS4gSG93ZXZlciB1bnRpbCBJ
IGZpbmQgdGhhdCBwYXJ0aWN1bGFyDQpiZWFzdCwgdGhlbiBJJ2QgYmUgaW5jbGluZWQgdG8gd2Fp
dC4NCg0KLS0gDQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lcg0K
DQpOZXRBcHANClRyb25kLk15a2xlYnVzdEBuZXRhcHAuY29tDQp3d3cubmV0YXBwLmNvbQ0KDQo=

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFS: Adapt readdirplus to application usage patterns
  2012-05-01 23:04   ` Myklebust, Trond
@ 2012-05-01 23:22     ` NeilBrown
  0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2012-05-01 23:22 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1303 bytes --]

On Tue, 1 May 2012 23:04:51 +0000 "Myklebust, Trond"
<Trond.Myklebust@netapp.com> wrote:

> > Also, the current code will clear NFS_INO_ADVISE_RDPLUS if it gets -ETOOSMALL,
> > and it will stay cleared until the inode falls out of cache and is reloaded.
> > The new code will set it again a lot sooner.  Is that likely to be a problem
> > (I don't remember what conditions lead to ETOOSMALL)?
> > In fact I think it will ignore  the ETOOSMALL and always retry with
> > readdirplus for the first read in a directory.  That doesn't seem good ?
> 
> ETOOSMALL means that there isn't enough buffer space to fill a full
> record. Given that we now provide > 32k buffer space, that seems like it
> would be extremely difficult to produce. Even a Linux server, with its
> 4k total limit should be able to provide at least 1 readdirplus record.
> 
> If we do find a directory that keeps triggering the ETOOSMALL, then I
> suppose we can add a NFS_INO_FORBID_RDPLUS flag to turn readdirplus off
> permanently for that directory. However until I find that particular
> beast, then I'd be inclined to wait.
> 

I can't argue with that.  And looking at the code again it does disable
readdirplus for at least the next request by setting "desc->plus = 0".
So: all good.

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-05-01 23:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-01 22:06 [PATCH] NFS: Adapt readdirplus to application usage patterns Trond Myklebust
2012-05-01 22:39 ` NeilBrown
2012-05-01 23:04   ` Myklebust, Trond
2012-05-01 23:22     ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).