* [PATCH 1/3] NFS: Fix READDIR oops with NFSv4 on RDMA
2014-01-17 19:37 [PATCH 0/3] NFS/RDMA bug fixes Chuck Lever
@ 2014-01-17 19:38 ` Chuck Lever
2014-01-17 19:38 ` [PATCH 2/3] SUNRPC: Fix large reads on NFS/RDMA Chuck Lever
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2014-01-17 19:38 UTC (permalink / raw)
To: linux-nfs
When starting the Connectathon basic tests on an NFSv4 RDMA
mount, I encountered this oops:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8129cc56>] memcpy+0x6/0x110
PGD 2106cd067 PUD 20fef9067 PMD 0
Oops: 0000 [#1] SMP
...
[<ffffffffa05dc1b1>] ? xdr_inline_decode+0xb1/0x120 [sunrpc]
[<ffffffffa071f19c>] nfs4_decode_dirent+0x4c/0x250 [nfsv4]
[<ffffffff81178a02>] ? alloc_pages_current+0xb2/0x170
[<ffffffffa06a1225>] nfs_readdir_page_filler+0xe5/0x2c0 [nfs]
[<ffffffffa06a1622>] nfs_readdir_xdr_to_array+0x222/0x2e0 [nfs]
[<ffffffffa06a1702>] nfs_readdir_filler+0x22/0x90 [nfs]
[<ffffffff8112f975>] ? add_to_page_cache_lru+0x35/0x50
[<ffffffff8112faee>] __read_cache_page+0x7e/0xe0
[<ffffffffa06a16e0>] ? nfs_readdir_xdr_to_array+0x2e0/0x2e0 [nfs]
[<ffffffffa06a16e0>] ? nfs_readdir_xdr_to_array+0x2e0/0x2e0 [nfs]
[<ffffffff8113079c>] do_read_cache_page+0x3c/0x110
[<ffffffff811308b9>] read_cache_page_async+0x19/0x20
[<ffffffff811308ce>] read_cache_page+0xe/0x20
[<ffffffffa06a1c1e>] nfs_readdir+0x14e/0x3d0 [nfs]
[<ffffffffa071f150>] ? decode_pathconf+0x1c0/0x1c0 [nfsv4]
[<ffffffff811a811d>] iterate_dir+0xad/0xd0
[<ffffffff811a71ca>] ? do_fcntl+0x28a/0x370
[<ffffffff811a82d5>] SyS_getdents+0x95/0x100
[<ffffffff811a83e0>] ? SyS_old_readdir+0xa0/0xa0
[<ffffffff815a7752>] system_call_fastpath+0x16/0x1b
The problem does not occur with NFSv3 over RDMA.
nfs4_decode_dirent() is confused because the xdr_buf's page
vector starts long after the first directory entry in the
server's reply.
Commit aa9c2669, "NFS: Client implementation of Labeled-NFS,"
is reported by git bisect as the first bad commit.
This commit changes the decode_readdir_maxsz macro. This macro
controls where the generic XDR routines split incoming readdir
reply data between the head[0] buffer and the page cache.
Security labels go with each directory entry, thus they are
always stored in the page cache, not in the head buffer. The
length of the reply that goes in head[0] should not change.
I've reverted the change to decode_readdir_maxsz.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=68371
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org> # 3.11+
---
fs/nfs/nfs4xdr.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 5be2868..79e1d02 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -203,8 +203,7 @@ static int nfs4_stat_to_errno(int);
2 + encode_verifier_maxsz + 5 + \
nfs4_label_maxsz)
#define decode_readdir_maxsz (op_decode_hdr_maxsz + \
- decode_verifier_maxsz + \
- nfs4_label_maxsz + nfs4_fattr_maxsz)
+ decode_verifier_maxsz)
#define encode_readlink_maxsz (op_encode_hdr_maxsz)
#define decode_readlink_maxsz (op_decode_hdr_maxsz + 1)
#define encode_write_maxsz (op_encode_hdr_maxsz + \
^ permalink raw reply related [flat|nested] 8+ messages in thread* [PATCH 2/3] SUNRPC: Fix large reads on NFS/RDMA
2014-01-17 19:37 [PATCH 0/3] NFS/RDMA bug fixes Chuck Lever
2014-01-17 19:38 ` [PATCH 1/3] NFS: Fix READDIR oops with NFSv4 on RDMA Chuck Lever
@ 2014-01-17 19:38 ` Chuck Lever
2014-01-21 20:17 ` Jeff Layton
2014-01-17 19:38 ` [PATCH 3/3] SUNRPC: remove KERN_INFO from dprintk() call sites Chuck Lever
[not found] ` <CABgxfbFbLKV98GavS0x_X_ZXw0hZXPEQCpfoJJi+uNp133qt2w@mail.gmail.com>
3 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2014-01-17 19:38 UTC (permalink / raw)
To: linux-nfs
After commit a11a2bf4, "SUNRPC: Optimise away unnecessary data
moves in xdr_align_pages", Thu Aug 2 13:21:43 2012, READs larger
than a few hundred bytes via NFS/RDMA stopped working. This
commit exposed a long-standing bug in rpcrdma_inline_fixup().
I reproduce this with an rsize=4096 mount using the cthon04
basic tests. Test 5 fails with an EIO error.
For my reproducer, kernel log shows:
NFS: server cheating in read reply: count 4096 > recvd 0
rpcrdma_inline_fixup() is zeroing the xdr_stream::page_len
field, and xdr_align_pages() is now returning that value to
the READ XDR decoder function.
That field is set up by xdr_inline_pages() by the READ XDR
encoder function. As far as I can tell, it is supposed to
be left alone after that, as it describes the dimensions of
the reply xdr_stream, not the contents of that stream.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=68391
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org> # 3.7+
---
net/sunrpc/xprtrdma/rpc_rdma.c | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index e03725b..96ead52 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -649,9 +649,7 @@ rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad)
break;
page_base = 0;
}
- rqst->rq_rcv_buf.page_len = olen - copy_len;
- } else
- rqst->rq_rcv_buf.page_len = 0;
+ }
if (copy_len && rqst->rq_rcv_buf.tail[0].iov_len) {
curlen = copy_len;
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 2/3] SUNRPC: Fix large reads on NFS/RDMA
2014-01-17 19:38 ` [PATCH 2/3] SUNRPC: Fix large reads on NFS/RDMA Chuck Lever
@ 2014-01-21 20:17 ` Jeff Layton
0 siblings, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2014-01-21 20:17 UTC (permalink / raw)
To: Chuck Lever; +Cc: linux-nfs
On Fri, 17 Jan 2014 14:38:16 -0500
Chuck Lever <chuck.lever@oracle.com> wrote:
> After commit a11a2bf4, "SUNRPC: Optimise away unnecessary data
> moves in xdr_align_pages", Thu Aug 2 13:21:43 2012, READs larger
> than a few hundred bytes via NFS/RDMA stopped working. This
> commit exposed a long-standing bug in rpcrdma_inline_fixup().
>
> I reproduce this with an rsize=4096 mount using the cthon04
> basic tests. Test 5 fails with an EIO error.
>
> For my reproducer, kernel log shows:
>
> NFS: server cheating in read reply: count 4096 > recvd 0
>
> rpcrdma_inline_fixup() is zeroing the xdr_stream::page_len
> field, and xdr_align_pages() is now returning that value to
> the READ XDR decoder function.
>
> That field is set up by xdr_inline_pages() by the READ XDR
> encoder function. As far as I can tell, it is supposed to
> be left alone after that, as it describes the dimensions of
> the reply xdr_stream, not the contents of that stream.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=68391
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> Cc: <stable@vger.kernel.org> # 3.7+
> ---
>
> net/sunrpc/xprtrdma/rpc_rdma.c | 4 +---
> 1 files changed, 1 insertions(+), 3 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
> index e03725b..96ead52 100644
> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> @@ -649,9 +649,7 @@ rpcrdma_inline_fixup(struct rpc_rqst *rqst, char *srcp, int copy_len, int pad)
> break;
> page_base = 0;
> }
> - rqst->rq_rcv_buf.page_len = olen - copy_len;
> - } else
> - rqst->rq_rcv_buf.page_len = 0;
> + }
>
> if (copy_len && rqst->rq_rcv_buf.tail[0].iov_len) {
> curlen = copy_len;
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Seems to fix the bug for me too...
Tested-by: Jeff Layton <jlayton@redhat.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 3/3] SUNRPC: remove KERN_INFO from dprintk() call sites
2014-01-17 19:37 [PATCH 0/3] NFS/RDMA bug fixes Chuck Lever
2014-01-17 19:38 ` [PATCH 1/3] NFS: Fix READDIR oops with NFSv4 on RDMA Chuck Lever
2014-01-17 19:38 ` [PATCH 2/3] SUNRPC: Fix large reads on NFS/RDMA Chuck Lever
@ 2014-01-17 19:38 ` Chuck Lever
2014-01-21 20:18 ` Jeff Layton
[not found] ` <CABgxfbFbLKV98GavS0x_X_ZXw0hZXPEQCpfoJJi+uNp133qt2w@mail.gmail.com>
3 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2014-01-17 19:38 UTC (permalink / raw)
To: linux-nfs
The use of KERN_INFO causes garbage characters to appear
when debugging is enabled.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtrdma/transport.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 285dc08..1eb9c46 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -733,7 +733,7 @@ static void __exit xprt_rdma_cleanup(void)
{
int rc;
- dprintk(KERN_INFO "RPCRDMA Module Removed, deregister RPC RDMA transport\n");
+ dprintk("RPCRDMA Module Removed, deregister RPC RDMA transport\n");
#ifdef RPC_DEBUG
if (sunrpc_table_header) {
unregister_sysctl_table(sunrpc_table_header);
@@ -755,14 +755,14 @@ static int __init xprt_rdma_init(void)
if (rc)
return rc;
- dprintk(KERN_INFO "RPCRDMA Module Init, register RPC RDMA transport\n");
+ dprintk("RPCRDMA Module Init, register RPC RDMA transport\n");
- dprintk(KERN_INFO "Defaults:\n");
- dprintk(KERN_INFO "\tSlots %d\n"
+ dprintk("Defaults:\n");
+ dprintk("\tSlots %d\n"
"\tMaxInlineRead %d\n\tMaxInlineWrite %d\n",
xprt_rdma_slot_table_entries,
xprt_rdma_max_inline_read, xprt_rdma_max_inline_write);
- dprintk(KERN_INFO "\tPadding %d\n\tMemreg %d\n",
+ dprintk("\tPadding %d\n\tMemreg %d\n",
xprt_rdma_inline_write_padding, xprt_rdma_memreg_strategy);
#ifdef RPC_DEBUG
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH 3/3] SUNRPC: remove KERN_INFO from dprintk() call sites
2014-01-17 19:38 ` [PATCH 3/3] SUNRPC: remove KERN_INFO from dprintk() call sites Chuck Lever
@ 2014-01-21 20:18 ` Jeff Layton
0 siblings, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2014-01-21 20:18 UTC (permalink / raw)
To: Chuck Lever; +Cc: linux-nfs
On Fri, 17 Jan 2014 14:38:25 -0500
Chuck Lever <chuck.lever@oracle.com> wrote:
> The use of KERN_INFO causes garbage characters to appear
> when debugging is enabled.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>
> net/sunrpc/xprtrdma/transport.c | 10 +++++-----
> 1 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
> index 285dc08..1eb9c46 100644
> --- a/net/sunrpc/xprtrdma/transport.c
> +++ b/net/sunrpc/xprtrdma/transport.c
> @@ -733,7 +733,7 @@ static void __exit xprt_rdma_cleanup(void)
> {
> int rc;
>
> - dprintk(KERN_INFO "RPCRDMA Module Removed, deregister RPC RDMA transport\n");
> + dprintk("RPCRDMA Module Removed, deregister RPC RDMA transport\n");
> #ifdef RPC_DEBUG
> if (sunrpc_table_header) {
> unregister_sysctl_table(sunrpc_table_header);
> @@ -755,14 +755,14 @@ static int __init xprt_rdma_init(void)
> if (rc)
> return rc;
>
> - dprintk(KERN_INFO "RPCRDMA Module Init, register RPC RDMA transport\n");
> + dprintk("RPCRDMA Module Init, register RPC RDMA transport\n");
>
> - dprintk(KERN_INFO "Defaults:\n");
> - dprintk(KERN_INFO "\tSlots %d\n"
> + dprintk("Defaults:\n");
> + dprintk("\tSlots %d\n"
> "\tMaxInlineRead %d\n\tMaxInlineWrite %d\n",
> xprt_rdma_slot_table_entries,
> xprt_rdma_max_inline_read, xprt_rdma_max_inline_write);
> - dprintk(KERN_INFO "\tPadding %d\n\tMemreg %d\n",
> + dprintk("\tPadding %d\n\tMemreg %d\n",
> xprt_rdma_inline_write_padding, xprt_rdma_memreg_strategy);
>
> #ifdef RPC_DEBUG
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Nice cleanup
Reviewed-by: Jeff Layton <jlayton@redhat.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <CABgxfbFbLKV98GavS0x_X_ZXw0hZXPEQCpfoJJi+uNp133qt2w@mail.gmail.com>]
* Re: [PATCH 0/3] NFS/RDMA bug fixes
[not found] ` <CABgxfbFbLKV98GavS0x_X_ZXw0hZXPEQCpfoJJi+uNp133qt2w@mail.gmail.com>
@ 2014-01-24 18:34 ` Chuck Lever
[not found] ` <CABgxfbEcKbeFMS=yobqp9TeHn4aa-Kjvi_tfbLK2xRbmFtmk9A@mail.gmail.com>
0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2014-01-24 18:34 UTC (permalink / raw)
To: Wendy Cheng; +Cc: Linux NFS Mailing List
On Jan 24, 2014, at 1:26 PM, Wendy Cheng <s.wendy.cheng@gmail.com> wrote:
> On Fri, Jan 17, 2014 at 11:37 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> This brief series fixes some long-standing regressions in our
> NFS/RDMA client implementation. Comments / review?
>
> ---
>
> Chuck Lever (3):
> SUNRPC: remove KERN_INFO from dprintk() call sites
> SUNRPC: Fix large reads on NFS/RDMA
> NFS: Fix READDIR oops with NFSv4 on RDMA
>
>
> fs/nfs/nfs4xdr.c | 3 +--
> net/sunrpc/xprtrdma/rpc_rdma.c | 4 +---
> net/sunrpc/xprtrdma/transport.c | 10 +++++-----
> 3 files changed, 7 insertions(+), 10 deletions(-)
>
>
> We have an embedded system that is currently set at NFS V3 with kernel 2.6.38 (as NFS client). Could folks suggest a good kernel version (close to 2.6.38 or RHEL based) to pick up NFS v4 (in order to pick up NFS over RDMA) ?
If NFS/RDMA is working in your kernel, I believe NFSv3 and NFSv4 should both work over RDMA.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 8+ messages in thread