* null dereference in nfs4_begin_drain_session
From: J. Bruce Fields @ 2009-12-15 22:24 UTC
To: Trond Myklebust; +Cc: linux-nfs
I got this just now on a test client running a branch including (among
other stuff) your 72211dbe727f7c1451aa5adfcbd1197b090eb276. Looks like
it was trying to run cthon tests over v4.0. Anything known? Let me
know if you want anything more.
--b.
BUG: unable to handle kernel NULL pointer dereference at 00000088
IP: [<c105b2bb>] __lock_acquire+0x21b/0x1880
*pde = 00000000
Oops: 0000 [#1] PREEMPT
last sysfs file: /sys/kernel/uevent_seqnum
Modules linked in:
Pid: 3137, comm: 192.168.122.129 Not tainted 2.6.32-07559-g0adf9c1 #553 /
EIP: 0060:[<c105b2bb>] EFLAGS: 00010046 CPU: 0
EIP is at __lock_acquire+0x21b/0x1880
EAX: 00000084 EBX: c6be42a0 ECX: 00000000 EDX: 00000001
ESI: 00000002 EDI: 00000000 EBP: c6bfff10 ESP: c6bffe9c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process 192.168.122.129 (pid: 3137, ti=c6bfe000 task=c6be42a0 task.ti=c6bfe000)
Stack:
00000046 00000046 00000000 00000000 00000000 00000000 00000000 c1b8a3e0
<0> c6bfff0c c1ead970 c6be4760 00000041 c20d6b48 00000084 00000000 00000000
<0> 00000000 00000000 0000094d 00000000 c1b8a3e0 c6bfff40 c10277f8 c6bfff00
Call Trace:
[<c10277f8>] ? update_curr+0x208/0x270
[<c105c99e>] ? lock_acquire+0x7e/0x110
[<c120a628>] ? nfs4_begin_drain_session+0x28/0x80
[<c188f5d2>] ? _raw_spin_lock+0x42/0x50
[<c120a628>] ? nfs4_begin_drain_session+0x28/0x80
[<c120a628>] ? nfs4_begin_drain_session+0x28/0x80
[<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430
[<c120b3e7>] ? nfs4_run_state_manager+0x207/0x430
[<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430
[<c104ae54>] ? kthread+0x74/0x80
[<c104ade0>] ? kthread+0x0/0x80
[<c100333b>] ? kernel_thread_helper+0x7/0x10
Code: fe ff ff e8 e8 ff 47 00 85 c0 74 dc 83 3d 20 0d 2a c2 00 75 d3 b8 75 6f a7 c1 ba b6 0a 00 00 e8 6c 2b fd ff 31 c0 eb c2 8b 45 c0 <8b> 40 04 85 c0 89 45 c8 0f 84 2b fe ff ff a1 c0 76 c7 c1 85 c0
EIP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 SS:ESP 0068:c6bffe9c
CR2: 0000000000000088
---[ end trace 93dafd3a9c985071 ]---
note: 192.168.122.129[3137] exited with preempt_count 1
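
A note on reading the oops above: in the Code: line, the byte in angle brackets (<8b>) is the first byte of the faulting instruction. The bytes 8b 40 04 are a 32-bit x86 load from EAX plus an 8-bit displacement, so with EAX: 00000084 the access lands on 0x88, which matches CR2. The sketch below only checks that arithmetic; the struct and function names are made up for illustration and it handles just this one encoding, not general x86 decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of "mov r32, [r32 + disp8]" (opcode 0x8b, ModRM mod == 01).
 * Hypothetical helper type, not a kernel or library structure. */
struct mov_disp8 {
	int valid;      /* 1 if the bytes matched this simple encoding */
	unsigned reg;   /* destination register number (0 = EAX) */
	unsigned base;  /* base register number (0 = EAX) */
	int8_t disp;    /* signed 8-bit displacement */
};

static struct mov_disp8 decode_mov_disp8(const uint8_t *code)
{
	struct mov_disp8 d = { 0, 0, 0, 0 };
	uint8_t modrm;

	assert(code != NULL);
	modrm = code[1];

	/* Only accept opcode 0x8b, mod == 01, and no SIB byte (rm != 4). */
	if (code[0] != 0x8b || (modrm >> 6) != 1 || (modrm & 7) == 4)
		return d;

	d.valid = 1;
	d.reg = (modrm >> 3) & 7;
	d.base = modrm & 7;
	d.disp = (int8_t)code[2];
	return d;
}

/* Effective address of the load: base register value plus displacement. */
static uint32_t fault_address(uint32_t base_value, int8_t disp)
{
	return base_value + (uint32_t)(int32_t)disp;
}
```

Feeding it the three bytes { 0x8b, 0x40, 0x04 } and the EAX value from the register dump gives the address to compare against CR2. A fault address this small usually means a structure member was reached through a NULL base pointer.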
* Re: null dereference in nfs4_begin_drain_session
From: Trond Myklebust @ 2009-12-15 22:32 UTC
To: J. Bruce Fields; +Cc: Trond Myklebust, linux-nfs
On Tue, 2009-12-15 at 17:24 -0500, J. Bruce Fields wrote:
> I got this just now on a test client running a branch including (among
> other stuff) your 72211dbe727f7c1451aa5adfcbd1197b090eb276. Looks like
> it was trying to run cthon tests over v4.0. Anything known? Let me
> know if you want anything more.
>
> --b.
Argh! This is exactly why I wanted those nfs4_begin_drain_session()
calls out of the state manager main loop. They _only_ make sense for the
NFSv4.1 code path, damnit!
Trond
* Re: null dereference in nfs4_begin_drain_session
From: Trond Myklebust @ 2009-12-15 22:52 UTC
To: J. Bruce Fields; +Cc: linux-nfs
On Tue, 2009-12-15 at 17:24 -0500, J. Bruce Fields wrote:
> I got this just now on a test client running a branch including (among
> other stuff) your 72211dbe727f7c1451aa5adfcbd1197b090eb276. Looks like
> it was trying to run cthon tests over v4.0. Anything known? Let me
> know if you want anything more.
>
> --b.
The following patch should suffice to fix this...
Cheers
Trond
-----------------------------------------------------------------------------------------------------
commit 380454126f1357db9270f9d1ca05dfe1a6e4ad47
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Tue Dec 15 17:36:57 2009 -0500

    NFSv4: Fix a regression in the NFSv4 state manager

    Commit 5601a00d671fe89f9b087513244abcd08ad67e7d (nfs: run state manager
    in privileged mode) introduces a regression in the NFSv4 code when
    compiled with CONFIG_NFS_V4_1. The calls to nfs4_begin_drain_session()
    from the main loop in nfs4_state_manager() Oops due to the lack of an
    NFSv4.1 session when running NFSv4.0.

    The fix is to move those two calls back into nfs41_init_clientid() and
    nfs4_reset_session().

    The calls to nfs4_end_drain_session() that remain inside
    nfs4_state_manager() are safe, since the NFSv4.0 code will never set the
    NFS4CLNT_SESSION_DRAINING bit.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 18e8b26..6d263ed 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -176,6 +176,7 @@ int nfs41_init_clientid(struct nfs_client *clp, struct rpc_cred *cred)
 {
 	int status;
 
+	nfs4_begin_drain_session(clp);
 	status = nfs4_proc_exchange_id(clp, cred);
 	if (status != 0)
 		goto out;
@@ -1274,6 +1275,7 @@ static int nfs4_reset_session(struct nfs_client *clp)
 {
 	int status;
 
+	nfs4_begin_drain_session(clp);
 	status = nfs4_proc_destroy_session(clp->cl_session);
 	if (status && status != -NFS4ERR_BADSESSION &&
 	    status != -NFS4ERR_DEADSESSION) {
@@ -1299,7 +1301,6 @@ out:
 
 #else /* CONFIG_NFS_V4_1 */
 static int nfs4_reset_session(struct nfs_client *clp) { return 0; }
-static int nfs4_begin_drain_session(struct nfs_client *clp) { return 0; }
 static int nfs4_end_drain_session(struct nfs_client *clp) { return 0; }
 #endif /* CONFIG_NFS_V4_1 */
 
@@ -1332,7 +1333,6 @@ static void nfs4_state_manager(struct nfs_client *clp)
 	for(;;) {
 		if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) {
 			/* We're going to have to re-establish a clientid */
-			nfs4_begin_drain_session(clp);
 			status = nfs4_reclaim_lease(clp);
 			if (status) {
 				nfs4_set_lease_expired(clp, status);
@@ -1359,7 +1359,6 @@ static void nfs4_state_manager(struct nfs_client *clp)
 		/* Initialize or reset the session */
 		if (test_and_clear_bit(NFS4CLNT_SESSION_RESET, &clp->cl_state)
 		   && nfs4_has_session(clp)) {
-			nfs4_begin_drain_session(clp);
 			status = nfs4_reset_session(clp);
 			if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state))
 				continue;
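
For readers who want the shape of the fix without the kernel context, here is a minimal userspace sketch of the idea in the patch: a helper that dereferences the session is only called from code that runs when a session exists, instead of from the shared main loop. All of the names below (client_model, session_model, the model_* functions) are hypothetical stand-ins and not the kernel's definitions; the real nfs4_begin_drain_session() does considerably more than set a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct session_model {
	int draining;                   /* loosely models the "session draining" state */
};

struct client_model {
	struct session_model *session;  /* NULL for an NFSv4.0 client */
};

/* Session-only helper: it dereferences the session unconditionally,
 * so it must only be reached on the NFSv4.1 path. */
static void model_begin_drain(struct client_model *clp)
{
	assert(clp->session != NULL);   /* precondition the v4.0 path violated */
	clp->session->draining = 1;
}

/* v4.1-only recovery step: a safe home for the drain call. */
static int model_reset_session(struct client_model *clp)
{
	model_begin_drain(clp);
	return 0;
}

/* Shared main-loop step: runs for both minor versions, so it enters
 * the session path only when a session is present. */
static int model_state_manager_step(struct client_model *clp)
{
	if (clp->session != NULL)
		return model_reset_session(clp);
	return 0;                       /* NFSv4.0: nothing to drain */
}
```

With a client whose session pointer is NULL, model_state_manager_step() returns 0 without touching the drain helper; with a session attached, the helper runs and marks it as draining. This mirrors how the patch leaves the drain call behind the nfs4_has_session() check and inside the NFSv4.1-only nfs41_init_clientid().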
* Re: null dereference in nfs4_begin_drain_session
From: J. Bruce Fields @ 2009-12-17 22:54 UTC
To: Trond Myklebust; +Cc: linux-nfs
On Tue, Dec 15, 2009 at 05:52:20PM -0500, Trond Myklebust wrote:
> On Tue, 2009-12-15 at 17:24 -0500, J. Bruce Fields wrote:
> The following patch should suffice to fix this...
Yup, not seeing this any more, thanks!
(Sorry for the slow confirmation--my test hosts stopped booting. Argh!
Looks like it was something temporarily broken upstream...).
--b.