* null dereference in nfs4_begin_drain_session @ 2009-12-15 22:24 J. Bruce Fields 2009-12-15 22:32 ` Trond Myklebust 2009-12-15 22:52 ` Trond Myklebust 0 siblings, 2 replies; 4+ messages in thread From: J. Bruce Fields @ 2009-12-15 22:24 UTC (permalink / raw) To: Trond Myklebust; +Cc: linux-nfs I got this just now on a test client running a branch including (among other stuff) your 72211dbe727f7c1451aa5adfcbd1197b090eb276. Looks like it was trying to run cthon tests over v4.0. Anything known? Let me know if you want anything more. --b. BUG: unable to handle kernel NULL pointer dereference at 00000088 IP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 *pde = 00000000 Oops: 0000 [#1] PREEMPT last sysfs file: /sys/kernel/uevent_seqnum Modules linked in: Pid: 3137, comm: 192.168.122.129 Not tainted 2.6.32-07559-g0adf9c1 #553 / EIP: 0060:[<c105b2bb>] EFLAGS: 00010046 CPU: 0 EIP is at __lock_acquire+0x21b/0x1880 EAX: 00000084 EBX: c6be42a0 ECX: 00000000 EDX: 00000001 ESI: 00000002 EDI: 00000000 EBP: c6bfff10 ESP: c6bffe9c DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process 192.168.122.129 (pid: 3137, ti=c6bfe000 task=c6be42a0 task.ti=c6bfe000) Stack: 00000046 00000046 00000000 00000000 00000000 00000000 00000000 c1b8a3e0 <0> c6bfff0c c1ead970 c6be4760 00000041 c20d6b48 00000084 00000000 00000000 <0> 00000000 00000000 0000094d 00000000 c1b8a3e0 c6bfff40 c10277f8 c6bfff00 Call Trace: [<c10277f8>] ? update_curr+0x208/0x270 [<c105c99e>] ? lock_acquire+0x7e/0x110 [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 [<c188f5d2>] ? _raw_spin_lock+0x42/0x50 [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 [<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430 [<c120b3e7>] ? nfs4_run_state_manager+0x207/0x430 [<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430 [<c104ae54>] ? kthread+0x74/0x80 [<c104ade0>] ? kthread+0x0/0x80 [<c100333b>] ? kernel_thread_helper+0x7/0x10 Code: fe ff ff e8 e8 ff 47 00 85 c0 74 dc 83 3d 20 0d 2a c2 00 75 d3 b8 75 6f a7 c1 ba b6 0a 00 00 e8 6c 2b fd ff 31 c0 eb c2 8b 45 c0 <8b> 40 04 85 c0 89 45 c8 0f 84 2b fe ff ff a1 c0 76 c7 c1 85 c0 EIP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 SS:ESP 0068:c6bffe9c CR2: 0000000000000088 ---[ end trace 93dafd3a9c985071 ]--- note: 192.168.122.129[3137] exited with preempt_count 1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: null dereference in nfs4_begin_drain_session 2009-12-15 22:24 null dereference in nfs4_begin_drain_session J. Bruce Fields @ 2009-12-15 22:32 ` Trond Myklebust 2009-12-15 22:52 ` Trond Myklebust 1 sibling, 0 replies; 4+ messages in thread From: Trond Myklebust @ 2009-12-15 22:32 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Trond Myklebust, linux-nfs On Tue, 2009-12-15 at 17:24 -0500, J. Bruce Fields wrote: > I got this just now on a test client running a branch including (among > other stuff) your 72211dbe727f7c1451aa5adfcbd1197b090eb276. Looks like > it was trying to run cthon tests over v4.0. Anything known? Let me > know if you want anything more. > > --b. > > BUG: unable to handle kernel NULL pointer dereference at 00000088 > IP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 > *pde = 00000000 > Oops: 0000 [#1] PREEMPT > last sysfs file: /sys/kernel/uevent_seqnum > Modules linked in: > > Pid: 3137, comm: 192.168.122.129 Not tainted 2.6.32-07559-g0adf9c1 #553 / > EIP: 0060:[<c105b2bb>] EFLAGS: 00010046 CPU: 0 > EIP is at __lock_acquire+0x21b/0x1880 > EAX: 00000084 EBX: c6be42a0 ECX: 00000000 EDX: 00000001 > ESI: 00000002 EDI: 00000000 EBP: c6bfff10 ESP: c6bffe9c > DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 > Process 192.168.122.129 (pid: 3137, ti=c6bfe000 task=c6be42a0 task.ti=c6bfe000) > Stack: > 00000046 00000046 00000000 00000000 00000000 00000000 00000000 c1b8a3e0 > <0> c6bfff0c c1ead970 c6be4760 00000041 c20d6b48 00000084 00000000 00000000 > <0> 00000000 00000000 0000094d 00000000 c1b8a3e0 c6bfff40 c10277f8 c6bfff00 > Call Trace: > [<c10277f8>] ? update_curr+0x208/0x270 > [<c105c99e>] ? lock_acquire+0x7e/0x110 > [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 > [<c188f5d2>] ? _raw_spin_lock+0x42/0x50 > [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 > [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 > [<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430 > [<c120b3e7>] ? nfs4_run_state_manager+0x207/0x430 > [<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430 > [<c104ae54>] ? kthread+0x74/0x80 > [<c104ade0>] ? kthread+0x0/0x80 > [<c100333b>] ? kernel_thread_helper+0x7/0x10 > Code: fe ff ff e8 e8 ff 47 00 85 c0 74 dc 83 3d 20 0d 2a c2 00 75 d3 b8 75 6f a7 c1 ba b6 0a 00 00 e8 6c 2b fd ff 31 c0 eb c2 8b 45 c0 <8b> 40 04 85 c0 89 45 c8 0f 84 2b fe ff ff a1 c0 76 c7 c1 85 c0 > EIP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 SS:ESP 0068:c6bffe9c > CR2: 0000000000000088 > ---[ end trace 93dafd3a9c985071 ]--- > note: 192.168.122.129[3137] exited with preempt_count 1 > Argh! This is exactly why I wanted those nfs4_begin_drain_session() calls out of the state manager main loop. They _only_ make sense for the NFSv4.1 code path, damnit! Trond ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: null dereference in nfs4_begin_drain_session 2009-12-15 22:24 null dereference in nfs4_begin_drain_session J. Bruce Fields 2009-12-15 22:32 ` Trond Myklebust @ 2009-12-15 22:52 ` Trond Myklebust 2009-12-17 22:54 ` J. Bruce Fields 1 sibling, 1 reply; 4+ messages in thread From: Trond Myklebust @ 2009-12-15 22:52 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs On Tue, 2009-12-15 at 17:24 -0500, J. Bruce Fields wrote: > I got this just now on a test client running a branch including (among > other stuff) your 72211dbe727f7c1451aa5adfcbd1197b090eb276. Looks like > it was trying to run cthon tests over v4.0. Anything known? Let me > know if you want anything more. > > --b. > > BUG: unable to handle kernel NULL pointer dereference at 00000088 > IP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 > *pde = 00000000 > Oops: 0000 [#1] PREEMPT > last sysfs file: /sys/kernel/uevent_seqnum > Modules linked in: > > Pid: 3137, comm: 192.168.122.129 Not tainted 2.6.32-07559-g0adf9c1 #553 / > EIP: 0060:[<c105b2bb>] EFLAGS: 00010046 CPU: 0 > EIP is at __lock_acquire+0x21b/0x1880 > EAX: 00000084 EBX: c6be42a0 ECX: 00000000 EDX: 00000001 > ESI: 00000002 EDI: 00000000 EBP: c6bfff10 ESP: c6bffe9c > DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 > Process 192.168.122.129 (pid: 3137, ti=c6bfe000 task=c6be42a0 task.ti=c6bfe000) > Stack: > 00000046 00000046 00000000 00000000 00000000 00000000 00000000 c1b8a3e0 > <0> c6bfff0c c1ead970 c6be4760 00000041 c20d6b48 00000084 00000000 00000000 > <0> 00000000 00000000 0000094d 00000000 c1b8a3e0 c6bfff40 c10277f8 c6bfff00 > Call Trace: > [<c10277f8>] ? update_curr+0x208/0x270 > [<c105c99e>] ? lock_acquire+0x7e/0x110 > [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 > [<c188f5d2>] ? _raw_spin_lock+0x42/0x50 > [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 > [<c120a628>] ? nfs4_begin_drain_session+0x28/0x80 > [<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430 > [<c120b3e7>] ? nfs4_run_state_manager+0x207/0x430 > [<c120b1e0>] ? nfs4_run_state_manager+0x0/0x430 > [<c104ae54>] ? kthread+0x74/0x80 > [<c104ade0>] ? kthread+0x0/0x80 > [<c100333b>] ? kernel_thread_helper+0x7/0x10 > Code: fe ff ff e8 e8 ff 47 00 85 c0 74 dc 83 3d 20 0d 2a c2 00 75 d3 b8 75 6f a7 c1 ba b6 0a 00 00 e8 6c 2b fd ff 31 c0 eb c2 8b 45 c0 <8b> 40 04 85 c0 89 45 c8 0f 84 2b fe ff ff a1 c0 76 c7 c1 85 c0 > EIP: [<c105b2bb>] __lock_acquire+0x21b/0x1880 SS:ESP 0068:c6bffe9c > CR2: 0000000000000088 > ---[ end trace 93dafd3a9c985071 ]--- > note: 192.168.122.129[3137] exited with preempt_count 1 > The following patch should suffice to fix this... Cheers Trond ----------------------------------------------------------------------------------------------------- commit 380454126f1357db9270f9d1ca05dfe1a6e4ad47 Author: Trond Myklebust <Trond.Myklebust@netapp.com> Date: Tue Dec 15 17:36:57 2009 -0500 NFSv4: Fix a regression in the NFSv4 state manager Commit 5601a00d671fe89f9b087513244abcd08ad67e7d (nfs: run state manager in privileged mode) introduces a regression in the NFSv4 code when compiled with CONFIG_NFS_V4_1. The calls to nfs4_end_drain_session() from the main loop in nfs4_state_manager() Oops due to the lack of an NFSv4.1 session when running NFSv4.0. The fix is to move those two calls back into nfs41_init_clientid() and nfs4_reset_session(). The calls to nfs4_end_drain_session() that remain inside nfs4_state_manager() are safe, since the NFSv4.0 code will never set the NFS4CLNT_SESSION_DRAINING bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 18e8b26..6d263ed 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -176,6 +176,7 @@ int nfs41_init_clientid(struct nfs_client *clp, struct rpc_cred *cred) { int status; + nfs4_begin_drain_session(clp); status = nfs4_proc_exchange_id(clp, cred); if (status != 0) goto out; @@ -1274,6 +1275,7 @@ static int nfs4_reset_session(struct nfs_client *clp) { int status; + nfs4_begin_drain_session(clp); status = nfs4_proc_destroy_session(clp->cl_session); if (status && status != -NFS4ERR_BADSESSION && status != -NFS4ERR_DEADSESSION) { @@ -1299,7 +1301,6 @@ out: #else /* CONFIG_NFS_V4_1 */ static int nfs4_reset_session(struct nfs_client *clp) { return 0; } -static int nfs4_begin_drain_session(struct nfs_client *clp) { return 0; } static int nfs4_end_drain_session(struct nfs_client *clp) { return 0; } #endif /* CONFIG_NFS_V4_1 */ @@ -1332,7 +1333,6 @@ static void nfs4_state_manager(struct nfs_client *clp) for(;;) { if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) { /* We're going to have to re-establish a clientid */ - nfs4_begin_drain_session(clp); status = nfs4_reclaim_lease(clp); if (status) { nfs4_set_lease_expired(clp, status); @@ -1359,7 +1359,6 @@ static void nfs4_state_manager(struct nfs_client *clp) /* Initialize or reset the session */ if (test_and_clear_bit(NFS4CLNT_SESSION_RESET, &clp->cl_state) && nfs4_has_session(clp)) { - nfs4_begin_drain_session(clp); status = nfs4_reset_session(clp); if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) continue; ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: null dereference in nfs4_begin_drain_session 2009-12-15 22:52 ` Trond Myklebust @ 2009-12-17 22:54 ` J. Bruce Fields 0 siblings, 0 replies; 4+ messages in thread From: J. Bruce Fields @ 2009-12-17 22:54 UTC (permalink / raw) To: Trond Myklebust; +Cc: linux-nfs On Tue, Dec 15, 2009 at 05:52:20PM -0500, Trond Myklebust wrote: > On Tue, 2009-12-15 at 17:24 -0500, J. Bruce Fields wrote: > The following patch should suffice to fix this... Yup, not seeing this any more, thanks! (Sorry for the slow confirmation--my test hosts stopped booting. Argh! Looks like it was something temporarily broken upstream...). --b. > > Cheers > Trond > ----------------------------------------------------------------------------------------------------- > commit 380454126f1357db9270f9d1ca05dfe1a6e4ad47 > Author: Trond Myklebust <Trond.Myklebust@netapp.com> > Date: Tue Dec 15 17:36:57 2009 -0500 > > NFSv4: Fix a regression in the NFSv4 state manager > > Commit 5601a00d671fe89f9b087513244abcd08ad67e7d (nfs: run state manager > in privileged mode) introduces a regression in the NFSv4 code when > compiled with CONFIG_NFS_V4_1. The calls to nfs4_end_drain_session() > from the main loop in nfs4_state_manager() Oops due to the lack of an > NFSv4.1 session when running NFSv4.0. > > The fix is to move those two calls back into nfs41_init_clientid() and > nfs4_reset_session(). > > The calls to nfs4_end_drain_session() that remain inside > nfs4_state_manager() are safe, since the NFSv4.0 code will never set the > NFS4CLNT_SESSION_DRAINING bit. > > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index 18e8b26..6d263ed 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -176,6 +176,7 @@ int nfs41_init_clientid(struct nfs_client *clp, struct rpc_cred *cred) > { > int status; > > + nfs4_begin_drain_session(clp); > status = nfs4_proc_exchange_id(clp, cred); > if (status != 0) > goto out; > @@ -1274,6 +1275,7 @@ static int nfs4_reset_session(struct nfs_client *clp) > { > int status; > > + nfs4_begin_drain_session(clp); > status = nfs4_proc_destroy_session(clp->cl_session); > if (status && status != -NFS4ERR_BADSESSION && > status != -NFS4ERR_DEADSESSION) { > @@ -1299,7 +1301,6 @@ out: > > #else /* CONFIG_NFS_V4_1 */ > static int nfs4_reset_session(struct nfs_client *clp) { return 0; } > -static int nfs4_begin_drain_session(struct nfs_client *clp) { return 0; } > static int nfs4_end_drain_session(struct nfs_client *clp) { return 0; } > #endif /* CONFIG_NFS_V4_1 */ > > @@ -1332,7 +1333,6 @@ static void nfs4_state_manager(struct nfs_client *clp) > for(;;) { > if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) { > /* We're going to have to re-establish a clientid */ > - nfs4_begin_drain_session(clp); > status = nfs4_reclaim_lease(clp); > if (status) { > nfs4_set_lease_expired(clp, status); > @@ -1359,7 +1359,6 @@ static void nfs4_state_manager(struct nfs_client *clp) > /* Initialize or reset the session */ > if (test_and_clear_bit(NFS4CLNT_SESSION_RESET, &clp->cl_state) > && nfs4_has_session(clp)) { > - nfs4_begin_drain_session(clp); > status = nfs4_reset_session(clp); > if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) > continue; > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2009-12-17 22:54 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-12-15 22:24 null dereference in nfs4_begin_drain_session J. Bruce Fields 2009-12-15 22:32 ` Trond Myklebust 2009-12-15 22:52 ` Trond Myklebust 2009-12-17 22:54 ` J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox