* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] [not found] ` <20100921111657.330e7838@corrin.poochiereds.net> @ 2010-09-21 19:20 ` J. Bruce Fields 2010-09-21 19:36 ` Trond Myklebust 2010-09-21 19:42 ` [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: " J. Bruce Fields 0 siblings, 2 replies; 13+ messages in thread From: J. Bruce Fields @ 2010-09-21 19:20 UTC (permalink / raw) To: Jeff Layton; +Cc: Trond Myklebust, linux-nfs Added linux-nfs to cc: On Tue, Sep 21, 2010 at 11:16:57AM -0700, Jeff Layton wrote: > On Tue, 21 Sep 2010 13:56:55 -0400 > Trond Myklebust <Trond.Myklebust@netapp.com> wrote: > > > On Mon, 2010-09-20 at 09:10 -0400, J. Bruce Fields wrote: > > > On Mon, Sep 20, 2010 at 09:01:06AM -0400, J. Bruce Fields wrote: > > > > On Sun, Sep 19, 2010 at 02:45:49PM -0400, J. Bruce Fields wrote: > > > > > On Sat, Sep 18, 2010 at 01:41:20PM -0400, Trond Myklebust wrote: > > > > > > Argh, yes... It is that very last commit from Chuck's new patch series > > > > > > that conflicts with Jeff Layton's unlink fixes. I've now reverted that > > > > > > commit... > > > > > > > > > > Great, thanks.--b. > > > > > > > > Unfortunately, now I'm getting some sort of crash. I'll try to get > > > > console and get more of the logs, but it looks like a bad pointer in > > > > nfs4_xdr_enc_rename, with worker_thread, etc. at the other end of the > > > > stack, so I assume it's in rpciod. > > > > > > general protection fault: 0000 [#1] PREEMPT > > > last sysfs file: /sys/devices/virtio-pci/virtio2/net/eth0/broadcast > > > CPU 0 > > > Modules linked in: [last unloaded: scsi_wait_scan] > > > > > > Pid: 2089, comm: kworker/0:2 Not tainted 2.6.36-rc4-00191-g5baeab4 #262 /Bochs > > > RIP: 0010:[<ffffffff8124335a>] [<ffffffff8124335a>] nfs4_xdr_enc_rename+0x2a/0x150 > > > RSP: 0018:ffff88001d723c50 EFLAGS: 00010206 > > > RAX: ffff88001e0fa07c RBX: ffff88001e9fb618 RCX: 5a5a5a5a5a5a5a5a > > ^^^^^^^^^^^^^^^^ > > Looks like a use-after-free issue. > > > > > RDX: 0000000000000000 RSI: ffff88001e0fa07c RDI: ffff88001e898320 > > > RBP: ffff88001d723cd0 R08: ffff88001e60e908 R09: 0000000000000000 > > > R10: 0000000000000002 R11: 0000000000000001 R12: ffff88001e0fa07c > > > R13: ffff88001e60e908 R14: ffff88001e898320 R15: ffff88001dcf2668 > > > FS: 0000000000000000(0000) GS:ffffffff81e1c000(0000) knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > > CR2: 00007fb75a46c360 CR3: 000000001e532000 CR4: 00000000000006f0 > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > > Process kworker/0:2 (pid: 2089, threadinfo ffff88001d722000, task ffff88001ddfe050) > > > Stack: > > > ffff88001ddfe050 ffffffff8103faf3 ffff88001ea1d000 ffff88001ea1d658 > > > <0> ffff88001d723c90 ffffffff8106a4dd ffffffff818fce45 ffffffff8190ec98 > > > <0> ffff88001e0fa024 ffff88001dcf2668 ffff88001e0fa028 ffff88001e9fb618 > > > Call Trace: > > > [<ffffffff8103faf3>] ? local_bh_enable_ip+0x83/0x110 > > > [<ffffffff8106a4dd>] ? trace_hardirqs_on_caller+0x14d/0x190 > > > [<ffffffff818fce45>] ? xprt_prepare_transmit+0x75/0xb0 > > > [<ffffffff8190ec98>] ? xdr_encode_opaque_fixed+0x48/0x90 > > > [<ffffffff81243330>] ? nfs4_xdr_enc_rename+0x0/0x150 > > > [<ffffffff81903f94>] rpcauth_wrap_req+0x84/0xb0 > > > [<ffffffff818fa637>] call_transmit+0x1a7/0x2c0 > > > [<ffffffff81903112>] __rpc_execute+0xa2/0x230 > > > [<ffffffff81903305>] rpc_async_schedule+0x15/0x20 > > > [<ffffffff81054bf2>] process_one_work+0x192/0x520 > > > [<ffffffff81054b85>] ? process_one_work+0x125/0x520 > > > [<ffffffff819032f0>] ? rpc_async_schedule+0x0/0x20 > > > [<ffffffff81055270>] worker_thread+0x140/0x390 > > > [<ffffffff81055130>] ? worker_thread+0x0/0x390 > > > [<ffffffff81059376>] kthread+0x96/0xa0 > > > [<ffffffff810030f4>] kernel_thread_helper+0x4/0x10 > > > [<ffffffff8196d89c>] ? kprobe_flush_task+0xbc/0xe0 > > > [<ffffffff8196857e>] ? restore_args+0x0/0x30 > > > [<ffffffff810592e0>] ? kthread+0x0/0xa0 > > > [<ffffffff810030f0>] ? kernel_thread_helper+0x0/0x10 > > > Code: 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 58 0f 1f 44 00 00 48 8b 4a 28 49 89 d5 31 d2 49 89 fe 48 89 f0 48 85 c9 74 10 <48> 8b 91 30 03 00 00 48 8b 92 20 03 00 00 8b 12 48 8d 5d b0 4c > > > RIP [<ffffffff8124335a>] nfs4_xdr_enc_rename+0x2a/0x150 > > > RSP <ffff88001d723c50> > > > > > > --b. > > > > I can see one double free in the nfs_async_rename() error case: the code > > calls nfs_async_rename_release() after it has already been called by > > rpc_run_task(). > > I don't know if that is sufficient to explain the above trace, though. > > > > Cheers > > Trond > > > > -------------------------------------------------------------------------------------------------------- > > NFS: Fix a use-after-free case in nfs_async_rename() > > > > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > > > The call to nfs_async_rename_release() after rpc_run_task() is incorrect. > > The rpc_run_task() is always guaranteed to call the ->rpc_release() method. > > > > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> > > --- > > > > fs/nfs/unlink.c | 9 ++------- > > 1 files changed, 2 insertions(+), 7 deletions(-) > > > > > > diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c > > index 698b3e6..47530aa 100644 > > --- a/fs/nfs/unlink.c > > +++ b/fs/nfs/unlink.c > > @@ -426,7 +426,6 @@ nfs_async_rename(struct inode *old_dir, struct inode *new_dir, > > .rpc_client = NFS_CLIENT(old_dir), > > .flags = RPC_TASK_ASYNC, > > }; > > - struct rpc_task *task; > > > > data = kmalloc(sizeof(*data), GFP_KERNEL); > > if (data == NULL) > > @@ -435,7 +434,7 @@ nfs_async_rename(struct inode *old_dir, struct inode *new_dir, > > > > data->cred = rpc_lookup_cred(); > > if (IS_ERR(data->cred)) { > > - task = (struct rpc_task *)data->cred; > > + struct rpc_task *task = ERR_CAST(data->cred); > > kfree(data); > > return task; > > } > > @@ -468,11 +467,7 @@ nfs_async_rename(struct inode *old_dir, struct inode *new_dir, > > > > NFS_PROTO(data->old_dir)->rename_setup(&msg, old_dir); > > > > - task = rpc_run_task(&task_setup_data); > > - if (IS_ERR(task)) > > - nfs_async_rename_release(data); > > - > > - return task; > > + return rpc_run_task(&task_setup_data); > > } > > > > /** > > > > Thanks Trond. Sorry I missed that. Good catch. > > Reviewed-by: Jeff Layton <jlayton@redhat.com> > > That said, I agree -- that doesn't seem sufficient to explain the > problem. Bruce, is this reproducible? All I need to do is mount nfs4.1 and run cthon -s. The last output I see from the test is: check for proper open/unlink operation nfsjunk files before unlink: That's on a kernel without Trond's fix above. --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 19:20 ` [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] J. Bruce Fields @ 2010-09-21 19:36 ` Trond Myklebust 2010-09-21 20:00 ` Trond Myklebust 2010-09-21 19:42 ` [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: " J. Bruce Fields 1 sibling, 1 reply; 13+ messages in thread From: Trond Myklebust @ 2010-09-21 19:36 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > All I need to do is mount nfs4.1 and run cthon -s. The last output I > see from the test is: > > check for proper open/unlink operation > nfsjunk files before unlink: > Oh... I bet I see what it is. It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related initialisation junk that's biting us in the arse again... I'll fix it... Trond ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 19:36 ` Trond Myklebust @ 2010-09-21 20:00 ` Trond Myklebust [not found] ` <1285099232.30222.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org> 0 siblings, 1 reply; 13+ messages in thread From: Trond Myklebust @ 2010-09-21 20:00 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: > On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > > All I need to do is mount nfs4.1 and run cthon -s. The last output I > > see from the test is: > > > > check for proper open/unlink operation > > nfsjunk files before unlink: > > > > Oh... I bet I see what it is. > > It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related > initialisation junk that's biting us in the arse again... > > I'll fix it... Does this work? Cheers Trond ------------------------------------------------------------------------------------- NFSv4.1: Fix the slotid initialisation in nfs_async_rename() From: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> --- fs/nfs/nfs4proc.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index c46e45e..72aa706 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2550,6 +2550,7 @@ static void nfs4_proc_unlink_setup(struct rpc_message *msg, struct inode *dir) args->bitmask = server->cache_consistency_bitmask; res->server = server; + res->seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_REMOVE]; } @@ -2575,6 +2576,7 @@ static void nfs4_proc_rename_setup(struct rpc_message *msg, struct inode *dir) msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENAME]; arg->bitmask = server->attr_bitmask; res->server = server; + res->seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; } static int nfs4_proc_rename_done(struct rpc_task *task, struct inode *old_dir, ^ permalink raw reply related [flat|nested] 13+ messages in thread
[parent not found: <1285099232.30222.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>]
* Re: [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] [not found] ` <1285099232.30222.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org> @ 2010-09-21 20:24 ` J. Bruce Fields 2010-09-21 20:46 ` [bfields@pop.test.fieldses.org: " Trond Myklebust 0 siblings, 1 reply; 13+ messages in thread From: J. Bruce Fields @ 2010-09-21 20:24 UTC (permalink / raw) To: Trond Myklebust; +Cc: Jeff Layton, linux-nfs On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: > On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: > > On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > > > All I need to do is mount nfs4.1 and run cthon -s. The last output I > > > see from the test is: > > > > > > check for proper open/unlink operation > > > nfsjunk files before unlink: > > > > > > > Oh... I bet I see what it is. > > > > It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related > > initialisation junk that's biting us in the arse again... > > > > I'll fix it... > > Does this work? Yup. --b. > > Cheers > Trond > ------------------------------------------------------------------------------------- > NFSv4.1: Fix the slotid initialisation in nfs_async_rename() > > From: Trond Myklebust <Trond.Myklebust@netapp.com> > > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> > --- > > fs/nfs/nfs4proc.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index c46e45e..72aa706 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -2550,6 +2550,7 @@ static void nfs4_proc_unlink_setup(struct rpc_message *msg, struct inode *dir) > > args->bitmask = server->cache_consistency_bitmask; > res->server = server; > + res->seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; > msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_REMOVE]; > } > > @@ -2575,6 +2576,7 @@ static void nfs4_proc_rename_setup(struct rpc_message *msg, struct inode *dir) > msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENAME]; > arg->bitmask = server->attr_bitmask; > res->server = server; > + res->seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; > } > > static int nfs4_proc_rename_done(struct rpc_task *task, struct inode *old_dir, > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 20:24 ` [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: " J. Bruce Fields @ 2010-09-21 20:46 ` Trond Myklebust 2010-09-21 20:58 ` J. Bruce Fields 2010-09-21 21:58 ` Jeff Layton 0 siblings, 2 replies; 13+ messages in thread From: Trond Myklebust @ 2010-09-21 20:46 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Jeff Layton, linux-nfs On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: > On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: > > On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: > > > On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > > > > All I need to do is mount nfs4.1 and run cthon -s. The last output I > > > > see from the test is: > > > > > > > > check for proper open/unlink operation > > > > nfsjunk files before unlink: > > > > > > > > > > Oh... I bet I see what it is. > > > > > > It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related > > > initialisation junk that's biting us in the arse again... > > > > > > I'll fix it... > > > > Does this work? > > Yup. It seems to fix things for me too... Once you mentioned NFSv4.1 it was easy to reproduce the bug. OK. I'll merge these into nfs-for-2.6.37... Cheers Trond ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 20:46 ` [bfields@pop.test.fieldses.org: " Trond Myklebust @ 2010-09-21 20:58 ` J. Bruce Fields 2010-09-21 21:58 ` Jeff Layton 1 sibling, 0 replies; 13+ messages in thread From: J. Bruce Fields @ 2010-09-21 20:58 UTC (permalink / raw) To: Trond Myklebust; +Cc: Jeff Layton, linux-nfs On Tue, Sep 21, 2010 at 04:46:21PM -0400, Trond Myklebust wrote: > On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: > > On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: > > > On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: > > > > On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > > > > > All I need to do is mount nfs4.1 and run cthon -s. The last output I > > > > > see from the test is: > > > > > > > > > > check for proper open/unlink operation > > > > > nfsjunk files before unlink: > > > > > > > > > > > > > Oh... I bet I see what it is. > > > > > > > > It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related > > > > initialisation junk that's biting us in the arse again... > > > > > > > > I'll fix it... > > > > > > Does this work? > > > > Yup. > > It seems to fix things for me too... Once you mentioned NFSv4.1 it was > easy to reproduce the bug. OK, good. I've a slight fear I may have volunteered as Mr. Upstream NFS QA, which I didn't really intend, though of course I want to see stuff caught sooner rather than later.... Anyway, I should have noticed which test it failed on sooner. --b. > > OK. I'll merge these into nfs-for-2.6.37... > > Cheers > Trond > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 20:46 ` [bfields@pop.test.fieldses.org: " Trond Myklebust 2010-09-21 20:58 ` J. Bruce Fields @ 2010-09-21 21:58 ` Jeff Layton 2010-09-23 17:20 ` Benny Halevy 1 sibling, 1 reply; 13+ messages in thread From: Jeff Layton @ 2010-09-21 21:58 UTC (permalink / raw) To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs On Tue, 21 Sep 2010 16:46:21 -0400 Trond Myklebust <trond.myklebust@fys.uio.no> wrote: > On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: > > On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: > > > On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: > > > > On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > > > > > All I need to do is mount nfs4.1 and run cthon -s. The last output I > > > > > see from the test is: > > > > > > > > > > check for proper open/unlink operation > > > > > nfsjunk files before unlink: > > > > > > > > > > > > > Oh... I bet I see what it is. > > > > > > > > It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related > > > > initialisation junk that's biting us in the arse again... > > > > > > > > I'll fix it... > > > > > > Does this work? > > > > Yup. > > It seems to fix things for me too... Once you mentioned NFSv4.1 it was > easy to reproduce the bug. > > OK. I'll merge these into nfs-for-2.6.37... Thanks for fixing it. I should have mentioned that the 4.1 parts were written using the "cargo-cult" programming method of copying what unlink does, and that they needed careful review. Mea culpa! -- Jeff Layton <jlayton@redhat.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 21:58 ` Jeff Layton @ 2010-09-23 17:20 ` Benny Halevy 2010-09-23 17:29 ` Trond Myklebust 0 siblings, 1 reply; 13+ messages in thread From: Benny Halevy @ 2010-09-23 17:20 UTC (permalink / raw) To: Jeff Layton, Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs On 2010-09-21 23:58, Jeff Layton wrote: > On Tue, 21 Sep 2010 16:46:21 -0400 > Trond Myklebust <trond.myklebust@fys.uio.no> wrote: > >> On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: >>> On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: >>>> On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: >>>>> On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: >>>>>> All I need to do is mount nfs4.1 and run cthon -s. The last output I >>>>>> see from the test is: >>>>>> >>>>>> check for proper open/unlink operation >>>>>> nfsjunk files before unlink: >>>>>> >>>>> >>>>> Oh... I bet I see what it is. >>>>> >>>>> It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related >>>>> initialisation junk that's biting us in the arse again... >>>>> >>>>> I'll fix it... >>>> >>>> Does this work? >>> >>> Yup. >> >> It seems to fix things for me too... Once you mentioned NFSv4.1 it was >> easy to reproduce the bug. >> >> OK. I'll merge these into nfs-for-2.6.37... > > Thanks for fixing it. I should have mentioned that the 4.1 parts were > written using the "cargo-cult" programming method of copying what > unlink does, and that they needed careful review. Mea culpa! > I still hate it that the sr_slotid requires explicit, non-trivial initialization. Once upon a time, the equivalent of sr_slotid used to be a pointer to struct slot. Using a pointer, implicitly initialized to NULL is significantly more straight forward as in the patch below. (passes cthon tests over your nfs-for-2.6.37 branch + the patch I sent earlier that fixes __put_nfs_open_context. w/o it I readily reproduce the crash reported on this thread) >From 50221fb5bd8fbd46f6cabe1fc795db7efc71bff7 Mon Sep 17 00:00:00 2001 From: Benny Halevy <bhalevy@panasas.com> Date: Thu, 23 Sep 2010 19:08:21 +0200 Subject: [PATCH] NFSv4.1: keep seq_res.sr_slot as pointer rather than an index Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE resulted in numerous bugs. Keeping the current slot as a pointer to the slot table is more straight forward and robust as it's implicitly set up to NULL wherever the seq_res member is initialized to zeroes. Signed-off-by: Benny Halevy <bhalevy@panasas.com> --- fs/nfs/nfs4proc.c | 52 +++++++++++++++++++---------------------------- fs/nfs/nfs4xdr.c | 6 +--- fs/nfs/read.c | 1 - fs/nfs/unlink.c | 3 +- fs/nfs/write.c | 2 - include/linux/nfs_xdr.h | 2 +- 6 files changed, 25 insertions(+), 41 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 72aa706..7f8cc33 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -334,10 +334,12 @@ static void renew_lease(const struct nfs_server *server, unsigned long timestamp * Must be called while holding tbl->slot_tbl_lock */ static void -nfs4_free_slot(struct nfs4_slot_table *tbl, u8 free_slotid) +nfs4_free_slot(struct nfs4_slot_table *tbl, struct nfs4_slot *free_slot) { + int free_slotid = free_slot - tbl->slots; int slotid = free_slotid; + BUG_ON(slotid < 0 || slotid >= NFS4_MAX_SLOT_TABLE); /* clear used bit in bitmap */ __clear_bit(slotid, tbl->used_slots); @@ -379,7 +381,7 @@ static void nfs41_sequence_free_slot(struct nfs4_sequence_res *res) struct nfs4_slot_table *tbl; tbl = &res->sr_session->fc_slot_table; - if (res->sr_slotid == NFS4_MAX_SLOT_TABLE) { + if (!res->sr_slot) { /* just wake up the next guy waiting since * we may have not consumed a slot after all */ dprintk("%s: No slot\n", __func__); @@ -387,17 +389,15 @@ static void nfs41_sequence_free_slot(struct nfs4_sequence_res *res) } spin_lock(&tbl->slot_tbl_lock); - nfs4_free_slot(tbl, res->sr_slotid); + nfs4_free_slot(tbl, res->sr_slot); nfs41_check_drain_session_complete(res->sr_session); spin_unlock(&tbl->slot_tbl_lock); - res->sr_slotid = NFS4_MAX_SLOT_TABLE; + res->sr_slot = NULL; } static int nfs41_sequence_done(struct rpc_task *task, struct nfs4_sequence_res *res) { unsigned long timestamp; - struct nfs4_slot_table *tbl; - struct nfs4_slot *slot; struct nfs_client *clp; /* @@ -410,17 +410,14 @@ static int nfs41_sequence_done(struct rpc_task *task, struct nfs4_sequence_res * res->sr_status = NFS_OK; /* -ERESTARTSYS can result in skipping nfs41_sequence_setup */ - if (res->sr_slotid == NFS4_MAX_SLOT_TABLE) + if (!res->sr_slot) goto out; - tbl = &res->sr_session->fc_slot_table; - slot = tbl->slots + res->sr_slotid; - /* Check the SEQUENCE operation status */ switch (res->sr_status) { case 0: /* Update the slot's sequence and clientid lease timer */ - ++slot->seq_nr; + ++res->sr_slot->seq_nr; timestamp = res->sr_renewal_time; clp = res->sr_session->clp; do_renew_lease(clp, timestamp); @@ -433,12 +430,14 @@ static int nfs41_sequence_done(struct rpc_task *task, struct nfs4_sequence_res * * returned NFS4ERR_DELAY as per Section 2.10.6.2 * of RFC5661. */ - dprintk("%s: slot=%d seq=%d: Operation in progress\n", - __func__, res->sr_slotid, slot->seq_nr); + dprintk("%s: slot=%ld seq=%d: Operation in progress\n", + __func__, + res->sr_slot - res->sr_session->fc_slot_table.slots, + res->sr_slot->seq_nr); goto out_retry; default: /* Just update the slot sequence no. */ - ++slot->seq_nr; + ++res->sr_slot->seq_nr; } out: /* The session may be reset by one of the error handlers. */ @@ -505,10 +504,9 @@ static int nfs41_setup_sequence(struct nfs4_session *session, dprintk("--> %s\n", __func__); /* slot already allocated? */ - if (res->sr_slotid != NFS4_MAX_SLOT_TABLE) + if (res->sr_slot != NULL) return 0; - res->sr_slotid = NFS4_MAX_SLOT_TABLE; tbl = &session->fc_slot_table; spin_lock(&tbl->slot_tbl_lock); @@ -550,7 +548,7 @@ static int nfs41_setup_sequence(struct nfs4_session *session, dprintk("<-- %s slotid=%d seqid=%d\n", __func__, slotid, slot->seq_nr); res->sr_session = session; - res->sr_slotid = slotid; + res->sr_slot = slot; res->sr_renewal_time = jiffies; res->sr_status_flags = 0; /* @@ -576,8 +574,9 @@ int nfs4_setup_sequence(const struct nfs_server *server, goto out; } - dprintk("--> %s clp %p session %p sr_slotid %d\n", - __func__, session->clp, session, res->sr_slotid); + dprintk("--> %s clp %p session %p sr_slot %ld\n", + __func__, session->clp, session, res->sr_slot ? + res->sr_slot - session->fc_slot_table.slots : -1); ret = nfs41_setup_sequence(session, args, res, cache_reply, task); @@ -650,7 +649,7 @@ static int nfs4_call_sync_sequence(struct nfs_server *server, .callback_data = &data }; - res->sr_slotid = NFS4_MAX_SLOT_TABLE; + res->sr_slot = NULL; if (privileged) task_setup.callback_ops = &nfs41_call_priv_sync_ops; task = rpc_run_task(&task_setup); @@ -735,7 +734,6 @@ static void nfs4_init_opendata_res(struct nfs4_opendata *p) p->o_res.server = p->o_arg.server; nfs_fattr_init(&p->f_attr); nfs_fattr_init(&p->dir_attr); - p->o_res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; } static struct nfs4_opendata *nfs4_opendata_alloc(struct path *path, @@ -1975,7 +1973,6 @@ int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, i calldata->res.fattr = &calldata->fattr; calldata->res.seqid = calldata->arg.seqid; calldata->res.server = server; - calldata->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; path_get(path); calldata->path = *path; @@ -2550,7 +2547,7 @@ static void nfs4_proc_unlink_setup(struct rpc_message *msg, struct inode *dir) args->bitmask = server->cache_consistency_bitmask; res->server = server; - res->seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; + res->seq_res.sr_slot = NULL; msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_REMOVE]; } @@ -2576,7 +2573,6 @@ static void nfs4_proc_rename_setup(struct rpc_message *msg, struct inode *dir) msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENAME]; arg->bitmask = server->attr_bitmask; res->server = server; - res->seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; } static int nfs4_proc_rename_done(struct rpc_task *task, struct inode *old_dir, @@ -3646,7 +3642,6 @@ static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co memcpy(&data->stateid, stateid, sizeof(data->stateid)); data->res.fattr = &data->fattr; data->res.server = server; - data->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; nfs_fattr_init(data->res.fattr); data->timestamp = jiffies; data->rpc_status = 0; @@ -3799,7 +3794,6 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl, p->arg.fl = &p->fl; p->arg.seqid = seqid; p->res.seqid = seqid; - p->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; p->arg.stateid = &lsp->ls_stateid; p->lsp = lsp; atomic_inc(&lsp->ls_count); @@ -3979,7 +3973,6 @@ static struct nfs4_lockdata *nfs4_alloc_lockdata(struct file_lock *fl, p->arg.lock_owner.clientid = server->nfs_client->cl_clientid; p->arg.lock_owner.id = lsp->ls_id.id; p->res.lock_seqid = p->arg.lock_seqid; - p->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; p->lsp = lsp; p->server = server; atomic_inc(&lsp->ls_count); @@ -4612,7 +4605,6 @@ int nfs4_proc_get_lease_time(struct nfs_client *clp, struct nfs_fsinfo *fsinfo) }; int status; - res.lr_seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; dprintk("--> %s\n", __func__); task = rpc_run_task(&task_setup); @@ -5105,12 +5097,11 @@ static struct rpc_task *_nfs41_proc_sequence(struct nfs_client *clp, struct rpc_ if (!atomic_inc_not_zero(&clp->cl_count)) return ERR_PTR(-EIO); - calldata = kmalloc(sizeof(*calldata), GFP_NOFS); + calldata = kzalloc(sizeof(*calldata), GFP_NOFS); if (calldata == NULL) { nfs_put_client(clp); return ERR_PTR(-ENOMEM); } - calldata->res.sr_slotid = NFS4_MAX_SLOT_TABLE; msg.rpc_argp = &calldata->args; msg.rpc_resp = &calldata->res; calldata->clp = clp; @@ -5242,7 +5233,6 @@ static int nfs41_proc_reclaim_complete(struct nfs_client *clp) goto out; calldata->clp = clp; calldata->arg.one_fs = 0; - calldata->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; msg.rpc_argp = &calldata->arg; msg.rpc_resp = &calldata->res; diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index b0bd7ef..ccf09c4 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -4668,7 +4668,6 @@ static int decode_sequence(struct xdr_stream *xdr, struct rpc_rqst *rqstp) { #if defined(CONFIG_NFS_V4_1) - struct nfs4_slot *slot; struct nfs4_sessionid id; u32 dummy; int status; @@ -4700,15 +4699,14 @@ static int decode_sequence(struct xdr_stream *xdr, goto out_overflow; /* seqid */ - slot = &res->sr_session->fc_slot_table.slots[res->sr_slotid]; dummy = be32_to_cpup(p++); - if (dummy != slot->seq_nr) { + if (dummy != res->sr_slot->seq_nr) { dprintk("%s Invalid sequence number\n", __func__); goto out_err; } /* slot id */ dummy = be32_to_cpup(p++); - if (dummy != res->sr_slotid) { + if (dummy != res->sr_slot - res->sr_session->fc_slot_table.slots) { dprintk("%s Invalid slot id\n", __func__); goto out_err; } diff --git a/fs/nfs/read.c b/fs/nfs/read.c index 87adc27..79859c8 100644 --- a/fs/nfs/read.c +++ b/fs/nfs/read.c @@ -46,7 +46,6 @@ struct nfs_read_data *nfs_readdata_alloc(unsigned int pagecount) memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); p->npages = pagecount; - p->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; if (pagecount <= ARRAY_SIZE(p->page_array)) p->pagevec = p->page_array; else { diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c index 47530aa..9a16bad 100644 --- a/fs/nfs/unlink.c +++ b/fs/nfs/unlink.c @@ -262,7 +262,6 @@ nfs_async_unlink(struct inode *dir, struct dentry *dentry) status = PTR_ERR(data->cred); goto out_free; } - data->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; data->res.dir_attr = &data->dir_attr; status = -EBUSY; @@ -427,7 +426,7 @@ nfs_async_rename(struct inode *old_dir, struct inode *new_dir, .flags = RPC_TASK_ASYNC, }; - data = kmalloc(sizeof(*data), GFP_KERNEL); + data = kzalloc(sizeof(*data), GFP_KERNEL); if (data == NULL) return ERR_PTR(-ENOMEM); task_setup_data.callback_data = data, diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 874972d..4d6d35d 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -55,7 +55,6 @@ struct nfs_write_data *nfs_commitdata_alloc(void) if (p) { memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); - p->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; } return p; } @@ -75,7 +74,6 @@ struct nfs_write_data *nfs_writedata_alloc(unsigned int pagecount) memset(p, 0, sizeof(*p)); INIT_LIST_HEAD(&p->pages); p->npages = pagecount; - p->res.seq_res.sr_slotid = NFS4_MAX_SLOT_TABLE; if (pagecount <= ARRAY_SIZE(p->page_array)) p->pagevec = p->page_array; else { diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 172df83..5772b2c 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -170,7 +170,7 @@ struct nfs4_sequence_args { struct nfs4_sequence_res { struct nfs4_session *sr_session; - u8 sr_slotid; /* slot used to send request */ + struct nfs4_slot *sr_slot; /* slot used to send request */ int sr_status; /* sequence operation status */ unsigned long sr_renewal_time; u32 sr_status_flags; -- 1.7.2.3 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-23 17:20 ` Benny Halevy @ 2010-09-23 17:29 ` Trond Myklebust 2010-09-23 18:52 ` Chuck Lever 2010-09-24 13:49 ` Andy Adamson 0 siblings, 2 replies; 13+ messages in thread From: Trond Myklebust @ 2010-09-23 17:29 UTC (permalink / raw) To: Benny Halevy; +Cc: Jeff Layton, J. Bruce Fields, linux-nfs On Thu, 2010-09-23 at 19:20 +0200, Benny Halevy wrote: > On 2010-09-21 23:58, Jeff Layton wrote: > > On Tue, 21 Sep 2010 16:46:21 -0400 > > Trond Myklebust <trond.myklebust@fys.uio.no> wrote: > > > >> On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: > >>> On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: > >>>> On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: > >>>>> On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: > >>>>>> All I need to do is mount nfs4.1 and run cthon -s. The last output I > >>>>>> see from the test is: > >>>>>> > >>>>>> check for proper open/unlink operation > >>>>>> nfsjunk files before unlink: > >>>>>> > >>>>> > >>>>> Oh... I bet I see what it is. > >>>>> > >>>>> It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related > >>>>> initialisation junk that's biting us in the arse again... > >>>>> > >>>>> I'll fix it... > >>>> > >>>> Does this work? > >>> > >>> Yup. > >> > >> It seems to fix things for me too... Once you mentioned NFSv4.1 it was > >> easy to reproduce the bug. > >> > >> OK. I'll merge these into nfs-for-2.6.37... > > > > Thanks for fixing it. I should have mentioned that the 4.1 parts were > > written using the "cargo-cult" programming method of copying what > > unlink does, and that they needed careful review. Mea culpa! > > > > I still hate it that the sr_slotid requires explicit, non-trivial initialization. > Once upon a time, the equivalent of sr_slotid used to be a pointer to > struct slot. Using a pointer, implicitly initialized to NULL is significantly > more straight forward as in the patch below. > (passes cthon tests over your nfs-for-2.6.37 branch + the patch I > sent earlier that fixes __put_nfs_open_context. w/o it I readily reproduce the > crash reported on this thread) > > From 50221fb5bd8fbd46f6cabe1fc795db7efc71bff7 Mon Sep 17 00:00:00 2001 > From: Benny Halevy <bhalevy@panasas.com> > Date: Thu, 23 Sep 2010 19:08:21 +0200 > Subject: [PATCH] NFSv4.1: keep seq_res.sr_slot as pointer rather than an index > > Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE > resulted in numerous bugs. Keeping the current slot as a pointer > to the slot table is more straight forward and robust as it's > implicitly set up to NULL wherever the seq_res member is initialized > to zeroes. > > Signed-off-by: Benny Halevy <bhalevy@panasas.com> Looks fine to me, and I prefer this approach to the one we have today. Note, however, that I've already applied commit d688e11007419fd060ae74d8d952a5c4ece735aa (NFSv4.1: Fix the slotid initialisation in nfs_async_rename()) to the nfs-for-2.6.37 branch. Are people happy with me rebasing nfs-for-2.6.37 to remove the above commit, and apply this one instead? Cheers Trond ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-23 17:29 ` Trond Myklebust @ 2010-09-23 18:52 ` Chuck Lever 2010-09-23 21:06 ` J. Bruce Fields 2010-09-24 13:49 ` Andy Adamson 1 sibling, 1 reply; 13+ messages in thread From: Chuck Lever @ 2010-09-23 18:52 UTC (permalink / raw) To: Trond Myklebust; +Cc: Benny Halevy, Jeff Layton, J. Bruce Fields, linux-nfs On Sep 23, 2010, at 1:29 PM, Trond Myklebust wrote: > On Thu, 2010-09-23 at 19:20 +0200, Benny Halevy wrote: >> On 2010-09-21 23:58, Jeff Layton wrote: >>> On Tue, 21 Sep 2010 16:46:21 -0400 >>> Trond Myklebust <trond.myklebust@fys.uio.no> wrote: >>> >>>> On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: >>>>> On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: >>>>>> On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: >>>>>>> On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: >>>>>>>> All I need to do is mount nfs4.1 and run cthon -s. The last output I >>>>>>>> see from the test is: >>>>>>>> >>>>>>>> check for proper open/unlink operation >>>>>>>> nfsjunk files before unlink: >>>>>>>> >>>>>>> >>>>>>> Oh... I bet I see what it is. >>>>>>> >>>>>>> It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related >>>>>>> initialisation junk that's biting us in the arse again... >>>>>>> >>>>>>> I'll fix it... >>>>>> >>>>>> Does this work? >>>>> >>>>> Yup. >>>> >>>> It seems to fix things for me too... Once you mentioned NFSv4.1 it was >>>> easy to reproduce the bug. >>>> >>>> OK. I'll merge these into nfs-for-2.6.37... >>> >>> Thanks for fixing it. I should have mentioned that the 4.1 parts were >>> written using the "cargo-cult" programming method of copying what >>> unlink does, and that they needed careful review. Mea culpa! >>> >> >> I still hate it that the sr_slotid requires explicit, non-trivial initialization. >> Once upon a time, the equivalent of sr_slotid used to be a pointer to >> struct slot. Using a pointer, implicitly initialized to NULL is significantly >> more straight forward as in the patch below. >> (passes cthon tests over your nfs-for-2.6.37 branch + the patch I >> sent earlier that fixes __put_nfs_open_context. w/o it I readily reproduce the >> crash reported on this thread) >> >> From 50221fb5bd8fbd46f6cabe1fc795db7efc71bff7 Mon Sep 17 00:00:00 2001 >> From: Benny Halevy <bhalevy@panasas.com> >> Date: Thu, 23 Sep 2010 19:08:21 +0200 >> Subject: [PATCH] NFSv4.1: keep seq_res.sr_slot as pointer rather than an index >> >> Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE >> resulted in numerous bugs. Keeping the current slot as a pointer >> to the slot table is more straight forward and robust as it's >> implicitly set up to NULL wherever the seq_res member is initialized >> to zeroes. >> >> Signed-off-by: Benny Halevy <bhalevy@panasas.com> > > Looks fine to me, and I prefer this approach to the one we have today. > > Note, however, that I've already applied commit > d688e11007419fd060ae74d8d952a5c4ece735aa (NFSv4.1: Fix the slotid > initialisation in nfs_async_rename()) to the nfs-for-2.6.37 branch. > > Are people happy with me rebasing nfs-for-2.6.37 to remove the above > commit, and apply this one instead? I don't have a problem with rebasing what is still ostensibly a "development and testing" branch. And warning us first is nice too, thanks! :-) I haven't started looking at nfs-for-2.6.37 yet, as I've been waiting for the dust to settle a bit. I know the merge window approacheth. -- chuck[dot]lever[at]oracle[dot]com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-23 18:52 ` Chuck Lever @ 2010-09-23 21:06 ` J. Bruce Fields 0 siblings, 0 replies; 13+ messages in thread From: J. Bruce Fields @ 2010-09-23 21:06 UTC (permalink / raw) To: Chuck Lever; +Cc: Trond Myklebust, Benny Halevy, Jeff Layton, linux-nfs > On Sep 23, 2010, at 1:29 PM, Trond Myklebust wrote: > > Looks fine to me, and I prefer this approach to the one we have today. > > > > Note, however, that I've already applied commit > > d688e11007419fd060ae74d8d952a5c4ece735aa (NFSv4.1: Fix the slotid > > initialisation in nfs_async_rename()) to the nfs-for-2.6.37 branch. > > > > Are people happy with me rebasing nfs-for-2.6.37 to remove the above > > commit, and apply this one instead? I'm happier with incremental patches and no rebasing--I like having the history around--but the rebase won't cause me serious practical problems. --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-23 17:29 ` Trond Myklebust 2010-09-23 18:52 ` Chuck Lever @ 2010-09-24 13:49 ` Andy Adamson 1 sibling, 0 replies; 13+ messages in thread From: Andy Adamson @ 2010-09-24 13:49 UTC (permalink / raw) To: Trond Myklebust; +Cc: Benny Halevy, Jeff Layton, J. Bruce Fields, linux-nfs On Sep 23, 2010, at 1:29 PM, Trond Myklebust wrote: > On Thu, 2010-09-23 at 19:20 +0200, Benny Halevy wrote: >> On 2010-09-21 23:58, Jeff Layton wrote: >>> On Tue, 21 Sep 2010 16:46:21 -0400 >>> Trond Myklebust <trond.myklebust@fys.uio.no> wrote: >>> >>>> On Tue, 2010-09-21 at 16:24 -0400, J. Bruce Fields wrote: >>>>> On Tue, Sep 21, 2010 at 04:00:32PM -0400, Trond Myklebust wrote: >>>>>> On Tue, 2010-09-21 at 15:36 -0400, Trond Myklebust wrote: >>>>>>> On Tue, 2010-09-21 at 15:20 -0400, J. Bruce Fields wrote: >>>>>>>> All I need to do is mount nfs4.1 and run cthon -s. The last output I >>>>>>>> see from the test is: >>>>>>>> >>>>>>>> check for proper open/unlink operation >>>>>>>> nfsjunk files before unlink: >>>>>>>> >>>>>>> >>>>>>> Oh... I bet I see what it is. >>>>>>> >>>>>>> It's that damned res->sr_slotid = NFS4_MAX_SLOT_TABLE and related >>>>>>> initialisation junk that's biting us in the arse again... >>>>>>> >>>>>>> I'll fix it... >>>>>> >>>>>> Does this work? >>>>> >>>>> Yup. >>>> >>>> It seems to fix things for me too... Once you mentioned NFSv4.1 it was >>>> easy to reproduce the bug. >>>> >>>> OK. I'll merge these into nfs-for-2.6.37... >>> >>> Thanks for fixing it. I should have mentioned that the 4.1 parts were >>> written using the "cargo-cult" programming method of copying what >>> unlink does, and that they needed careful review. Mea culpa! >>> >> >> I still hate it that the sr_slotid requires explicit, non-trivial initialization. >> Once upon a time, the equivalent of sr_slotid used to be a pointer to >> struct slot. Using a pointer, implicitly initialized to NULL is significantly >> more straight forward as in the patch below. >> (passes cthon tests over your nfs-for-2.6.37 branch + the patch I >> sent earlier that fixes __put_nfs_open_context. w/o it I readily reproduce the >> crash reported on this thread) >> >> From 50221fb5bd8fbd46f6cabe1fc795db7efc71bff7 Mon Sep 17 00:00:00 2001 >> From: Benny Halevy <bhalevy@panasas.com> >> Date: Thu, 23 Sep 2010 19:08:21 +0200 >> Subject: [PATCH] NFSv4.1: keep seq_res.sr_slot as pointer rather than an index >> >> Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE >> resulted in numerous bugs. Keeping the current slot as a pointer >> to the slot table is more straight forward and robust as it's >> implicitly set up to NULL wherever the seq_res member is initialized >> to zeroes. >> >> Signed-off-by: Benny Halevy <bhalevy@panasas.com> > > Looks fine to me, and I prefer this approach to the one we have today. > > Note, however, that I've already applied commit > d688e11007419fd060ae74d8d952a5c4ece735aa (NFSv4.1: Fix the slotid > initialisation in nfs_async_rename()) to the nfs-for-2.6.37 branch. > > Are people happy with me rebasing nfs-for-2.6.37 to remove the above > commit, and apply this one instead? Yes, I like this approach as well and think it should be applied. -->Andy > > Cheers > Trond > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] 2010-09-21 19:20 ` [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] J. Bruce Fields 2010-09-21 19:36 ` Trond Myklebust @ 2010-09-21 19:42 ` J. Bruce Fields 1 sibling, 0 replies; 13+ messages in thread From: J. Bruce Fields @ 2010-09-21 19:42 UTC (permalink / raw) To: Jeff Layton; +Cc: Trond Myklebust, linux-nfs On Tue, Sep 21, 2010 at 03:20:05PM -0400, J. Bruce Fields wrote: > All I need to do is mount nfs4.1 and run cthon -s. The last output I > see from the test is: > > check for proper open/unlink operation > nfsjunk files before unlink: > > > That's on a kernel without Trond's fix above. But, as expected, that fix makes no difference. --b. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-09-24 13:55 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20100918171546.GA9859@fieldses.org>
[not found] ` <1284831680.2787.1.camel@heimdal.trondhjem.org>
[not found] ` <20100919184549.GD32071@fieldses.org>
[not found] ` <20100920130106.GB4580@fieldses.org>
[not found] ` <20100920131025.GC4580@fieldses.org>
[not found] ` <1285091815.30222.19.camel@heimdal.trondhjem.org>
[not found] ` <20100921111657.330e7838@corrin.poochiereds.net>
2010-09-21 19:20 ` [bfields@pop.test.fieldses.org: all f08d6374 Merge branch 'nfs-for-next' into linux-next results] J. Bruce Fields
2010-09-21 19:36 ` Trond Myklebust
2010-09-21 20:00 ` Trond Myklebust
[not found] ` <1285099232.30222.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2010-09-21 20:24 ` [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: " J. Bruce Fields
2010-09-21 20:46 ` [bfields@pop.test.fieldses.org: " Trond Myklebust
2010-09-21 20:58 ` J. Bruce Fields
2010-09-21 21:58 ` Jeff Layton
2010-09-23 17:20 ` Benny Halevy
2010-09-23 17:29 ` Trond Myklebust
2010-09-23 18:52 ` Chuck Lever
2010-09-23 21:06 ` J. Bruce Fields
2010-09-24 13:49 ` Andy Adamson
2010-09-21 19:42 ` [bfields-btOuF8jVhiP3NsPE3w5/ayCwEArCW2h5@public.gmane.org: " J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).