linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] NFSv4.1: Don't allow a lock_state to be freed while checking
@ 2016-11-14 16:50 Benjamin Coddington
  2016-11-14 20:18 ` Benjamin Coddington
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin Coddington @ 2016-11-14 16:50 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs

While walking the list of lock_states, keep the current nfs4_lock_state
referenced so that it isn't freed while checking.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/nfs4proc.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 7897826d7c51..9a1cb9e8c4fc 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2564,12 +2564,15 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
 static int nfs41_check_expired_locks(struct nfs4_state *state)
 {
 	int status, ret = NFS_OK;
-	struct nfs4_lock_state *lsp;
+	struct nfs4_lock_state *lsp, *tmp;
 	struct nfs_server *server = NFS_SERVER(state->inode);
 
 	if (!test_bit(LK_STATE_IN_USE, &state->flags))
 		goto out;
-	list_for_each_entry(lsp, &state->lock_states, ls_locks) {
+	spin_lock(&state->state_lock);
+	list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
+		atomic_inc(&lsp->ls_count);
+		spin_unlock(&state->state_lock);
 		if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
 			struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
 
@@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct nfs4_state *state)
 				break;
 			}
 		}
-	};
+		nfs4_put_lock_state(lsp);
+		spin_lock(&state->state_lock);
+	}
+	spin_unlock(&state->state_lock);
 out:
 	return ret;
 }
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFSv4.1: Don't allow a lock_state to be freed while checking
  2016-11-14 16:50 [PATCH] NFSv4.1: Don't allow a lock_state to be freed while checking Benjamin Coddington
@ 2016-11-14 20:18 ` Benjamin Coddington
  2016-11-15 16:12   ` Benjamin Coddington
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin Coddington @ 2016-11-14 20:18 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs

On 14 Nov 2016, at 11:50, Benjamin Coddington wrote:

> While walking the list of lock_states, keep the current 
> nfs4_lock_state
> referenced so that it isn't freed while checking.
>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/nfs4proc.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 7897826d7c51..9a1cb9e8c4fc 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2564,12 +2564,15 @@ static void 
> nfs41_check_delegation_stateid(struct nfs4_state *state)
>  static int nfs41_check_expired_locks(struct nfs4_state *state)
>  {
>  	int status, ret = NFS_OK;
> -	struct nfs4_lock_state *lsp;
> +	struct nfs4_lock_state *lsp, *tmp;
>  	struct nfs_server *server = NFS_SERVER(state->inode);
>
>  	if (!test_bit(LK_STATE_IN_USE, &state->flags))
>  		goto out;
> -	list_for_each_entry(lsp, &state->lock_states, ls_locks) {
> +	spin_lock(&state->state_lock);
> +	list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
> +		atomic_inc(&lsp->ls_count);
> +		spin_unlock(&state->state_lock);
>  		if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>  			struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
>
> @@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct 
> nfs4_state *state)
>  				break;
>  			}
>  		}
> -	};
> +		nfs4_put_lock_state(lsp);
> +		spin_lock(&state->state_lock);
> +	}
> +	spin_unlock(&state->state_lock);
>  out:
>  	return ret;
>  }

Actually, something else may be wrong here.. after several more hours of
testing with this and Trond's two fixes for CLOSE races this popped up:

[ 8102.015071] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000048
[ 8102.015985] IP: [<ffffffff8182717c>] _raw_spin_lock+0xc/0x30
[ 8102.016622] PGD 0
[ 8102.016829]
[ 8102.017014] Oops: 0002 [#1] SMP
[ 8102.017357] Modules linked in: nfsv4 dns_resolver nfs ip6t_rpfilter 
ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat 
ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 
nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw 
ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw 
iptable_security ebtable_filter ebtables ip6table_filter ip6_tables nfsd 
auth_rpcgss nfs_acl lockd grace sunrpc virtio_balloon virtio_console 
virtio_net virtio_blk crct10dif_pclmul ppdev crc32_pclmul crc32c_intel 
ghash_clmulni_intel virtio_pci serio_raw i2c_piix4 virtio_ring virtio 
ata_generic parport_pc pata_acpi parport acpi_cpufreq tpm_tis 
tpm_tis_core tpm
[ 8102.025180] CPU: 3 PID: 4475 Comm: ::1-manager Not tainted 
4.9.0-rc5-00268-gbb3ec452d27c #45
[ 8102.026093] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.8.1-20150318_183358- 04/01/2014
[ 8102.027122] task: ffff880132b93a80 task.stack: ffffc90003200000
[ 8102.027769] RIP: 0010:[<ffffffff8182717c>]  [<ffffffff8182717c>] 
_raw_spin_lock+0xc/0x30
[ 8102.028657] RSP: 0018:ffffc90003203d40  EFLAGS: 00010246
[ 8102.029227] RAX: 0000000000000000 RBX: ffff880132d9c31c RCX: 
0000000000000001
[ 8102.029996] RDX: 0000000000000001 RSI: 0000000000000048 RDI: 
0000000000000048
[ 8102.030771] RBP: ffffc90003203d58 R08: 0000000032b45601 R09: 
000000018010000e
[ 8102.031532] R10: 000000008010000e R11: 0000000000100000 R12: 
0000000000000048
[ 8102.032291] R13: 0000000000000000 R14: 0000000000000000 R15: 
ffff880139fc2050
[ 8102.033064] FS:  0000000000000000(0000) GS:ffff88013fd80000(0000) 
knlGS:0000000000000000
[ 8102.033930] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8102.034546] CR2: 0000000000000048 CR3: 0000000139678000 CR4: 
00000000000406e0
[ 8102.035312] Stack:
[ 8102.035542]  ffffffff81489818 ffff880132d9c200 0000000000000000 
ffffc90003203d78
[ 8102.036400]  ffffffffa02f9289 ffff8801314f29c0 0000000000000000 
ffffc90003203de8
[ 8102.037259]  ffffffffa02e3dd2 ffff88013393f000 ffff8801314f29e0 
ffff8801314f2a08
[ 8102.038120] Call Trace:
[ 8102.038392]  [<ffffffff81489818>] ? _atomic_dec_and_lock+0x48/0x5b
[ 8102.039073]  [<ffffffffa02f9289>] nfs4_put_lock_state+0x29/0x90 
[nfsv4]
[ 8102.039797]  [<ffffffffa02e3dd2>] nfs41_open_expired+0x192/0x410 
[nfsv4]
[ 8102.040523]  [<ffffffffa02f859f>] nfs4_do_reclaim+0x1af/0x630 [nfsv4]
[ 8102.041227]  [<ffffffffa02f8f30>] nfs4_run_state_manager+0x510/0x7d0 
[nfsv4]
[ 8102.041995]  [<ffffffffa02f8a20>] ? nfs4_do_reclaim+0x630/0x630 
[nfsv4]
[ 8102.042719]  [<ffffffffa02f8a20>] ? nfs4_do_reclaim+0x630/0x630 
[nfsv4]
[ 8102.043434]  [<ffffffff810c96a9>] kthread+0xd9/0xf0
[ 8102.043968]  [<ffffffff810c95d0>] ? kthread_park+0x60/0x60
[ 8102.044559]  [<ffffffff81827615>] ret_from_fork+0x25/0x30
[ 8102.045150] Code: c0 ba 01 00 00 00 f0 0f b1 17 85 c0 75 02 5d c3 89 
c6 e8 18 db 8c ff 5d c3 66 0f 1f 44 00 00 66 66 66 66 90 31 c0 ba 01 00 
00 00 <f0> 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 f0 da 8c ff 5d
[ 8102.048136] RIP  [<ffffffff8182717c>] _raw_spin_lock+0xc/0x30
[ 8102.048775]  RSP <ffffc90003203d40>
[ 8102.049153] CR2: 0000000000000048
[ 8102.049893] ---[ end trace 8dc47952e274960a ]---
[ 8102.050391] Kernel panic - not syncing: Fatal exception
[ 8102.051090] Kernel Offset: disabled
[ 8102.051475] ---[ end Kernel panic - not syncing: Fatal exception

OK, it now looks like the nfs4_state has been freed while walking the 
list of
lock_states.  I'll have to look at this a bit more tomorrow.  I have to 
ferry
the kids around now.

Ben

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFSv4.1: Don't allow a lock_state to be freed while checking
  2016-11-14 20:18 ` Benjamin Coddington
@ 2016-11-15 16:12   ` Benjamin Coddington
  2016-11-17 21:24     ` Benjamin Coddington
  0 siblings, 1 reply; 4+ messages in thread
From: Benjamin Coddington @ 2016-11-15 16:12 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs


On 14 Nov 2016, at 15:18, Benjamin Coddington wrote:

> On 14 Nov 2016, at 11:50, Benjamin Coddington wrote:
>
>> While walking the list of lock_states, keep the current 
>> nfs4_lock_state
>> referenced so that it isn't freed while checking.
>>
>> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>> ---
>>  fs/nfs/nfs4proc.c | 12 +++++++++---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 7897826d7c51..9a1cb9e8c4fc 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2564,12 +2564,15 @@ static void 
>> nfs41_check_delegation_stateid(struct nfs4_state *state)
>>  static int nfs41_check_expired_locks(struct nfs4_state *state)
>>  {
>>  	int status, ret = NFS_OK;
>> -	struct nfs4_lock_state *lsp;
>> +	struct nfs4_lock_state *lsp, *tmp;
>>  	struct nfs_server *server = NFS_SERVER(state->inode);
>>
>>  	if (!test_bit(LK_STATE_IN_USE, &state->flags))
>>  		goto out;
>> -	list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>> +	spin_lock(&state->state_lock);
>> +	list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) {
>> +		atomic_inc(&lsp->ls_count);
>> +		spin_unlock(&state->state_lock);
>>  		if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>>  			struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
>>
>> @@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct 
>> nfs4_state *state)
>>  				break;
>>  			}
>>  		}
>> -	};
>> +		nfs4_put_lock_state(lsp);
>> +		spin_lock(&state->state_lock);
>> +	}
>> +	spin_unlock(&state->state_lock);
>>  out:
>>  	return ret;
>>  }
>
> Actually, something else may be wrong here.. after several more hours 
> of
> testing with this and Trond's two fixes for CLOSE races this popped 
> up:

I've been an idiot.  This patch is wrong in two ways:

1. list_for_each_entry_safe() won't keep the *next* entry from being 
freed
2. There shouldn't be any reason (I think) it can't hold the state lock 
over
    nfs41_test_and_free_expired_stateid(), so the much simpler fix is to 
hold it
    the entire time. (I expect the problem I saw with that the first 
time was the
    CLOSE/OPEN race recently fixed).

*ben goes off and tests what should be a simple fix again*

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 7897826d7c51..34b36f148b65 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2569,6 +2569,7 @@ static int nfs41_check_expired_locks(struct 
nfs4_state *state)

         if (!test_bit(LK_STATE_IN_USE, &state->flags))
                 goto out;
+       spin_lock(&state->state_lock);
         list_for_each_entry(lsp, &state->lock_states, ls_locks) {
                 if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
                         struct rpc_cred *cred = 
lsp->ls_state->owner->so_cred;
@@ -2588,7 +2589,8 @@ static int nfs41_check_expired_locks(struct 
nfs4_state *state)
                                 break;
                         }
                 }
-       };
+       }
+       spin_unlock(&state->state_lock);
  out:
         return ret;
  }

No, it /does/ soft lockup this time:

crash> bt 2514
PID: 2514   TASK: ffff880131560000  CPU: 2   COMMAND: "::1-manager"
  #0 [ffffc90002a1b9d0] __schedule at ffffffff8182212e
  #1 [ffffc90002a1ba50] schedule at ffffffff818225d6
  #2 [ffffc90002a1ba68] rpc_wait_bit_killable at ffffffffa00d38f4 
[sunrpc]
  #3 [ffffc90002a1ba88] __wait_on_bit at ffffffff818229e1
  #4 [ffffc90002a1bac8] out_of_line_wait_on_bit at ffffffff81822b62
  #5 [ffffc90002a1bb40] __rpc_execute at ffffffffa00d477a [sunrpc]
  #6 [ffffc90002a1bba8] rpc_execute at ffffffffa00d6d41 [sunrpc]
  #7 [ffffc90002a1bbd8] rpc_run_task at ffffffffa00c7d99 [sunrpc]
  #8 [ffffc90002a1bc00] nfs4_call_sync_sequence at ffffffffa02d11e3 
[nfsv4]
  #9 [ffffc90002a1bc78] _nfs41_test_stateid at ffffffffa02d9941 [nfsv4]
#10 [ffffc90002a1bd30] nfs41_test_and_free_expired_stateid at 
ffffffffa02db9b9 [nfsv4]
#11 [ffffc90002a1bd90] nfs41_open_expired at ffffffffa02dbdd7 [nfsv4]
#12 [ffffc90002a1bdf0] nfs4_do_reclaim at ffffffffa02f057f [nfsv4]
#13 [ffffc90002a1be88] nfs4_run_state_manager at ffffffffa02f0f10 
[nfsv4]
#14 [ffffc90002a1bec8] kthread at ffffffff810c96a9
#15 [ffffc90002a1bf50] ret_from_fork at ffffffff81827615

crash> l *(0xffffffffa00d477a)
0xffffffffa00d477a is in __rpc_execute (net/sunrpc/sched.c:807).
802                     /* sync task: sleep here */
803                     dprintk("RPC: %5u sync task going to sleep\n", 
task->tk_pid);
804                     status = 
out_of_line_wait_on_bit(&task->tk_runstate,
805                                     RPC_TASK_QUEUED, 
rpc_wait_bit_killable,
806                                     TASK_KILLABLE);
807                     if (status == -ERESTARTSYS) {
808                             /*
809                              * When a sync task receives a signal, 
it exits with
810                              * -ERESTARTSYS. In order to catch any 
callbacks that
811                              * clean up after sleeping on some 
queue, we don't

Why it is queued here? Did the server return DELAY maybe?  
Investigations are
ongoing, advice welcomed.  I am going to resume email silence until I 
have
something worthwhile to contribute, rather than continue to reply to 
myself.

Ben

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] NFSv4.1: Don't allow a lock_state to be freed while checking
  2016-11-15 16:12   ` Benjamin Coddington
@ 2016-11-17 21:24     ` Benjamin Coddington
  0 siblings, 0 replies; 4+ messages in thread
From: Benjamin Coddington @ 2016-11-17 21:24 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs

On 15 Nov 2016, at 11:12, Benjamin Coddington wrote:

> On 14 Nov 2016, at 15:18, Benjamin Coddington wrote:
>
>> On 14 Nov 2016, at 11:50, Benjamin Coddington wrote:
>>
>>> While walking the list of lock_states, keep the current 
>>> nfs4_lock_state
>>> referenced so that it isn't freed while checking.
>>>
>>> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>>> ---
>>>  fs/nfs/nfs4proc.c | 12 +++++++++---
>>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>> index 7897826d7c51..9a1cb9e8c4fc 100644
>>> --- a/fs/nfs/nfs4proc.c
>>> +++ b/fs/nfs/nfs4proc.c
>>> @@ -2564,12 +2564,15 @@ static void 
>>> nfs41_check_delegation_stateid(struct nfs4_state *state)
>>>  static int nfs41_check_expired_locks(struct nfs4_state *state)
>>>  {
>>>  	int status, ret = NFS_OK;
>>> -	struct nfs4_lock_state *lsp;
>>> +	struct nfs4_lock_state *lsp, *tmp;
>>>  	struct nfs_server *server = NFS_SERVER(state->inode);
>>>
>>>  	if (!test_bit(LK_STATE_IN_USE, &state->flags))
>>>  		goto out;
>>> -	list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>>> +	spin_lock(&state->state_lock);
>>> +	list_for_each_entry_safe(lsp, tmp, &state->lock_states, ls_locks) 
>>> {
>>> +		atomic_inc(&lsp->ls_count);
>>> +		spin_unlock(&state->state_lock);
>>>  		if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>>>  			struct rpc_cred *cred = lsp->ls_state->owner->so_cred;
>>>
>>> @@ -2588,7 +2591,10 @@ static int nfs41_check_expired_locks(struct 
>>> nfs4_state *state)
>>>  				break;
>>>  			}
>>>  		}
>>> -	};
>>> +		nfs4_put_lock_state(lsp);
>>> +		spin_lock(&state->state_lock);
>>> +	}
>>> +	spin_unlock(&state->state_lock);
>>>  out:
>>>  	return ret;
>>>  }
>>
>> Actually, something else may be wrong here.. after several more hours 
>> of
>> testing with this and Trond's two fixes for CLOSE races this popped 
>> up:
>
> I've been an idiot.  This patch is wrong in two ways:
>
> 1. list_for_each_entry_safe() won't keep the *next* entry from being 
> freed
> 2. There shouldn't be any reason (I think) it can't hold the state 
> lock over
>    nfs41_test_and_free_expired_stateid(), so the much simpler fix is 
> to hold it
>    the entire time. (I expect the problem I saw with that the first 
> time was the
>    CLOSE/OPEN race recently fixed).
>
> *ben goes off and tests what should be a simple fix again*
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 7897826d7c51..34b36f148b65 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2569,6 +2569,7 @@ static int nfs41_check_expired_locks(struct 
> nfs4_state *state)
>
>         if (!test_bit(LK_STATE_IN_USE, &state->flags))
>                 goto out;
> +       spin_lock(&state->state_lock);
>         list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>                 if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>                         struct rpc_cred *cred = 
> lsp->ls_state->owner->so_cred;
> @@ -2588,7 +2589,8 @@ static int nfs41_check_expired_locks(struct 
> nfs4_state *state)
>                                 break;
>                         }
>                 }
> -       };
> +       }
> +       spin_unlock(&state->state_lock);
>  out:
>         return ret;
>  }
>
> No, it /does/ soft lockup this time:
>
> crash> bt 2514
> PID: 2514   TASK: ffff880131560000  CPU: 2   COMMAND: "::1-manager"
>  #0 [ffffc90002a1b9d0] __schedule at ffffffff8182212e
>  #1 [ffffc90002a1ba50] schedule at ffffffff818225d6
>  #2 [ffffc90002a1ba68] rpc_wait_bit_killable at ffffffffa00d38f4 
> [sunrpc]
>  #3 [ffffc90002a1ba88] __wait_on_bit at ffffffff818229e1
>  #4 [ffffc90002a1bac8] out_of_line_wait_on_bit at ffffffff81822b62
>  #5 [ffffc90002a1bb40] __rpc_execute at ffffffffa00d477a [sunrpc]
>  #6 [ffffc90002a1bba8] rpc_execute at ffffffffa00d6d41 [sunrpc]
>  #7 [ffffc90002a1bbd8] rpc_run_task at ffffffffa00c7d99 [sunrpc]
>  #8 [ffffc90002a1bc00] nfs4_call_sync_sequence at ffffffffa02d11e3 
> [nfsv4]
>  #9 [ffffc90002a1bc78] _nfs41_test_stateid at ffffffffa02d9941 [nfsv4]
> #10 [ffffc90002a1bd30] nfs41_test_and_free_expired_stateid at 
> ffffffffa02db9b9 [nfsv4]
> #11 [ffffc90002a1bd90] nfs41_open_expired at ffffffffa02dbdd7 [nfsv4]
> #12 [ffffc90002a1bdf0] nfs4_do_reclaim at ffffffffa02f057f [nfsv4]
> #13 [ffffc90002a1be88] nfs4_run_state_manager at ffffffffa02f0f10 
> [nfsv4]
> #14 [ffffc90002a1bec8] kthread at ffffffff810c96a9
> #15 [ffffc90002a1bf50] ret_from_fork at ffffffff81827615
>
> crash> l *(0xffffffffa00d477a)
> 0xffffffffa00d477a is in __rpc_execute (net/sunrpc/sched.c:807).
> 802                     /* sync task: sleep here */
> 803                     dprintk("RPC: %5u sync task going to sleep\n", 
> task->tk_pid);
> 804                     status = 
> out_of_line_wait_on_bit(&task->tk_runstate,
> 805                                     RPC_TASK_QUEUED, 
> rpc_wait_bit_killable,
> 806                                     TASK_KILLABLE);
> 807                     if (status == -ERESTARTSYS) {
> 808                             /*
> 809                              * When a sync task receives a signal, 
> it exits with
> 810                              * -ERESTARTSYS. In order to catch any 
> callbacks that
> 811                              * clean up after sleeping on some 
> queue, we don't
>
> Why it is queued here? Did the server return DELAY maybe?  
> Investigations are
> ongoing, advice welcomed.  I am going to resume email silence until I 
> have
> something worthwhile to contribute, rather than continue to reply to 
> myself.

I feel I've wasted a lot of time learning something which should be 
obvious:
Don't hold a spinlock and then send a RPC, let alone several of them.

Softlockup happens when rpciod is spinning on the state_lock in 
locku_done,
and then xprtiod is scheduled to receive data on the same CPU.  That 
data
would be the replies to TEST_STATEID which would complete TEST_STATEID 
that
would release the lock.

So, please drop this patch.  I'll send another that takes a reference 
and
sets a flag on each lock state for testing.  Then the spin lock can be
dropped while TEST_STATEID and possible FREE_STATEID are sent.

Ben

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-11-17 21:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-14 16:50 [PATCH] NFSv4.1: Don't allow a lock_state to be freed while checking Benjamin Coddington
2016-11-14 20:18 ` Benjamin Coddington
2016-11-15 16:12   ` Benjamin Coddington
2016-11-17 21:24     ` Benjamin Coddington

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).