All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] SUNRPC: Change WARN_ON to an warn message in rpc_exit_task
@ 2015-08-17 11:08 Kinglong Mee
  2015-08-17 13:35 ` Scott Mayhew
  0 siblings, 1 reply; 3+ messages in thread
From: Kinglong Mee @ 2015-08-17 11:08 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs@vger.kernel.org, kinglongmee

When testing nfs4.0 lock, I get WARN_ON message as below,
[  278.733391] ------------[ cut here ]------------
[  278.734496] WARNING: CPU: 0 PID: 263 at /root/linux-nfs/net/sunrpc/sched.c:692
rpc_exit_task+0x85/0x90 [sunrpc]()
[  278.736771] Modules linked in: nfsv4(OE) nfs(OE) nfsd(E) bnep bluetooth ip6t_rpfilter ip6t_REJECT
xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs ppdev xfs
microcode xor serio_raw libcrc32c pcspkr raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core
auth_rpcgss nfs_acl lockd sunrpc(OE) ata_generic pata_acpi [last unloaded: nfsd]
[  278.751462] CPU: 0 PID: 263 Comm: kworker/0:4 Tainted: G           OE 3.16.0-rc2+ #28
[  278.753364] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform,
BIOS 6.00 07/31/2013
[  278.755710] Workqueue: rpciod rpc_async_schedule [sunrpc]
[  278.757112]  0000000000000000 00000000f6173352 ffff880072e5fd30 ffffffff816eacfa
[  278.758811]  0000000000000000 ffff880072e5fd68 ffffffff8108bf3d ffff88005997f180
[  278.760527]  ffff88003ae44830 ffffffffa001a860 ffffffffa001a860 0000000000000000
[  278.762271] Call Trace:
[  278.762802]  [<ffffffff816eacfa>] dump_stack+0x45/0x56
[  278.763906]  [<ffffffff8108bf3d>] warn_slowpath_common+0x7d/0xa0
[  278.765419]  [<ffffffffa001a860>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
[  278.767243]  [<ffffffffa001a860>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
[  278.768758]  [<ffffffff8108c06a>] warn_slowpath_null+0x1a/0x20
[  278.770080]  [<ffffffffa001a8e5>] rpc_exit_task+0x85/0x90 [sunrpc]
[  278.771439]  [<ffffffffa001b204>] __rpc_execute+0x94/0x420 [sunrpc]
[  278.772829]  [<ffffffffa001b5b6>] rpc_async_schedule+0x26/0x30 [sunrpc]
[  278.774276]  [<ffffffff810a7f95>] process_one_work+0x175/0x430
[  278.775573]  [<ffffffff810a866d>] worker_thread+0x11d/0x530
[  278.776824]  [<ffffffff810a8550>] ? rescuer_thread+0x300/0x300
[  278.778069]  [<ffffffff810aee18>] kthread+0xd8/0xf0
[  278.779114]  [<ffffffff810aed40>] ? insert_kthread_work+0x40/0x40
[  278.780448]  [<ffffffff816f22bc>] ret_from_fork+0x7c/0xb0
[  278.781621]  [<ffffffff810aed40>] ? insert_kthread_work+0x40/0x40
[  278.782972] ---[ end trace 3350394592832e42 ]---

task1 (lock/unlock):
  --> call_decode
    --> rpc_exit_task (-10008)
      --> nfs4_release_lockowner_done
        --> rpc_restart_call_prepare
            task->tk_action = call_start;
task2 (umount):
      --> rpc_shutdown_client
        --> rpc_killall_tasks
            rovr->tk_flags |= RPC_TASK_KILLED; (set task1 to KILLED)
task1:
    --> rpc_exit_task
        WARN_ON(RPC_ASSASSINATED(task));

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 net/sunrpc/sched.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 337ca85..562db0a 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -689,7 +689,14 @@ void rpc_exit_task(struct rpc_task *task)
 	if (task->tk_ops->rpc_call_done != NULL) {
 		task->tk_ops->rpc_call_done(task, task->tk_calldata);
 		if (task->tk_action != NULL) {
-			WARN_ON(RPC_ASSASSINATED(task));
+			/*
+			 * A known case here is,
+			 * The RPC_TASK_KILLED is set in rpc_killall_tasks by
+			 * another process.
+			 */
+			if (RPC_ASSASSINATED(task))
+				printk(KERN_WARNING "RPC: dead task tried to"
+					" walk away.\n");
 			/* Always release the RPC slot and buffer memory */
 			xprt_release(task);
 			rpc_reset_task_statistics(task);
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] SUNRPC: Change WARN_ON to an warn message in rpc_exit_task
  2015-08-17 11:08 [PATCH] SUNRPC: Change WARN_ON to an warn message in rpc_exit_task Kinglong Mee
@ 2015-08-17 13:35 ` Scott Mayhew
  2015-08-18  6:25   ` Kinglong Mee
  0 siblings, 1 reply; 3+ messages in thread
From: Scott Mayhew @ 2015-08-17 13:35 UTC (permalink / raw)
  To: Kinglong Mee; +Cc: Trond Myklebust, linux-nfs@vger.kernel.org

On Mon, 17 Aug 2015, Kinglong Mee wrote:

> When testing nfs4.0 lock, I get WARN_ON message as below,
> [  278.733391] ------------[ cut here ]------------
> [  278.734496] WARNING: CPU: 0 PID: 263 at /root/linux-nfs/net/sunrpc/sched.c:692
> rpc_exit_task+0x85/0x90 [sunrpc]()
> [  278.736771] Modules linked in: nfsv4(OE) nfs(OE) nfsd(E) bnep bluetooth ip6t_rpfilter ip6t_REJECT
> xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
> ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs ppdev xfs
> microcode xor serio_raw libcrc32c pcspkr raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core
> auth_rpcgss nfs_acl lockd sunrpc(OE) ata_generic pata_acpi [last unloaded: nfsd]
> [  278.751462] CPU: 0 PID: 263 Comm: kworker/0:4 Tainted: G           OE 3.16.0-rc2+ #28
> [  278.753364] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform,
> BIOS 6.00 07/31/2013
> [  278.755710] Workqueue: rpciod rpc_async_schedule [sunrpc]
> [  278.757112]  0000000000000000 00000000f6173352 ffff880072e5fd30 ffffffff816eacfa
> [  278.758811]  0000000000000000 ffff880072e5fd68 ffffffff8108bf3d ffff88005997f180
> [  278.760527]  ffff88003ae44830 ffffffffa001a860 ffffffffa001a860 0000000000000000
> [  278.762271] Call Trace:
> [  278.762802]  [<ffffffff816eacfa>] dump_stack+0x45/0x56
> [  278.763906]  [<ffffffff8108bf3d>] warn_slowpath_common+0x7d/0xa0
> [  278.765419]  [<ffffffffa001a860>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
> [  278.767243]  [<ffffffffa001a860>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
> [  278.768758]  [<ffffffff8108c06a>] warn_slowpath_null+0x1a/0x20
> [  278.770080]  [<ffffffffa001a8e5>] rpc_exit_task+0x85/0x90 [sunrpc]
> [  278.771439]  [<ffffffffa001b204>] __rpc_execute+0x94/0x420 [sunrpc]
> [  278.772829]  [<ffffffffa001b5b6>] rpc_async_schedule+0x26/0x30 [sunrpc]
> [  278.774276]  [<ffffffff810a7f95>] process_one_work+0x175/0x430
> [  278.775573]  [<ffffffff810a866d>] worker_thread+0x11d/0x530
> [  278.776824]  [<ffffffff810a8550>] ? rescuer_thread+0x300/0x300
> [  278.778069]  [<ffffffff810aee18>] kthread+0xd8/0xf0
> [  278.779114]  [<ffffffff810aed40>] ? insert_kthread_work+0x40/0x40
> [  278.780448]  [<ffffffff816f22bc>] ret_from_fork+0x7c/0xb0
> [  278.781621]  [<ffffffff810aed40>] ? insert_kthread_work+0x40/0x40
> [  278.782972] ---[ end trace 3350394592832e42 ]---
> 
> task1 (lock/unlock):
>   --> call_decode
>     --> rpc_exit_task (-10008)
>       --> nfs4_release_lockowner_done
>         --> rpc_restart_call_prepare
>             task->tk_action = call_start;
> task2 (umount):
>       --> rpc_shutdown_client
>         --> rpc_killall_tasks
>             rovr->tk_flags |= RPC_TASK_KILLED; (set task1 to KILLED)
> task1:
>     --> rpc_exit_task
>         WARN_ON(RPC_ASSASSINATED(task));
> 
> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
> ---
>  net/sunrpc/sched.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index 337ca85..562db0a 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -689,7 +689,14 @@ void rpc_exit_task(struct rpc_task *task)
>  	if (task->tk_ops->rpc_call_done != NULL) {
>  		task->tk_ops->rpc_call_done(task, task->tk_calldata);
>  		if (task->tk_action != NULL) {
> -			WARN_ON(RPC_ASSASSINATED(task));
> +			/*
> +			 * A known case here is,
> +			 * The RPC_TASK_KILLED is set in rpc_killall_tasks by
> +			 * another process.
> +			 */
> +			if (RPC_ASSASSINATED(task))
> +				printk(KERN_WARNING "RPC: dead task tried to"
> +					" walk away.\n");
>  			/* Always release the RPC slot and buffer memory */
>  			xprt_release(task);
>  			rpc_reset_task_statistics(task);
> -- 
> 2.4.3
> 

If this is going to be changed anyway, could we maybe have some more
useful debugging information added to the message?  Something like:

        printk(KERN_WARNING "RPC: %s to %s was killed\n", 
                task->tk_msg.rpc_proc->p_name, 
                rpc_peeraddr2str(task->tk_client, RPC_DISPLAY_ADDR));

I recently had a case where this warning was popping several times per
minute.  The machine was both a client and a server... it turns out
that it was actually CB_NULLs that were causing the warning, and the
trigger for that was that several clients had the same value for the
clientaddr mount option.

Having some more useful info from the warning message would have saved
some time there.

-Scott

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] SUNRPC: Change WARN_ON to an warn message in rpc_exit_task
  2015-08-17 13:35 ` Scott Mayhew
@ 2015-08-18  6:25   ` Kinglong Mee
  0 siblings, 0 replies; 3+ messages in thread
From: Kinglong Mee @ 2015-08-18  6:25 UTC (permalink / raw)
  To: Scott Mayhew, Trond Myklebust; +Cc: linux-nfs@vger.kernel.org, kinglongmee

On 8/17/2015 21:35, Scott Mayhew wrote:
> On Mon, 17 Aug 2015, Kinglong Mee wrote:
> 
>> When testing nfs4.0 lock, I get WARN_ON message as below,
>> [  278.733391] ------------[ cut here ]------------
>> [  278.734496] WARNING: CPU: 0 PID: 263 at /root/linux-nfs/net/sunrpc/sched.c:692
>> rpc_exit_task+0x85/0x90 [sunrpc]()
>> [  278.736771] Modules linked in: nfsv4(OE) nfs(OE) nfsd(E) bnep bluetooth ip6t_rpfilter ip6t_REJECT
>> xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
>> ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
>> ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs ppdev xfs
>> microcode xor serio_raw libcrc32c pcspkr raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core
>> auth_rpcgss nfs_acl lockd sunrpc(OE) ata_generic pata_acpi [last unloaded: nfsd]
>> [  278.751462] CPU: 0 PID: 263 Comm: kworker/0:4 Tainted: G           OE 3.16.0-rc2+ #28
>> [  278.753364] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform,
>> BIOS 6.00 07/31/2013
>> [  278.755710] Workqueue: rpciod rpc_async_schedule [sunrpc]
>> [  278.757112]  0000000000000000 00000000f6173352 ffff880072e5fd30 ffffffff816eacfa
>> [  278.758811]  0000000000000000 ffff880072e5fd68 ffffffff8108bf3d ffff88005997f180
>> [  278.760527]  ffff88003ae44830 ffffffffa001a860 ffffffffa001a860 0000000000000000
>> [  278.762271] Call Trace:
>> [  278.762802]  [<ffffffff816eacfa>] dump_stack+0x45/0x56
>> [  278.763906]  [<ffffffff8108bf3d>] warn_slowpath_common+0x7d/0xa0
>> [  278.765419]  [<ffffffffa001a860>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>> [  278.767243]  [<ffffffffa001a860>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>> [  278.768758]  [<ffffffff8108c06a>] warn_slowpath_null+0x1a/0x20
>> [  278.770080]  [<ffffffffa001a8e5>] rpc_exit_task+0x85/0x90 [sunrpc]
>> [  278.771439]  [<ffffffffa001b204>] __rpc_execute+0x94/0x420 [sunrpc]
>> [  278.772829]  [<ffffffffa001b5b6>] rpc_async_schedule+0x26/0x30 [sunrpc]
>> [  278.774276]  [<ffffffff810a7f95>] process_one_work+0x175/0x430
>> [  278.775573]  [<ffffffff810a866d>] worker_thread+0x11d/0x530
>> [  278.776824]  [<ffffffff810a8550>] ? rescuer_thread+0x300/0x300
>> [  278.778069]  [<ffffffff810aee18>] kthread+0xd8/0xf0
>> [  278.779114]  [<ffffffff810aed40>] ? insert_kthread_work+0x40/0x40
>> [  278.780448]  [<ffffffff816f22bc>] ret_from_fork+0x7c/0xb0
>> [  278.781621]  [<ffffffff810aed40>] ? insert_kthread_work+0x40/0x40
>> [  278.782972] ---[ end trace 3350394592832e42 ]---
>>
>> task1 (lock/unlock):
>>   --> call_decode
>>     --> rpc_exit_task (-10008)
>>       --> nfs4_release_lockowner_done
>>         --> rpc_restart_call_prepare
>>             task->tk_action = call_start;
>> task2 (umount):
>>       --> rpc_shutdown_client
>>         --> rpc_killall_tasks
>>             rovr->tk_flags |= RPC_TASK_KILLED; (set task1 to KILLED)
>> task1:
>>     --> rpc_exit_task
>>         WARN_ON(RPC_ASSASSINATED(task));
>>
>> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
>> ---
>>  net/sunrpc/sched.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
>> index 337ca85..562db0a 100644
>> --- a/net/sunrpc/sched.c
>> +++ b/net/sunrpc/sched.c
>> @@ -689,7 +689,14 @@ void rpc_exit_task(struct rpc_task *task)
>>  	if (task->tk_ops->rpc_call_done != NULL) {
>>  		task->tk_ops->rpc_call_done(task, task->tk_calldata);
>>  		if (task->tk_action != NULL) {
>> -			WARN_ON(RPC_ASSASSINATED(task));
>> +			/*
>> +			 * A known case here is,
>> +			 * The RPC_TASK_KILLED is set in rpc_killall_tasks by
>> +			 * another process.
>> +			 */
>> +			if (RPC_ASSASSINATED(task))
>> +				printk(KERN_WARNING "RPC: dead task tried to"
>> +					" walk away.\n");
>>  			/* Always release the RPC slot and buffer memory */
>>  			xprt_release(task);
>>  			rpc_reset_task_statistics(task);
>> -- 
>> 2.4.3
>>
> 
> If this is going to be changed anyway, could we maybe have some more
> useful debugging information added to the message?  Something like:
> 
>         printk(KERN_WARNING "RPC: %s to %s was killed\n", 
>                 task->tk_msg.rpc_proc->p_name, 
>                 rpc_peeraddr2str(task->tk_client, RPC_DISPLAY_ADDR));
> 
> I recently had a case where this warning was popping several times per
> minute.  The machine was both a client and a server... it turns out
> that it was actually CB_NULLs that were causing the warning, and the
> trigger for that was that several clients had the same value for the
> clientaddr mount option.
> 
> Having some more useful info from the warning message would have saved
> some time there.

If it's harmless of this case, remove the WARN_ON directly maybe better.
What's your opinion, Trond? 

thanks,
Kinglong Mee

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-08-18  6:25 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-17 11:08 [PATCH] SUNRPC: Change WARN_ON to an warn message in rpc_exit_task Kinglong Mee
2015-08-17 13:35 ` Scott Mayhew
2015-08-18  6:25   ` Kinglong Mee

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.