* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
[not found] <38b2ab8a0909290109m3f82c161j4fb0f1266152877e@mail.gmail.com>
@ 2009-09-29 9:18 ` Eric Dumazet
2009-09-29 9:29 ` Francis Moreau
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Eric Dumazet @ 2009-09-29 9:18 UTC (permalink / raw)
To: Francis Moreau
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Francis Moreau a écrit :
> Hello,
>
> I got this kernel warning when stopping nfsd:
>
> [260104.553720] WARNING: at net/ipv4/af_inet.c:154
> inet_sock_destruct+0x164/0x182()
> [260104.553722] Hardware name: P5K-VM
> [260104.553724] Modules linked in: jfs loop nfsd lockd nfs_acl
> auth_rpcgss exportfs sunrpc [last unloaded: microcode]
> [260104.553736] Pid: 858, comm: nfsd Tainted: G M 2.6.31 #13
> [260104.553738] Call Trace:
> [260104.553743] [<ffffffff813ed53a>] ? inet_sock_destruct+0x164/0x182
> [260104.553748] [<ffffffff81044471>] warn_slowpath_common+0x7c/0xa9
> [260104.553751] [<ffffffff810444b2>] warn_slowpath_null+0x14/0x16
> [260104.553754] [<ffffffff813ed53a>] inet_sock_destruct+0x164/0x182
> [260104.553759] [<ffffffff8138e1c0>] __sk_free+0x23/0xe7
> [260104.553762] [<ffffffff8138e2fd>] sk_free+0x1f/0x21
> [260104.553765] [<ffffffff8138e3c7>] sk_common_release+0xc8/0xcd
> [260104.553769] [<ffffffff813e4459>] udp_lib_close+0xe/0x10
> [260104.553772] [<ffffffff813ecfe2>] inet_release+0x55/0x5c
> [260104.553775] [<ffffffff8138b746>] sock_release+0x1f/0x71
> [260104.553778] [<ffffffff8138b7bf>] sock_close+0x27/0x2b
> [260104.553782] [<ffffffff810d0641>] __fput+0xfb/0x1c0
> [260104.553787] [<ffffffff8104a197>] ? local_bh_disable+0x12/0x14
> [260104.553790] [<ffffffff810d0723>] fput+0x1d/0x1f
> [260104.553810] [<ffffffffa0014035>] svc_sock_free+0x40/0x56 [sunrpc]
> [260104.553827] [<ffffffffa001dea0>] svc_xprt_free+0x43/0x53 [sunrpc]
> [260104.553843] [<ffffffffa001de5d>] ? svc_xprt_free+0x0/0x53 [sunrpc]
> [260104.553847] [<ffffffff811b4641>] kref_put+0x43/0x4f
> [260104.553863] [<ffffffffa001d224>] svc_close_xprt+0x55/0x5e [sunrpc]
> [260104.553879] [<ffffffffa001d27d>] svc_close_all+0x50/0x69 [sunrpc]
> [260104.553894] [<ffffffffa0012922>] svc_destroy+0x9e/0x142 [sunrpc]
> [260104.553910] [<ffffffffa0012a7f>] svc_exit_thread+0xb9/0xc2 [sunrpc]
> [260104.553922] [<ffffffffa00707b1>] ? nfsd+0x0/0x151 [nfsd]
> [260104.553932] [<ffffffffa00708e8>] nfsd+0x137/0x151 [nfsd]
> [260104.553936] [<ffffffff8105ad28>] kthread+0x94/0x9c
> [260104.553941] [<ffffffff8100c1fa>] child_rip+0xa/0x20
> [260104.553944] [<ffffffff81047b00>] ? do_exit+0x5d7/0x691
> [260104.553948] [<ffffffff81039cf8>] ? finish_task_switch+0x6a/0xc7
> [260104.553953] [<ffffffff8100bb6d>] ? restore_args+0x0/0x30
> [260104.553956] [<ffffffff8105ac94>] ? kthread+0x0/0x9c
> [260104.553959] [<ffffffff8100c1f0>] ? child_rip+0x0/0x20
>
> It happens on 2.6.31 and older kernels as well though I don't remember
> when it really started.
Could you please try following patch ?
Thanks
[PATCH] net: Fix sock_wfree() race
Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
(net: No more expensive sock_hold()/sock_put() on each tx)
opens a window in sock_wfree() where another cpu
might free the socket we are working on.
A fix is to call sk->sk_write_space(sk) while still
holding a reference on sk.
Reported-by: Jike Song <albcamus@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/core/sock.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 30d5446..e1f034e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1228,17 +1228,22 @@ void __init sk_init(void)
void sock_wfree(struct sk_buff *skb)
{
struct sock *sk = skb->sk;
- int res;
+ unsigned int len = skb->truesize;
- /* In case it might be waiting for more memory. */
- res = atomic_sub_return(skb->truesize, &sk->sk_wmem_alloc);
- if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE))
+ if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE)) {
+ /*
+ * Keep a reference on sk_wmem_alloc, this will be released
+ * after sk_write_space() call
+ */
+ atomic_sub(len - 1, &sk->sk_wmem_alloc);
sk->sk_write_space(sk);
+ len = 1;
+ }
/*
- * if sk_wmem_alloc reached 0, we are last user and should
- * free this sock, as sk_free() call could not do it.
+ * if sk_wmem_alloc reaches 0, we must finish what sk_free()
+ * could not do because of in-flight packets
*/
- if (res == 0)
+ if (atomic_sub_and_test(len, &sk->sk_wmem_alloc))
__sk_free(sk);
}
EXPORT_SYMBOL(sock_wfree);
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-09-29 9:18 ` WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct Eric Dumazet
@ 2009-09-29 9:29 ` Francis Moreau
2009-09-30 11:40 ` Francis Moreau
2009-10-30 8:44 ` Francis Moreau
2 siblings, 0 replies; 15+ messages in thread
From: Francis Moreau @ 2009-09-29 9:29 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
On Tue, Sep 29, 2009 at 11:18 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Francis Moreau a écrit :
>>
>> It happens on 2.6.31 and older kernels as well though I don't remember
>> when it really started.
>
> Could you please try following patch ?
I'll report back the result at the end of the day (ie in 8 hours).
Thanks
--
Francis
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-09-29 9:18 ` WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct Eric Dumazet
2009-09-29 9:29 ` Francis Moreau
@ 2009-09-30 11:40 ` Francis Moreau
2009-10-30 8:44 ` Francis Moreau
2 siblings, 0 replies; 15+ messages in thread
From: Francis Moreau @ 2009-09-30 11:40 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
On Tue, Sep 29, 2009 at 11:18 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Francis Moreau a écrit :
>> Hello,
>>
>> I got this kernel warning when stopping nfsd:
>>
>> [260104.553720] WARNING: at net/ipv4/af_inet.c:154
>> inet_sock_destruct+0x164/0x182()
>> [260104.553722] Hardware name: P5K-VM
>> [260104.553724] Modules linked in: jfs loop nfsd lockd nfs_acl
>> auth_rpcgss exportfs sunrpc [last unloaded: microcode]
>> [260104.553736] Pid: 858, comm: nfsd Tainted: G M 2.6.31 #13
>> [260104.553738] Call Trace:
>> [260104.553743] [<ffffffff813ed53a>] ? inet_sock_destruct+0x164/0x182
>> [260104.553748] [<ffffffff81044471>] warn_slowpath_common+0x7c/0xa9
>> [260104.553751] [<ffffffff810444b2>] warn_slowpath_null+0x14/0x16
>> [260104.553754] [<ffffffff813ed53a>] inet_sock_destruct+0x164/0x182
>> [260104.553759] [<ffffffff8138e1c0>] __sk_free+0x23/0xe7
>> [260104.553762] [<ffffffff8138e2fd>] sk_free+0x1f/0x21
>> [260104.553765] [<ffffffff8138e3c7>] sk_common_release+0xc8/0xcd
>> [260104.553769] [<ffffffff813e4459>] udp_lib_close+0xe/0x10
>> [260104.553772] [<ffffffff813ecfe2>] inet_release+0x55/0x5c
>> [260104.553775] [<ffffffff8138b746>] sock_release+0x1f/0x71
>> [260104.553778] [<ffffffff8138b7bf>] sock_close+0x27/0x2b
>> [260104.553782] [<ffffffff810d0641>] __fput+0xfb/0x1c0
>> [260104.553787] [<ffffffff8104a197>] ? local_bh_disable+0x12/0x14
>> [260104.553790] [<ffffffff810d0723>] fput+0x1d/0x1f
>> [260104.553810] [<ffffffffa0014035>] svc_sock_free+0x40/0x56 [sunrpc]
>> [260104.553827] [<ffffffffa001dea0>] svc_xprt_free+0x43/0x53 [sunrpc]
>> [260104.553843] [<ffffffffa001de5d>] ? svc_xprt_free+0x0/0x53 [sunrpc]
>> [260104.553847] [<ffffffff811b4641>] kref_put+0x43/0x4f
>> [260104.553863] [<ffffffffa001d224>] svc_close_xprt+0x55/0x5e [sunrpc]
>> [260104.553879] [<ffffffffa001d27d>] svc_close_all+0x50/0x69 [sunrpc]
>> [260104.553894] [<ffffffffa0012922>] svc_destroy+0x9e/0x142 [sunrpc]
>> [260104.553910] [<ffffffffa0012a7f>] svc_exit_thread+0xb9/0xc2 [sunrpc]
>> [260104.553922] [<ffffffffa00707b1>] ? nfsd+0x0/0x151 [nfsd]
>> [260104.553932] [<ffffffffa00708e8>] nfsd+0x137/0x151 [nfsd]
>> [260104.553936] [<ffffffff8105ad28>] kthread+0x94/0x9c
>> [260104.553941] [<ffffffff8100c1fa>] child_rip+0xa/0x20
>> [260104.553944] [<ffffffff81047b00>] ? do_exit+0x5d7/0x691
>> [260104.553948] [<ffffffff81039cf8>] ? finish_task_switch+0x6a/0xc7
>> [260104.553953] [<ffffffff8100bb6d>] ? restore_args+0x0/0x30
>> [260104.553956] [<ffffffff8105ac94>] ? kthread+0x0/0x9c
>> [260104.553959] [<ffffffff8100c1f0>] ? child_rip+0x0/0x20
>>
>> It happens on 2.6.31 and older kernels as well though I don't remember
>> when it really started.
>
> Could you please try following patch ?
>
No trace of this bug has been seen so far.
thanks
--
Francis
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-09-29 9:18 ` WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct Eric Dumazet
2009-09-29 9:29 ` Francis Moreau
2009-09-30 11:40 ` Francis Moreau
@ 2009-10-30 8:44 ` Francis Moreau
2009-10-30 9:41 ` Eric Dumazet
2009-10-30 11:27 ` Eric Dumazet
2 siblings, 2 replies; 15+ messages in thread
From: Francis Moreau @ 2009-10-30 8:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Hello Eric,
It seems I still have a related bug, please have a look to the following oops.
This happened on a 2.6.32-rc5 where your patch is included.
[107304.558821] nfsd: last server has exited, flushing export cache
[107304.558848] ------------[ cut here ]------------
[107304.558858] WARNING: at net/ipv4/af_inet.c:153
inet_sock_destruct+0x161/0x17c()
[107304.558862] Hardware name: P5K-VM
[107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd
nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode]
[107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25
[107304.558892] Call Trace:
[107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c
[107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9
[107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16
[107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c
[107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7
[107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21
[107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd
[107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10
[107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c
[107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71
[107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b
[107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0
[107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f
[107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc]
[107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc]
[107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc]
[107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f
[107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc]
[107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc]
[107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc]
[107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc]
[107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd]
[107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd]
[107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a
[107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20
[107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30
[107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a
[107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20
[107304.559191] ---[ end trace c107131f4762168c ]---
[107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
recovery directory
[107304.932765] NFSD: starting 90-second grace period
--
Francis
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 8:44 ` Francis Moreau
@ 2009-10-30 9:41 ` Eric Dumazet
2009-10-30 11:27 ` Eric Dumazet
1 sibling, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2009-10-30 9:41 UTC (permalink / raw)
To: Francis Moreau
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Francis Moreau a écrit :
> Hello Eric,
>
> It seems I still have a related bug, please have a look to the following oops.
>
> This happened on a 2.6.32-rc5 where your patch is included.
>
> [107304.558821] nfsd: last server has exited, flushing export cache
> [107304.558848] ------------[ cut here ]------------
> [107304.558858] WARNING: at net/ipv4/af_inet.c:153
> inet_sock_destruct+0x161/0x17c()
> [107304.558862] Hardware name: P5K-VM
> [107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd
> nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode]
> [107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25
> [107304.558892] Call Trace:
> [107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c
> [107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9
> [107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16
> [107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c
> [107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7
> [107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21
> [107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd
> [107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10
> [107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c
> [107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71
> [107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b
> [107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0
> [107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f
> [107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc]
> [107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc]
> [107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc]
> [107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f
> [107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc]
> [107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc]
> [107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc]
> [107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc]
> [107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd]
> [107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd]
> [107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a
> [107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20
> [107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30
> [107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a
> [107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20
> [107304.559191] ---[ end trace c107131f4762168c ]---
> [107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
> recovery directory
> [107304.932765] NFSD: starting 90-second grace period
>
Thanks Francis, I think I found the problem.
I am preparing a patch, test it, and submit it in couple of hours
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 8:44 ` Francis Moreau
2009-10-30 9:41 ` Eric Dumazet
@ 2009-10-30 11:27 ` Eric Dumazet
2009-10-30 12:33 ` Francis Moreau
2009-10-30 15:03 ` Eric Dumazet
1 sibling, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2009-10-30 11:27 UTC (permalink / raw)
To: Francis Moreau
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Francis Moreau a écrit :
> Hello Eric,
>
> It seems I still have a related bug, please have a look to the following oops.
>
> This happened on a 2.6.32-rc5 where your patch is included.
>
> [107304.558821] nfsd: last server has exited, flushing export cache
> [107304.558848] ------------[ cut here ]------------
> [107304.558858] WARNING: at net/ipv4/af_inet.c:153
> inet_sock_destruct+0x161/0x17c()
> [107304.558862] Hardware name: P5K-VM
> [107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd
> nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode]
> [107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25
> [107304.558892] Call Trace:
> [107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c
> [107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9
> [107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16
> [107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c
> [107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7
> [107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21
> [107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd
> [107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10
> [107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c
> [107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71
> [107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b
> [107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0
> [107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f
> [107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc]
> [107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc]
> [107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc]
> [107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f
> [107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc]
> [107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc]
> [107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc]
> [107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc]
> [107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd]
> [107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd]
> [107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a
> [107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20
> [107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30
> [107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a
> [107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20
> [107304.559191] ---[ end trace c107131f4762168c ]---
> [107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
> recovery directory
> [107304.932765] NFSD: starting 90-second grace period
>
This oops occurring again and again with SUNRPC finally gave me the right pointer.
David, we added two years ago memory accounting to UDP, and this changed
requirements about calling skb_free_datagram() in the right context.
I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
Francis, would you please test following patch ?
Thanks
[PATCH] net: fix sk_forward_alloc corruption
On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.
Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/linux/skbuff.h | 2 ++
net/core/datagram.c | 10 +++++++++-
net/ipv4/udp.c | 4 +---
net/ipv6/udp.c | 4 +---
net/sunrpc/svcsock.c | 10 +++++-----
net/sunrpc/xprtsock.c | 2 +-
6 files changed, 19 insertions(+), 13 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index df7b23a..266878f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1757,6 +1757,8 @@ extern int skb_copy_datagram_const_iovec(const struct sk_buff *from,
int to_offset,
int size);
extern void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
+extern void skb_free_datagram_locked(struct sock *sk,
+ struct sk_buff *skb);
extern int skb_kill_datagram(struct sock *sk, struct sk_buff *skb,
unsigned int flags);
extern __wsum skb_checksum(const struct sk_buff *skb, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 1c6cf3a..4ade301 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -224,6 +224,15 @@ void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
consume_skb(skb);
sk_mem_reclaim_partial(sk);
}
+EXPORT_SYMBOL(skb_free_datagram);
+
+void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb)
+{
+ lock_sock(sk);
+ skb_free_datagram(sk, skb);
+ release_sock(sk);
+}
+EXPORT_SYMBOL(skb_free_datagram_locked);
/**
* skb_kill_datagram - Free a datagram skbuff forcibly
@@ -752,5 +761,4 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
EXPORT_SYMBOL(datagram_poll);
EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec);
EXPORT_SYMBOL(skb_copy_datagram_iovec);
-EXPORT_SYMBOL(skb_free_datagram);
EXPORT_SYMBOL(skb_recv_datagram);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d0d436d..0fa9f70 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -999,9 +999,7 @@ try_again:
err = ulen;
out_free:
- lock_sock(sk);
- skb_free_datagram(sk, skb);
- release_sock(sk);
+ skb_free_datagram_locked(sk, skb);
out:
return err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3a60f12..cf538ed 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -288,9 +288,7 @@ try_again:
err = ulen;
out_free:
- lock_sock(sk);
- skb_free_datagram(sk, skb);
- release_sock(sk);
+ skb_free_datagram_locked(sk, skb);
out:
return err;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index ccc5e83..1c246a4 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -111,7 +111,7 @@ static void svc_release_skb(struct svc_rqst *rqstp)
rqstp->rq_xprt_ctxt = NULL;
dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
}
}
@@ -578,7 +578,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
"svc: received unknown control message %d/%d; "
"dropping RPC reply datagram\n",
cmh->cmsg_level, cmh->cmsg_type);
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
@@ -588,18 +588,18 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
if (csum_partial_copy_to_xdr(&rqstp->rq_arg, skb)) {
local_bh_enable();
/* checksum error */
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
local_bh_enable();
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
} else {
/* we can use it in-place */
rqstp->rq_arg.head[0].iov_base = skb->data +
sizeof(struct udphdr);
rqstp->rq_arg.head[0].iov_len = len;
if (skb_checksum_complete(skb)) {
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
rqstp->rq_xprt_ctxt = skb;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 37c5475..d61be4a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -883,7 +883,7 @@ static void xs_udp_data_ready(struct sock *sk, int len)
out_unlock:
spin_unlock(&xprt->transport_lock);
dropit:
- skb_free_datagram(sk, skb);
+ skb_free_datagram_locked(sk, skb);
out:
read_unlock(&sk->sk_callback_lock);
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 11:27 ` Eric Dumazet
@ 2009-10-30 12:33 ` Francis Moreau
2009-10-30 13:41 ` Francis Moreau
2009-10-30 15:03 ` Eric Dumazet
1 sibling, 1 reply; 15+ messages in thread
From: Francis Moreau @ 2009-10-30 12:33 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
On Fri, Oct 30, 2009 at 12:27 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>
> David, we added two years ago memory accounting to UDP, and this changed
> requirements about calling skb_free_datagram() in the right context.
>
> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>
> Francis, would you please test following patch ?
I'm applying it and testing it during a couple of days and see if
something wrong still happens.
Since I have no specific test case, I'll report if this oops happen
again after a couple of days (probably on next Wednesday).
--
Francis
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 12:33 ` Francis Moreau
@ 2009-10-30 13:41 ` Francis Moreau
2009-10-30 14:55 ` Eric Dumazet
0 siblings, 1 reply; 15+ messages in thread
From: Francis Moreau @ 2009-10-30 13:41 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
On Fri, Oct 30, 2009 at 1:33 PM, Francis Moreau <francis.moro@gmail.com> wrote:
> On Fri, Oct 30, 2009 at 12:27 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>>
>> David, we added two years ago memory accounting to UDP, and this changed
>> requirements about calling skb_free_datagram() in the right context.
>>
>> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>>
>> Francis, would you please test following patch ?
>
> I'm applying it and testing it during a couple of days and see if
> something wrong still happens.
Hmm, with the patch applied on a 2.6.32-rc5, my machine locks hard
when starting nfsd.
--
Francis
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 13:41 ` Francis Moreau
@ 2009-10-30 14:55 ` Eric Dumazet
0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2009-10-30 14:55 UTC (permalink / raw)
To: Francis Moreau
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Francis Moreau a écrit :
> On Fri, Oct 30, 2009 at 1:33 PM, Francis Moreau <francis.moro@gmail.com> wrote:
>> On Fri, Oct 30, 2009 at 12:27 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>>>
>>> David, we added two years ago memory accounting to UDP, and this changed
>>> requirements about calling skb_free_datagram() in the right context.
>>>
>>> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>>>
>>> Francis, would you please test following patch ?
>> I'm applying it and testing it during a couple of days and see if
>> something wrong still happens.
>
> Hmm, with the patch applied on a 2.6.32-rc5, my machine locks hard
> when starting nfsd.
>
Please ignore/revert the last part of the patch (about net/sunrpc/xprtsock.c)
In xs_udp_data_ready() we really want to call skb_free_datagram(),
not skb_free_datagram_locked(), because socket is already locked.
Thanks
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 37c5475..d61be4a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -883,7 +883,7 @@ static void xs_udp_data_ready(struct sock *sk, int len)
out_unlock:
spin_unlock(&xprt->transport_lock);
dropit:
+ skb_free_datagram(sk, skb);
- skb_free_datagram_locked(sk, skb);
out:
read_unlock(&sk->sk_callback_lock);
}
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 11:27 ` Eric Dumazet
2009-10-30 12:33 ` Francis Moreau
@ 2009-10-30 15:03 ` Eric Dumazet
2009-10-30 19:26 ` David Miller
2009-11-06 8:45 ` Francis Moreau
1 sibling, 2 replies; 15+ messages in thread
From: Eric Dumazet @ 2009-10-30 15:03 UTC (permalink / raw)
To: Francis Moreau
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Eric Dumazet a écrit :
> Francis Moreau a écrit :
>> Hello Eric,
>>
>> It seems I still have a related bug, please have a look to the following oops.
>>
>> This happened on a 2.6.32-rc5 where your patch is included.
>>
>> [107304.558821] nfsd: last server has exited, flushing export cache
>> [107304.558848] ------------[ cut here ]------------
>> [107304.558858] WARNING: at net/ipv4/af_inet.c:153
>> inet_sock_destruct+0x161/0x17c()
>> [107304.558862] Hardware name: P5K-VM
>> [107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd
>> nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode]
>> [107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25
>> [107304.558892] Call Trace:
>> [107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c
>> [107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9
>> [107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16
>> [107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c
>> [107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7
>> [107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21
>> [107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd
>> [107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10
>> [107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c
>> [107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71
>> [107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b
>> [107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0
>> [107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f
>> [107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc]
>> [107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc]
>> [107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc]
>> [107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f
>> [107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc]
>> [107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc]
>> [107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc]
>> [107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc]
>> [107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd]
>> [107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd]
>> [107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a
>> [107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20
>> [107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30
>> [107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a
>> [107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20
>> [107304.559191] ---[ end trace c107131f4762168c ]---
>> [107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
>> recovery directory
>> [107304.932765] NFSD: starting 90-second grace period
>>
>
> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>
> David, we added two years ago memory accounting to UDP, and this changed
> requirements about calling skb_free_datagram() in the right context.
>
> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>
> Francis, would you please test following patch ?
Here is the updated patch (against linux-2.6) , in case other people are interested to test it.
[PATCH take2] net: fix sk_forward_alloc corruption
On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.
Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/linux/skbuff.h | 2 ++
net/core/datagram.c | 10 +++++++++-
net/ipv4/udp.c | 4 +---
net/ipv6/udp.c | 4 +---
net/sunrpc/svcsock.c | 10 +++++-----
5 files changed, 18 insertions(+), 12 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index df7b23a..266878f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1757,6 +1757,8 @@ extern int skb_copy_datagram_const_iovec(const struct sk_buff *from,
int to_offset,
int size);
extern void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
+extern void skb_free_datagram_locked(struct sock *sk,
+ struct sk_buff *skb);
extern int skb_kill_datagram(struct sock *sk, struct sk_buff *skb,
unsigned int flags);
extern __wsum skb_checksum(const struct sk_buff *skb, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 1c6cf3a..4ade301 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -224,6 +224,15 @@ void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
consume_skb(skb);
sk_mem_reclaim_partial(sk);
}
+EXPORT_SYMBOL(skb_free_datagram);
+
+void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb)
+{
+ lock_sock(sk);
+ skb_free_datagram(sk, skb);
+ release_sock(sk);
+}
+EXPORT_SYMBOL(skb_free_datagram_locked);
/**
* skb_kill_datagram - Free a datagram skbuff forcibly
@@ -752,5 +761,4 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
EXPORT_SYMBOL(datagram_poll);
EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec);
EXPORT_SYMBOL(skb_copy_datagram_iovec);
-EXPORT_SYMBOL(skb_free_datagram);
EXPORT_SYMBOL(skb_recv_datagram);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d0d436d..0fa9f70 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -999,9 +999,7 @@ try_again:
err = ulen;
out_free:
- lock_sock(sk);
- skb_free_datagram(sk, skb);
- release_sock(sk);
+ skb_free_datagram_locked(sk, skb);
out:
return err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3a60f12..cf538ed 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -288,9 +288,7 @@ try_again:
err = ulen;
out_free:
- lock_sock(sk);
- skb_free_datagram(sk, skb);
- release_sock(sk);
+ skb_free_datagram_locked(sk, skb);
out:
return err;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index ccc5e83..1c246a4 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -111,7 +111,7 @@ static void svc_release_skb(struct svc_rqst *rqstp)
rqstp->rq_xprt_ctxt = NULL;
dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
}
}
@@ -578,7 +578,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
"svc: received unknown control message %d/%d; "
"dropping RPC reply datagram\n",
cmh->cmsg_level, cmh->cmsg_type);
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
@@ -588,18 +588,18 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
if (csum_partial_copy_to_xdr(&rqstp->rq_arg, skb)) {
local_bh_enable();
/* checksum error */
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
local_bh_enable();
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
} else {
/* we can use it in-place */
rqstp->rq_arg.head[0].iov_base = skb->data +
sizeof(struct udphdr);
rqstp->rq_arg.head[0].iov_len = len;
if (skb_checksum_complete(skb)) {
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
rqstp->rq_xprt_ctxt = skb;
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 15:03 ` Eric Dumazet
@ 2009-10-30 19:26 ` David Miller
2009-10-31 9:20 ` Eric Dumazet
2009-11-06 8:45 ` Francis Moreau
1 sibling, 1 reply; 15+ messages in thread
From: David Miller @ 2009-10-30 19:26 UTC (permalink / raw)
To: eric.dumazet; +Cc: francis.moro, linux-kernel, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 30 Oct 2009 16:03:53 +0100
> [PATCH take2] net: fix sk_forward_alloc corruption
>
> On UDP sockets, we must call skb_free_datagram() with socket locked,
> or risk sk_forward_alloc corruption. This requirement is not respected
> in SUNRPC.
>
> Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
>
> Reported-by: Francis Moreau <francis.moro@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
I've tentatively applied this to my net-2.6 tree, I won't
push it out until we have positive testing results.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 19:26 ` David Miller
@ 2009-10-31 9:20 ` Eric Dumazet
0 siblings, 0 replies; 15+ messages in thread
From: Eric Dumazet @ 2009-10-31 9:20 UTC (permalink / raw)
To: David Miller; +Cc: francis.moro, linux-kernel, netdev
David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 30 Oct 2009 16:03:53 +0100
>
>> [PATCH take2] net: fix sk_forward_alloc corruption
>>
>> On UDP sockets, we must call skb_free_datagram() with socket locked,
>> or risk sk_forward_alloc corruption. This requirement is not respected
>> in SUNRPC.
>>
>> Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
>>
>> Reported-by: Francis Moreau <francis.moro@gmail.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> I've tentatively applied this to my net-2.6 tree, I won't
> push it out until we have positive testing results.
I tested nfs/nfsd with fillowing additional debugging patch.
Without the fix (net: fix sk_forward_alloc corruption), I got warnings,
while with fix applied, I got no warnings. Its probably hard to hit
original bug since its a very small race window.
Trace without fix :
[ 226.686788] RPC: Registered udp transport module.
[ 226.686791] RPC: Registered tcp transport module.
[ 226.686792] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 226.851677] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 226.869381] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 226.869443] NFSD: unable to find recovery directory /var/lib/nfs/v4recovery
[ 226.869488] NFSD: starting 90-second grace period
[ 226.870564] svc: failed to register lockdv1 RPC service (errno 97).
[ 244.401237] ------------[ cut here ]------------
[ 244.401288] WARNING: at include/net/sock.h:886 sock_rfree+0x54/0x70()
[ 244.401334] Hardware name: ProLiant BL460c G1
[ 244.401379] Modules linked in: nfsd lockd auth_rpcgss sunrpc exportfs bonding ipv6
[ 244.401699] Pid: 5295, comm: nfsd Not tainted 2.6.32-rc3-00920-gbdd6be3-dirty #342
[ 244.401745] Call Trace:
[ 244.401784] [<c055355a>] ? printk+0x1d/0x23
[ 244.401827] [<c023c3f2>] warn_slowpath_common+0x72/0xa0
[ 244.401869] [<c04b0d54>] ? sock_rfree+0x54/0x70
[ 244.401909] [<c04b0d54>] ? sock_rfree+0x54/0x70
[ 244.401950] [<c023c43a>] warn_slowpath_null+0x1a/0x20
[ 244.401991] [<c04b0d54>] sock_rfree+0x54/0x70
[ 244.402033] [<c04b4cf5>] skb_release_head_state+0x45/0xe0
[ 244.402084] [<c04b49f0>] __kfree_skb+0x10/0x90
[ 244.402125] [<c04b4a8c>] consume_skb+0x1c/0x40
[ 244.402166] [<c04b7c62>] skb_free_datagram+0x12/0x70
[ 244.402217] [<fc9fa177>] svc_release_skb+0x37/0x60 [sunrpc]
[ 244.402261] [<c026f452>] ? module_put+0x62/0xf0
[ 244.402308] [<fca05c98>] svc_send+0x28/0xc0 [sunrpc]
[ 244.402356] [<fc9f8acb>] svc_process+0x22b/0x770 [sunrpc]
[ 244.402416] [<fd04895c>] nfsd+0xac/0x140 [nfsd]
[ 244.402484] [<fd0488b0>] ? nfsd+0x0/0x140 [nfsd]
[ 244.402543] [<c0257ab4>] kthread+0x74/0x80
[ 244.402602] [<c0257a40>] ? kthread+0x0/0x80
[ 244.402656] [<c020390f>] kernel_thread_helper+0x7/0x10
[ 244.402711] ---[ end trace e202cdb1b491aa15 ]---
Debugging patch :
Intent is to make sure sock is locked by user or by softirq handler,
unless we are currently dismantling socket (sk_refcnt==0)
To speedup this test, we could use several bits of sk_lock.owned instead
of one for the user case.
(one for lock owned by user, one for lowk owned by softirq, one in dismantle phase)
diff --git a/include/net/sock.h b/include/net/sock.h
index 55de3bd..08ecb0e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -849,10 +849,15 @@ static inline int sk_rmem_schedule(struct sock *sk, int size)
__sk_mem_schedule(sk, size, SK_MEM_RECV);
}
+#define sock_owned_by_user(sk) ((sk)->sk_lock.owned)
+#define sock_owned(sk) ((sk)->sk_lock.owned || \
+ spin_is_locked(&(sk)->sk_lock.slock))
+
static inline void sk_mem_reclaim(struct sock *sk)
{
if (!sk_has_account(sk))
return;
+ WARN_ON(!(sock_owned(sk) || atomic_read(&sk->sk_refcnt) == 0));
if (sk->sk_forward_alloc >= SK_MEM_QUANTUM)
__sk_mem_reclaim(sk);
}
@@ -861,6 +866,7 @@ static inline void sk_mem_reclaim_partial(struct sock *sk)
{
if (!sk_has_account(sk))
return;
+ WARN_ON(!(sock_owned(sk) || atomic_read(&sk->sk_refcnt) == 0));
if (sk->sk_forward_alloc > SK_MEM_QUANTUM)
__sk_mem_reclaim(sk);
}
@@ -869,6 +875,7 @@ static inline void sk_mem_charge(struct sock *sk, int size)
{
if (!sk_has_account(sk))
return;
+ WARN_ON(!sock_owned(sk));
sk->sk_forward_alloc -= size;
}
@@ -876,6 +883,7 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
{
if (!sk_has_account(sk))
return;
+ WARN_ON(!sock_owned(sk));
sk->sk_forward_alloc += size;
}
@@ -900,7 +908,6 @@ static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
* Since ~2.3.5 it is also exclusive sleep lock serializing
* accesses from user process context.
*/
-#define sock_owned_by_user(sk) ((sk)->sk_lock.owned)
/*
* Macro so as to not evaluate some arguments when
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-10-30 15:03 ` Eric Dumazet
2009-10-30 19:26 ` David Miller
@ 2009-11-06 8:45 ` Francis Moreau
2009-11-06 8:47 ` Eric Dumazet
1 sibling, 1 reply; 15+ messages in thread
From: Francis Moreau @ 2009-11-06 8:45 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
On Fri, Oct 30, 2009 at 4:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
[...]
>
> [PATCH take2] net: fix sk_forward_alloc corruption
>
With this patch applied 4 days ago, I haven't got any oops so far.
--
Francis
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-11-06 8:45 ` Francis Moreau
@ 2009-11-06 8:47 ` Eric Dumazet
2009-11-06 8:58 ` David Miller
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2009-11-06 8:47 UTC (permalink / raw)
To: Francis Moreau
Cc: Linux Kernel Mailing List, Linux Netdev List, David S. Miller
Francis Moreau a écrit :
> On Fri, Oct 30, 2009 at 4:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> [...]
>
>> [PATCH take2] net: fix sk_forward_alloc corruption
>>
>
> With this patch applied 4 days ago, I haven't got any oops so far.
Thanks a lot for testing and report Francis, I believe David will push it for stable
(if not already done)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct
2009-11-06 8:47 ` Eric Dumazet
@ 2009-11-06 8:58 ` David Miller
0 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2009-11-06 8:58 UTC (permalink / raw)
To: eric.dumazet; +Cc: francis.moro, linux-kernel, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Nov 2009 09:47:37 +0100
> Francis Moreau a écrit :
>> On Fri, Oct 30, 2009 at 4:03 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>> [...]
>>
>>> [PATCH take2] net: fix sk_forward_alloc corruption
>>>
>>
>> With this patch applied 4 days ago, I haven't got any oops so far.
>
> Thanks a lot for testing and report Francis, I believe David will
> push it for stable (if not already done)
I'm pretty sure it's already in Greg's current batch, but if it
doesn't show up in his next -stable release I'll make sure to
submit it.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2009-11-06 8:58 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <38b2ab8a0909290109m3f82c161j4fb0f1266152877e@mail.gmail.com>
2009-09-29 9:18 ` WARNING: at net/ipv4/af_inet.c:154 inet_sock_destruct Eric Dumazet
2009-09-29 9:29 ` Francis Moreau
2009-09-30 11:40 ` Francis Moreau
2009-10-30 8:44 ` Francis Moreau
2009-10-30 9:41 ` Eric Dumazet
2009-10-30 11:27 ` Eric Dumazet
2009-10-30 12:33 ` Francis Moreau
2009-10-30 13:41 ` Francis Moreau
2009-10-30 14:55 ` Eric Dumazet
2009-10-30 15:03 ` Eric Dumazet
2009-10-30 19:26 ` David Miller
2009-10-31 9:20 ` Eric Dumazet
2009-11-06 8:45 ` Francis Moreau
2009-11-06 8:47 ` Eric Dumazet
2009-11-06 8:58 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).