* blocking ops when !TASK_RUNNING in vsock_stream_sendmsg() (again)
@ 2017-04-21 8:14 Michal Kubecek
2017-04-26 17:18 ` Cong Wang
0 siblings, 1 reply; 2+ messages in thread
From: Michal Kubecek @ 2017-04-21 8:14 UTC (permalink / raw)
To: Claudio Imbrenda; +Cc: netdev, Andy King, George Zhang
Hello,
an openSUSE Leap 42.2 user has repeatedly encountered the warning
[ 4057.170653] WARNING: CPU: 1 PID: 3471 at ../kernel/sched/core.c:7913 __might_sleep+0x76/0x80()
[ 4057.170661] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810c25ab>] prepare_to_wait+0x2b/0x80
with the following stack trace:
[ 4057.170786] [<ffffffff81019e69>] dump_trace+0x59/0x320
[ 4057.170789] [<ffffffff8101a22a>] show_stack_log_lvl+0xfa/0x180
[ 4057.170792] [<ffffffff8101afd1>] show_stack+0x21/0x40
[ 4057.170798] [<ffffffff81327657>] dump_stack+0x5c/0x85
[ 4057.170803] [<ffffffff8107e821>] warn_slowpath_common+0x81/0xb0
[ 4057.170806] [<ffffffff8107e89c>] warn_slowpath_fmt+0x4c/0x50
[ 4057.170809] [<ffffffff810a3106>] __might_sleep+0x76/0x80
[ 4057.170814] [<ffffffff816071ac>] mutex_lock+0x1c/0x38
[ 4057.170822] [<ffffffffa0cfb477>] vmci_qpair_produce_free_space+0x97/0xd0 [vmw_vmci]
[ 4057.170848] [<ffffffffa0d10d36>] vsock_stream_sendmsg+0x1f6/0x320 [vsock]
[ 4057.170855] [<ffffffff814f6fb0>] sock_sendmsg+0x30/0x40
[ 4057.170859] [<ffffffff814f7039>] sock_write_iter+0x79/0xd0
[ 4057.170864] [<ffffffff81204d49>] __vfs_write+0xa9/0xf0
[ 4057.170867] [<ffffffff8120534d>] vfs_write+0x9d/0x190
[ 4057.170870] [<ffffffff81206012>] SyS_write+0x42/0xa0
[ 4057.170873] [<ffffffff816093f2>] entry_SYSCALL_64_fastpath+0x16/0x71
The kernel is 4.4.27 but it already has commit f7f9b5e7f8ec ("AF_VSOCK:
Shrink the area influenced by prepare_to_wait") applied. The issue comes
from this part of vsock_stream_sendmsg():
	prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
	while (vsock_stream_has_space(vsk) == 0 &&
	       sk->sk_err == 0 &&
	       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
	       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {
where vsock_stream_has_space() can sleep:
  vsock_stream_has_space
    vmci_transport_stream_has_space
      vmci_qpair_produce_free_space
        qp_lock
          qp_acquire_queue_mutex
            mutex_lock
but this is not allowed between prepare_to_wait() and either the actual
waiting or finish_wait().
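To make the ordering problem concrete, here is a small userspace toy model (not kernel code). Every toy_* name is made up for illustration; the model only mimics the idea that prepare_to_wait() changes the task state and that the debug check in mutex_lock() complains when the state is not TASK_RUNNING.

```c
#include <stdbool.h>

/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
enum toy_state { TOY_TASK_RUNNING = 0, TOY_TASK_INTERRUPTIBLE = 1 };

struct toy_task {
	enum toy_state state;	/* models current->state */
	unsigned int warnings;	/* counts "blocking ops when !TASK_RUNNING" hits */
};

/* Models prepare_to_wait(): the task is marked as about to sleep. */
static void toy_prepare_to_wait(struct toy_task *t)
{
	t->state = TOY_TASK_INTERRUPTIBLE;
}

/* Models finish_wait(): the task is marked as running again. */
static void toy_finish_wait(struct toy_task *t)
{
	t->state = TOY_TASK_RUNNING;
}

/* Models the debug check performed on entry to mutex_lock(). */
static void toy_mutex_lock(struct toy_task *t)
{
	if (t->state != TOY_TASK_RUNNING)
		t->warnings++;
}

/* Models a has_space() check that takes a mutex; the queue is always full. */
static bool toy_has_space(struct toy_task *t)
{
	toy_mutex_lock(t);
	return false;
}

/* Ordering from the report: the check runs after prepare_to_wait(). */
static unsigned int toy_buggy_order(void)
{
	struct toy_task t = { TOY_TASK_RUNNING, 0 };

	toy_prepare_to_wait(&t);
	(void)toy_has_space(&t);
	toy_finish_wait(&t);
	return t.warnings;
}

/* Check first, then prepare: no warning, but in the real code this
 * ordering opens a window for missed wake-ups. */
static unsigned int toy_check_first_order(void)
{
	struct toy_task t = { TOY_TASK_RUNNING, 0 };

	(void)toy_has_space(&t);
	toy_prepare_to_wait(&t);
	toy_finish_wait(&t);
	return t.warnings;
}
```

In this model toy_buggy_order() records one warning and toy_check_first_order() records none, which is why simply reordering the calls looks tempting even though it is not a real fix.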
I tried to think about a solution but there doesn't seem to be an easy
way to fix this in vsock_stream_sendmsg() as moving prepare_to_wait()
inside the loop would result in missed wake-ups (that was the problem
with the original fix); IMHO the right way to resolve the issue would be
rewriting the vmci queue pair code to allow performing the has_space()
check without taking a mutex.
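As a rough idea of what a mutex-free has_space() check could look like, here is a userspace sketch using C11 atomics. It assumes a single-producer/single-consumer byte ring whose two offsets are updated atomically; struct toy_queue and the toy_* functions are hypothetical and do not reflect the real vmci queue pair layout or API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring header, NOT the real vmci queue pair layout. */
struct toy_queue {
	_Atomic uint64_t producer_tail;	/* offset of the next byte to write */
	_Atomic uint64_t consumer_head;	/* offset of the next byte to read */
	uint64_t size;			/* ring capacity in bytes */
};

static void toy_queue_init(struct toy_queue *q, uint64_t size)
{
	assert(size > 1);
	atomic_init(&q->producer_tail, 0);
	atomic_init(&q->consumer_head, 0);
	q->size = size;
}

/* Free space computed from two atomic loads, without taking any lock.
 * The result is a snapshot: the consumer may free more space right after. */
static uint64_t toy_free_space(struct toy_queue *q)
{
	uint64_t tail = atomic_load_explicit(&q->producer_tail,
					     memory_order_acquire);
	uint64_t head = atomic_load_explicit(&q->consumer_head,
					     memory_order_acquire);
	uint64_t used = (tail >= head) ? tail - head
				       : q->size - (head - tail);

	/* One byte stays unused so that "full" and "empty" differ. */
	return q->size - used - 1;
}
```

Because the check never sleeps, it would be safe to call between prepare_to_wait() and the actual wait. Whether the real queue pair code can drop its mutex this way depends on details this sketch does not model.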
Michal Kubecek

* Re: blocking ops when !TASK_RUNNING in vsock_stream_sendmsg() (again)
2017-04-21 8:14 blocking ops when !TASK_RUNNING in vsock_stream_sendmsg() (again) Michal Kubecek
@ 2017-04-26 17:18 ` Cong Wang
0 siblings, 0 replies; 2+ messages in thread
From: Cong Wang @ 2017-04-26 17:18 UTC (permalink / raw)
To: Michal Kubecek
Cc: Claudio Imbrenda, Linux Kernel Network Developers, Andy King, George Zhang
[-- Attachment #1: Type: text/plain, Size: 532 bytes --]
Hi,

On Fri, Apr 21, 2017 at 1:14 AM, Michal Kubecek <mkubecek@suse.cz> wrote:
> I tried to think about a solution but there doesn't seem to be an easy
> way to fix this in vsock_stream_sendmsg() as moving prepare_to_wait()
> inside the loop would result in missed wake-ups (that was the problem
> with the original fix); IMHO the right way to resolve the issue would be
> rewriting the vmci queue pair code to allow performing the has_space()
> check without taking a mutex.

Can you try the attached patch (compile-tested only)?

Thanks.
[-- Attachment #2: vsock-wait.diff --]
[-- Type: text/plain, Size: 2095 bytes --]

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 6f7f675..dfc8c51e 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1540,8 +1540,7 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	long timeout;
 	int err;
 	struct vsock_transport_send_notify_data send_data;
-
-	DEFINE_WAIT(wait);
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
 	sk = sock->sk;
 	vsk = vsock_sk(sk);
@@ -1584,11 +1583,10 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (err < 0)
 		goto out;
 
-
 	while (total_written < len) {
 		ssize_t written;
 
-		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
+		add_wait_queue(sk_sleep(sk), &wait);
 		while (vsock_stream_has_space(vsk) == 0 &&
 		       sk->sk_err == 0 &&
 		       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
@@ -1597,33 +1595,30 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 			/* Don't wait for non-blocking sockets. */
 			if (timeout == 0) {
 				err = -EAGAIN;
-				finish_wait(sk_sleep(sk), &wait);
+				remove_wait_queue(sk_sleep(sk), &wait);
 				goto out_err;
 			}
 
 			err = transport->notify_send_pre_block(vsk, &send_data);
 			if (err < 0) {
-				finish_wait(sk_sleep(sk), &wait);
+				remove_wait_queue(sk_sleep(sk), &wait);
 				goto out_err;
 			}
 
 			release_sock(sk);
-			timeout = schedule_timeout(timeout);
+			timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
 			lock_sock(sk);
 			if (signal_pending(current)) {
 				err = sock_intr_errno(timeout);
-				finish_wait(sk_sleep(sk), &wait);
+				remove_wait_queue(sk_sleep(sk), &wait);
 				goto out_err;
 			} else if (timeout == 0) {
 				err = -EAGAIN;
-				finish_wait(sk_sleep(sk), &wait);
+				remove_wait_queue(sk_sleep(sk), &wait);
 				goto out_err;
 			}
-
-			prepare_to_wait(sk_sleep(sk), &wait,
-					TASK_INTERRUPTIBLE);
 		}
-		finish_wait(sk_sleep(sk), &wait);
+		remove_wait_queue(sk_sleep(sk), &wait);
 
 		/* These checks occur both as part of and after the loop
 		 * conditional since we need to check before and after
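
For readers unfamiliar with the wait_woken() pattern used in the patch, the following userspace toy model (not kernel code) sketches the idea as I understand it: the wake function records the wake-up in the wait entry itself, so the condition can be checked in TASK_RUNNING and a wake-up arriving during the check is not lost. The toy_* names are hypothetical.

```c
#include <stdbool.h>

/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
struct toy_wait_entry {
	bool woken;		/* models the WQ_FLAG_WOKEN flag */
	unsigned int sleeps;	/* how often the task really went to sleep */
};

/* Models woken_wake_function(): remember that a wake-up happened. */
static void toy_wake(struct toy_wait_entry *w)
{
	w->woken = true;
}

/* Models wait_woken(): sleep only if no wake-up was recorded since the
 * previous call, then consume the recorded wake-up. */
static void toy_wait_woken(struct toy_wait_entry *w)
{
	if (!w->woken)
		w->sleeps++;
	w->woken = false;
}

/* A wake-up arrives while the (possibly blocking) condition check runs. */
static unsigned int toy_wake_during_check(void)
{
	struct toy_wait_entry w = { false, 0 };

	/* ... has_space() check runs here in TASK_RUNNING ... */
	toy_wake(&w);
	toy_wait_woken(&w);	/* sees the flag and does not sleep */
	return w.sleeps;
}

/* No wake-up arrives: the task goes to sleep as usual. */
static unsigned int toy_no_wake(void)
{
	struct toy_wait_entry w = { false, 0 };

	toy_wait_woken(&w);
	return w.sleeps;
}
```

In the model, toy_wake_during_check() never sleeps while toy_no_wake() sleeps once, which mirrors why the patch can call vsock_stream_has_space() without first changing the task state.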