From mboxrd@z Thu Jan  1 00:00:00 1970
From: Michal Kubecek
Subject: blocking ops when !TASK_RUNNING in vsock_stream_sendmsg() (again)
Date: Fri, 21 Apr 2017 10:14:58 +0200
Message-ID: <20170421081458.GI13789@unicorn.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org, Andy King, George Zhang
To: Claudio Imbrenda
Return-path:
Received: from mx2.suse.de ([195.135.220.15]:33748 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1036560AbdDUIPA (ORCPT );
	Fri, 21 Apr 2017 04:15:00 -0400
Content-Disposition: inline
Sender: netdev-owner@vger.kernel.org
List-ID:

Hello,

an openSUSE Leap 42.2 user has repeatedly encountered this warning:

[ 4057.170653] WARNING: CPU: 1 PID: 3471 at ../kernel/sched/core.c:7913 __might_sleep+0x76/0x80()
[ 4057.170661] do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait+0x2b/0x80

with this stack:

[ 4057.170786]  [] dump_trace+0x59/0x320
[ 4057.170789]  [] show_stack_log_lvl+0xfa/0x180
[ 4057.170792]  [] show_stack+0x21/0x40
[ 4057.170798]  [] dump_stack+0x5c/0x85
[ 4057.170803]  [] warn_slowpath_common+0x81/0xb0
[ 4057.170806]  [] warn_slowpath_fmt+0x4c/0x50
[ 4057.170809]  [] __might_sleep+0x76/0x80
[ 4057.170814]  [] mutex_lock+0x1c/0x38
[ 4057.170822]  [] vmci_qpair_produce_free_space+0x97/0xd0 [vmw_vmci]
[ 4057.170848]  [] vsock_stream_sendmsg+0x1f6/0x320 [vsock]
[ 4057.170855]  [] sock_sendmsg+0x30/0x40
[ 4057.170859]  [] sock_write_iter+0x79/0xd0
[ 4057.170864]  [] __vfs_write+0xa9/0xf0
[ 4057.170867]  [] vfs_write+0x9d/0x190
[ 4057.170870]  [] SyS_write+0x42/0xa0
[ 4057.170873]  [] entry_SYSCALL_64_fastpath+0x16/0x71

The kernel is 4.4.27, but it already has commit f7f9b5e7f8ec ("AF_VSOCK:
Shrink the area influenced by prepare_to_wait") applied.

The issue comes from this part of vsock_stream_sendmsg():

	prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);

	while (vsock_stream_has_space(vsk) == 0 &&
	       sk->sk_err == 0 &&
	       !(sk->sk_shutdown & SEND_SHUTDOWN) &&
	       !(vsk->peer_shutdown & RCV_SHUTDOWN)) {

where vsock_stream_has_space() can sleep:

  vsock_stream_has_space
    vmci_transport_stream_has_space
      vmci_qpair_produce_free_space
        qp_lock
          qp_acquire_queue_mutex
            mutex_lock

Sleeping is not allowed between prepare_to_wait() and either the actual
waiting or finish_wait(), because prepare_to_wait() has already set the
task state to TASK_INTERRUPTIBLE.

I tried to think of a solution, but there doesn't seem to be an easy way
to fix this in vsock_stream_sendmsg(): moving prepare_to_wait() inside
the loop would result in missed wake-ups (that was the problem with the
original fix). IMHO the right way to resolve the issue would be to
rewrite the vmci queue pair code so that the has_space() check can be
performed without taking a mutex.

Michal Kubecek
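
To illustrate what a has_space() check without a mutex could look like,
here is a minimal userspace sketch using C11 atomics. It is not the VMCI
queue pair code: struct demo_queue and all demo_queue_*() helpers are
hypothetical names invented for this example, and it assumes a single
producer and a single consumer that each own one counter. It only shows
that free space can be derived from two atomically read counters, so the
check never sleeps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue header, NOT the real VMCI queue pair layout. */
struct demo_queue {
	_Atomic uint64_t produced;	/* total bytes ever written (producer-owned) */
	_Atomic uint64_t consumed;	/* total bytes ever read (consumer-owned) */
	uint64_t size;			/* capacity in bytes */
};

static void demo_queue_init(struct demo_queue *q, uint64_t size)
{
	assert(size > 0);
	atomic_init(&q->produced, 0);
	atomic_init(&q->consumed, 0);
	q->size = size;
}

/*
 * Producer-side free space check. No lock is taken, so this cannot sleep.
 * A stale 'consumed' value can only under-report free space, which is
 * the safe direction for a sender.
 */
static uint64_t demo_queue_free_space(struct demo_queue *q)
{
	uint64_t produced = atomic_load_explicit(&q->produced,
						 memory_order_relaxed);
	uint64_t consumed = atomic_load_explicit(&q->consumed,
						 memory_order_acquire);

	return q->size - (produced - consumed);
}

/* Account for n bytes written; fails if there is not enough room. */
static bool demo_queue_produce(struct demo_queue *q, uint64_t n)
{
	uint64_t produced;

	if (demo_queue_free_space(q) < n)
		return false;
	produced = atomic_load_explicit(&q->produced, memory_order_relaxed);
	atomic_store_explicit(&q->produced, produced + n,
			      memory_order_release);
	return true;
}

/* Account for n bytes read; fails if fewer than n bytes are queued. */
static bool demo_queue_consume(struct demo_queue *q, uint64_t n)
{
	uint64_t produced = atomic_load_explicit(&q->produced,
						 memory_order_acquire);
	uint64_t consumed = atomic_load_explicit(&q->consumed,
						 memory_order_relaxed);

	if (produced - consumed < n)
		return false;
	atomic_store_explicit(&q->consumed, consumed + n,
			      memory_order_release);
	return true;
}
```

With a check of this shape, the condition of the while loop could be
evaluated after prepare_to_wait() without triggering __might_sleep().
Whether the real queue pair code can be restructured this way is a
separate question that the sketch does not answer.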