From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ilya Maximets
Subject: Re: [PATCH v3] vhost: fix connect hang in client mode
Date: Thu, 21 Jul 2016 16:43:25 +0300
Message-ID: <5790D17D.8080402@samsung.com>
References: <1469089275-15209-1-git-send-email-i.maximets@samsung.com>
 <1469107175-1216-1-git-send-email-i.maximets@samsung.com>
 <20160721133558.GK28708@yliu-dev.sh.intel.com>
In-reply-to: <20160721133558.GK28708@yliu-dev.sh.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
To: Yuanhan Liu
Cc: dev@dpdk.org, Huawei Xie, Dyasly Sergey, Heetae Ahn, Thomas Monjalon
List-Id: patches and discussions about DPDK

On 21.07.2016 16:35, Yuanhan Liu wrote:
> On Thu, Jul 21, 2016 at 04:19:35PM +0300, Ilya Maximets wrote:
>> If something abnormal happened to QEMU, 'connect()' can block calling
>> thread (e.g. main thread of OVS) forever or for a really long time.
>> This can break whole application or block the reconnection thread.
>>
>> Example with OVS:
>>
>> ovs_rcu(urcu2)|WARN|blocked 512000 ms waiting for main to quiesce
>> (gdb) bt
>> #0 connect () from /lib64/libpthread.so.0
>> #1 vhost_user_create_client (vsocket=0xa816e0)
>> #2 rte_vhost_driver_register
>> #3 netdev_dpdk_vhost_user_construct
>> #4 netdev_open (name=0xa664b0 "vhost1")
>> [...]
>> #11 main
>>
>> Fix that by setting non-blocking mode for client sockets for connection.
>>
>> Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode")
>>
>> Signed-off-by: Ilya Maximets
>
> Acked-by: Yuanhan Liu
>
> One help I'd like to ask is that I'd appreciate if you could do the test
> to make sure that your 2 (latest) patches fix the two issues you reported.
>
> You might have already done that; I just want to make sure.

I performed the test with the 'ofport_request' script before sending the
patches, and the test still works now. No descriptor leaks, no hangs and
no QEMU crashes were observed.

Sometimes the network device breaks on the QEMU side, but that is a QEMU
issue. In this case I receive the following messages from DPDK's vhost:

 VHOST_CONFIG: vhost-user client: socket created, fd: 28
 VHOST_CONFIG: failed to connect to /vhost1: Resource temporarily unavailable
 VHOST_CONFIG: /vhost1: reconnecting...

Before the 'hang' patch the main thread hung at this point.
After a QEMU restart everything works normally. An OVS restart is not
required.

Best regards, Ilya Maximets.
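
P.S. For readers of the archive, here is a minimal sketch of the idea behind
the fix: put the client socket into non-blocking mode before calling
connect(), so a stuck peer produces an immediate error instead of a hang.
This is not the code from the patch. The helper name try_connect_nonblock()
is hypothetical, and the sketch assumes a Linux AF_UNIX stream socket, where
connect(2) reports EAGAIN ("Resource temporarily unavailable") if the
connection cannot be completed immediately.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a DPDK function): try to connect to a UNIX
 * domain socket at 'path' without ever blocking the calling thread.
 * Returns the connected fd, or -1 with errno set on failure.
 */
static int
try_connect_nonblock(const char *path)
{
	struct sockaddr_un un;
	int fd, flags, saved_errno;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Switch the socket to non-blocking mode before connect(). */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	strncpy(un.sun_path, path, sizeof(un.sun_path) - 1);

	/* Fails immediately (e.g. EAGAIN, ENOENT) instead of hanging. */
	if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
		saved_errno = errno;
		fprintf(stderr, "failed to connect to %s: %s\n",
			path, strerror(saved_errno));
		close(fd);
		errno = saved_errno;	/* preserve the connect() error */
		return -1;
	}

	return fd;
}
```

A caller that gets -1 back can retry later from its reconnection path
rather than stalling the main thread, which matches the "reconnecting..."
message in the log above.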