From: Paolo Bonzini
Date: Thu, 26 Jun 2014 17:29:54 +0200
Subject: Re: [Qemu-devel] [regression] dataplane: throughput -40% by commit 580b6b2aa2
To: Ming Lei, Stefan Hajnoczi, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng, "Michael S. Tsirkin"

On 26/06/2014 17:14, Ming Lei wrote:
> Hi Stefan,
>
> I found that VM block I/O throughput has decreased by more than 40%
> on my laptop, and it looks much worse in my server environment.
> It is caused by your commit 580b6b2aa2:
>
>     dataplane: use the QEMU block layer for I/O
>
> I ran fio with the config below to test random reads:
>
> [global]
> direct=1
> size=4G
> bsrange=4k-4k
> timeout=20
> numjobs=4
> ioengine=libaio
> iodepth=64
> filename=/dev/vdc
> group_reporting=1
>
> [f]
> rw=randread
>
> Along with the throughput drop, latency improved a little.
>
> With this commit, the I/O requests submitted to the fs become much
> smaller than before, and more io_submit() calls have to be made to
> the kernel, which means the effective iodepth may become much lower.
>
> I am not surprised by the result: I have compared VM I/O performance
> between qemu and lkvm before. lkvm has no big-QEMU-lock problem and
> handles I/O in a dedicated thread, yet its block I/O throughput is
> still much worse than qemu's because, IMO, lkvm does not submit
> block I/O in batches the way the previous dataplane did.

What is your elevator setting in both the host and the guest?  Usually
deadline gives the best performance.

Paolo
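
For concreteness, here is a minimal libaio sketch of the batched
submission discussed in the quoted text: one io_submit() call carries a
whole queue depth of requests, where per-request submission would cost
one syscall each and let the device queue drain between calls.  This is
an illustration only, not QEMU's dataplane code; the device path, queue
depth and block size merely mirror the fio job above, and error
handling and buffer cleanup are mostly elided.

/* build: gcc batch.c -o batch -laio */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define QDEPTH 64              /* matches iodepth=64 in the fio job */
#define BLKSZ  4096            /* matches bsrange=4k-4k */

int main(void)
{
    struct iocb iocbs[QDEPTH], *batch[QDEPTH];
    struct io_event events[QDEPTH];
    io_context_t ctx = 0;
    int fd, i;

    fd = open("/dev/vdc", O_RDONLY | O_DIRECT);  /* device from the fio job */
    if (fd < 0 || io_setup(QDEPTH, &ctx) < 0) {
        perror("setup");
        return 1;
    }

    for (i = 0; i < QDEPTH; i++) {
        void *buf;
        if (posix_memalign(&buf, BLKSZ, BLKSZ)) {
            return 1;
        }
        /* 4k reads at scattered offsets, as in the randread job */
        io_prep_pread(&iocbs[i], fd, buf, BLKSZ,
                      (long long)(rand() % 1048576) * BLKSZ);
        batch[i] = &iocbs[i];
    }

    /* The point: ONE syscall submits all QDEPTH requests.  Submitting
     * them one at a time would mean QDEPTH syscalls and a shallower
     * effective iodepth at the device until the queue refills. */
    if (io_submit(ctx, QDEPTH, batch) != QDEPTH) {
        perror("io_submit");
        return 1;
    }

    /* Reap all completions. */
    if (io_getevents(ctx, QDEPTH, QDEPTH, events, NULL) < 0) {
        perror("io_getevents");
        return 1;
    }

    io_destroy(ctx);
    close(fd);
    return 0;
}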
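
As for the elevator question above: the active scheduler is exposed in
/sys/block/<dev>/queue/scheduler, where the bracketed entry is the one
in use.  A small sketch of reading and switching it from C, equivalent
to cat/echo on that file ("vdc" is an example device name; writing
requires root):

#include <stdio.h>

int main(void)
{
    const char *path = "/sys/block/vdc/queue/scheduler";
    char line[256];
    FILE *f;

    f = fopen(path, "r");
    if (f) {
        if (fgets(line, sizeof(line), f)) {
            /* Active elevator shown in brackets, e.g. "noop [deadline] cfq" */
            printf("current: %s", line);
        }
        fclose(f);
    }

    f = fopen(path, "w");
    if (f) {
        fputs("deadline", f);  /* switch to the deadline elevator */
        fclose(f);
    } else {
        perror("switching elevator");
    }
    return 0;
}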