From mboxrd@z Thu Jan 1 00:00:00 1970
From: Liu Yuan
Subject: Re: [RFC PATCH]vhost-blk: In-kernel accelerator for virtio block device
Date: Mon, 15 Aug 2011 11:20:55 +0800
Message-ID: <4E489097.1070307@gmail.com>
References: <1311863346-4338-1-git-send-email-namei.unix@gmail.com>
 <4E325F98.5090308@gmail.com> <4E32F7F2.4080607@us.ibm.com>
 <4E363DB9.70801@gmail.com> <1312495132.9603.4.camel@badari-desktop>
 <4E3BCE4D.7090809@gmail.com> <4E3C302A.3040500@us.ibm.com>
 <4E3F3D4E.70104@gmail.com> <4E3F6E72.1000907@us.ibm.com>
 <4E3F90E3.9080600@gmail.com> <4E4019E1.2090508@us.ibm.com>
 <4E41EAC5.8060001@gmail.com> <1313008667.9603.14.camel@badari-desktop>
 <4E4345F1.90107@gmail.com> <4E434A51.8000902@gmail.com>
 <4E44B100.3000208@us.ibm.com> <4E44E40C.7040407@gmail.com>
 <4E45113C.3040502@gmail.com> <4E4550DB.3020802@us.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm@vger.kernel.org, Dongsu Park
To: Badari Pulavarty
Return-path:
Received: from mail-pz0-f42.google.com ([209.85.210.42]:58180 "EHLO
 mail-pz0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751555Ab1HODVG (ORCPT ); Sun, 14 Aug 2011 23:21:06 -0400
Received: by pzk37 with SMTP id 37so2316807pzk.1 for ;
 Sun, 14 Aug 2011 20:21:06 -0700 (PDT)
In-Reply-To: <4E4550DB.3020802@us.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 08/13/2011 12:12 AM, Badari Pulavarty wrote:
> On 8/12/2011 4:40 AM, Liu Yuan wrote:
>> On 08/12/2011 04:27 PM, Liu Yuan wrote:
>>> On 08/12/2011 12:50 PM, Badari Pulavarty wrote:
>>>> On 8/10/2011 8:19 PM, Liu Yuan wrote:
>>>>> On 08/11/2011 11:01 AM, Liu Yuan wrote:
>>>>>>
>>>>>>> It looks like the patch wouldn't work for testing multiple devices.
>>>>>>>
>>>>>>> vhost_blk_open() does
>>>>>>> + used_info_cachep = KMEM_CACHE(used_info,
>>>>>>> SLAB_HWCACHE_ALIGN |
>>>>>>> SLAB_PANIC);
>>>>>>>
>>>>>>
>>>>>> This is weird. how do you open multiple device?I just opened the
>>>>>> device with following command:
>>>>>>
>>>>>> -drive file=/dev/sda6,if=virtio,cache=none,aio=native -drive
>>>>>> file=~/data0.img,if=virtio,cache=none,aio=native -drive
>>>>>> file=~/data1.img,if=virtio,cache=none,aio=native
>>>>>>
>>>>>> And I didn't meet any problem.
>>>>>>
>>>>>> this would tell qemu to open three devices, and pass three FDs to
>>>>>> three instances of vhost_blk module.
>>>>>> So KMEM_CACHE() is okay in vhost_blk_open().
>>>>>>
>>>>>
>>>>> Oh, you are right. KMEM_CACHE() is in the wrong place. it is three
>>>>> instances vhost worker threads created. Hmmm, but I didn't meet
>>>>> any problem when opening it and running it. So strange. I'll go to
>>>>> figure it out.
>>>>>
>>>>>>> When opening second device, we get panic since used_info_cachep is
>>>>>>> already created. Just to make progress I moved this call to
>>>>>>> vhost_blk_init().
>>>>>>>
>>>>>>> I don't see any host panics now. With single block device (dd),
>>>>>>> it seems to work fine. But when I start testing multiple block
>>>>>>> devices I quickly run into hangs in the guest. I see following
>>>>>>> messages in the guest from virtio_ring.c:
>>>>>>>
>>>>>>> virtio_blk virtio2: requests: id 0 is not a head !
>>>>>>> virtio_blk virtio1: requests: id 0 is not a head !
>>>>>>> virtio_blk virtio4: requests: id 1 is not a head !
>>>>>>> virtio_blk virtio3: requests: id 39 is not a head !
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Badari
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> vq->data[] is initialized by guest virtio-blk driver and
>>>>>> vhost_blk is unware of it. it looks like used ID passed
>>>>>> over by vhost_blk to guest virtio_blk is wrong, but, it should
>>>>>> not happen. :|
>>>>>>
>>>>>> And I can't reproduce this on my laptop. :(
>>>>>>
>>>> Finally, found the issue :)
>>>>
>>>> Culprit is:
>>>>
>>>> +static struct io_event events[MAX_EVENTS];
>>>>
>>>> With multiple devices, multiple threads could be executing
>>>> handle_completion() (one for
>>>> each fd) at the same time. "events" array is global :( Need to make
>>>> it one per device/fd.
>>>>
>>>> For test, I changed MAX_EVENTS to 32 and moved "events" array to be
>>>> local (stack)
>>>> to handle_completion(). Tests are running fine.
>>>>
>>>> Your laptop must have single processor, hence you have only one
>>>> thread executing handle_completion()
>>>> at any time..
>>>>
>>>> Thanks,
>>>> Badari
>>>>
>>>>
>>> Good catch, this is rather cool!....Yup, I develop it mostly in a
>>> nested KVM environment. and the L2 host only runs single processor :(
>>>
>>> Thanks,
>>> Yuan
>> By the way, MAX_EVENTS should be 128, as much as guest virtio_blk
>> driver can batch-submit,
>> causing array overflow.
>> I have had turned on the debug, and had seen as much as over 100
>> requests batched from guest OS.
>>
>
> Hmm.. I am not sure why you see over 100 outstanding events per fd.
> Max events could be as high as
> number of number of outstanding IOs.
>
> Anyway, instead of putting it on stack, I kmalloced it now.
>
> Dongsu Park, Here is the complete patch.
>
> Thanks
> Badari
>
>

On a physical machine, the block device driver advertises a queue depth
that limits the number of pending requests; normally it is 31. But the
virtio driver doesn't advertise one in the guest OS, so nothing prevents
the OS from batch-submitting more than 31 requests. I have seen over 100
pending requests during guest OS initialization, and it is reproducible.

BTW, what are the perf numbers for vhost-blk in your environment?

Thanks,
Yuan
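
For illustration only, here is a minimal user-space sketch of the fix discussed in this thread: one "events" buffer per device/fd instead of a single global array, sized for the largest batch the guest may submit. It is not the vhost-blk patch. struct demo_event, struct demo_blk_dev and the demo_* functions are hypothetical names, calloc() stands in for kmalloc(), and MAX_EVENTS is assumed to be 128 as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed bound: the largest batch the guest is expected to submit. */
#define MAX_EVENTS 128

/* Hypothetical stand-in for the kernel's struct io_event. */
struct demo_event {
    unsigned long id;  /* request identifier */
    long res;          /* completion result */
};

/* Hypothetical per-device state: every device/fd owns a private buffer,
 * so concurrent completion handlers never write to the same array. */
struct demo_blk_dev {
    struct demo_event *events;  /* MAX_EVENTS entries */
    size_t nr_events;           /* entries filled by the last collect */
};

/* Allocate a device together with its own events buffer. */
struct demo_blk_dev *demo_blk_dev_create(void)
{
    struct demo_blk_dev *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->events = calloc(MAX_EVENTS, sizeof(*dev->events));
    if (!dev->events) {
        free(dev);
        return NULL;
    }
    return dev;
}

/* Release a device and its buffer; NULL is accepted. */
void demo_blk_dev_destroy(struct demo_blk_dev *dev)
{
    if (!dev)
        return;
    free(dev->events);
    free(dev);
}

/* Copy at most MAX_EVENTS completions into the device's private buffer
 * and return how many were stored. Clamping n keeps an oversized batch
 * from overflowing the array. */
size_t demo_collect(struct demo_blk_dev *dev,
                    const struct demo_event *src, size_t n)
{
    size_t i;

    assert(dev != NULL);
    if (n > MAX_EVENTS)
        n = MAX_EVENTS;
    for (i = 0; i < n; i++)
        dev->events[i] = src[i];
    dev->nr_events = n;
    return n;
}
```

Each handler thread would call demo_collect() only on its own device, so two devices completing requests at the same time cannot corrupt each other's request ids. A batch larger than MAX_EVENTS is truncated in this sketch; a real driver would have to drain the remainder in a loop.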