Date: Mon, 12 May 2014 13:30:52 +0300
From: "Michael S. Tsirkin"
To: Paolo Bonzini
Cc: "Huangweidong (C)", "gleb@redhat.com", "qemu-devel@nongnu.org",
 "Gonglei (Arei)", "avi.kivity@gmail.com", "Herongguang (Stephen)"
Subject: Re: [Qemu-devel] [RFC] vhost: Can we change synchronize_rcu to call_rcu in vhost_set_memory() in vhost kernel module?
Message-ID: <20140512103052.GB15294@redhat.com>
In-Reply-To: <53709B0C.4030808@redhat.com>

On Mon, May 12, 2014 at 11:57:32AM +0200, Paolo Bonzini wrote:
> On 12/05/2014 11:28, Gonglei (Arei) wrote:
> > From the previous discussion:
> > https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg04925.html
> > we know that you are going to replace RCU in KVM_SET_GSI_ROUTING with
> > SRCU. Although SRCU is considerably better than the original RCU, it
> > still cannot satisfy our needs in our test case. Our VMs run a telecom
> > workload: each VM reports its CPU and memory usage to a balance node
> > every second, and the balance node dispatches work to the VMs according
> > to their load. Since this balancing needs high accuracy, IRQ affinity
> > settings inside the VM also need high accuracy, so we rebalance IRQ
> > affinity every 0.5 s. For this telecom scenario, the KVM_SET_GSI_ROUTING
> > ioctl therefore needs substantial optimization. In the live migration
> > case, VHOST_SET_MEM_TABLE also needs attention.
> >
> > We tried to change synchronize_rcu() to call_rcu() with a rate limit,
> > but the rate limit is not easy to configure. Do you have better ideas
> > to achieve this? Thanks.
>
> Perhaps we can check for cases where only the address is changing,
> and poke at an existing struct kvm_kernel_irq_routing_entry without
> doing any RCU synchronization?
>
> As long as kvm_set_msi_irq only reads address_lo once, it should work.
>
> VHOST_SET_MEM_TABLE is a different problem. What happens in
> userspace that leads to calling that ioctl? Can we remove it
> altogether, or delay it to after the destination has started
> running?
>
> Paolo
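As a rough sketch of the in-place update suggested above (purely illustrative:
the struct and function names below only stand in for
struct kvm_kernel_irq_routing_entry and kvm_set_msi_irq, they are not the
actual KVM code), the writer patches the address fields of the existing entry
while the reader loads each field exactly once:

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for a routing entry. */
struct msi_route_entry {
	_Atomic uint32_t address_lo;   /* updated in place, no RCU sync */
	_Atomic uint32_t address_hi;
	uint32_t data;                 /* assumed unchanged here */
};

/* Writer: only the address changes, so patch the existing entry
 * instead of replacing it and waiting for a grace period. */
static void update_msi_address(struct msi_route_entry *e,
			       uint32_t lo, uint32_t hi)
{
	atomic_store_explicit(&e->address_lo, lo, memory_order_relaxed);
	atomic_store_explicit(&e->address_hi, hi, memory_order_relaxed);
}

/* Reader: load each field exactly once (the "reads address_lo once"
 * condition), so a delivery sees either the old or the new value,
 * never a mix caused by re-reading mid-delivery. */
static void deliver_msi(struct msi_route_entry *e)
{
	uint32_t lo = atomic_load_explicit(&e->address_lo, memory_order_relaxed);
	uint32_t hi = atomic_load_explicit(&e->address_hi, memory_order_relaxed);
	printf("deliver to %08x:%08x data %u\n",
	       (unsigned)hi, (unsigned)lo, (unsigned)e->data);
}

int main(void)
{
	struct msi_route_entry e = { 0xfee00000u, 0, 42 };

	deliver_msi(&e);
	update_msi_address(&e, 0xfee01000u, 0);
	deliver_msi(&e);
	return 0;
}

The single-read condition is the whole point: if the reader re-read
address_lo within one delivery it could mix old and new values, and only
then would a grace period be needed before touching the entry.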
vhost does everything under a VQ lock, so I think the RCU in
VHOST_SET_MEM_TABLE can be replaced with taking and releasing each VQ lock
in turn.

Does the patch below solve the problem for you (warning: untested; sorry,
busy with other bugs right now)?

Signed-off-by: Michael S. Tsirkin
---
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 78987e4..df2e3eb 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -593,6 +593,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 {
 	struct vhost_memory mem, *newmem, *oldmem;
 	unsigned long size = offsetof(struct vhost_memory, regions);
+	int i;
 
 	if (copy_from_user(&mem, m, size))
 		return -EFAULT;
@@ -619,7 +620,14 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 	oldmem = rcu_dereference_protected(d->memory,
 					   lockdep_is_held(&d->mutex));
 	rcu_assign_pointer(d->memory, newmem);
-	synchronize_rcu();
+
+	/* All memory accesses are done under some VQ mutex,
+	 * so below is a faster equivalent of synchronize_rcu().
+	 */
+	for (i = 0; i < d->nvqs; ++i) {
+		mutex_lock(&d->vqs[i]->mutex);
+		mutex_unlock(&d->vqs[i]->mutex);
+	}
 	kfree(oldmem);
 	return 0;
 }
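For reference, a self-contained userspace sketch of the trick the patch relies
on (names like vq, mem_table and set_memory are illustrative, not the vhost
code itself): every reader dereferences the shared pointer only while holding
its queue mutex, so once the writer has taken and released every queue mutex,
no reader can still be using the old table and it can be freed without
synchronize_rcu().

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define NVQS 2

struct mem_table { int nregions; };

struct vq { pthread_mutex_t mutex; };

static struct vq vqs[NVQS];
static _Atomic(struct mem_table *) memory;   /* shared table pointer */

/* Reader side: the table is only dereferenced under some vq mutex. */
static void handle_vq(struct vq *vq)
{
	pthread_mutex_lock(&vq->mutex);
	struct mem_table *mem = atomic_load(&memory);
	printf("using table with %d regions\n", mem->nregions);
	pthread_mutex_unlock(&vq->mutex);
}

/* Writer side: publish the new table, then cycle every vq mutex.
 * After each mutex has been taken and released, no reader can still
 * hold a pointer obtained before the swap, so freeing the old table
 * is safe: the same effect as a grace period, without waiting for one. */
static void set_memory(struct mem_table *newmem)
{
	struct mem_table *oldmem = atomic_exchange(&memory, newmem);
	for (int i = 0; i < NVQS; ++i) {
		pthread_mutex_lock(&vqs[i].mutex);
		pthread_mutex_unlock(&vqs[i].mutex);
	}
	free(oldmem);
}

int main(void)
{
	for (int i = 0; i < NVQS; ++i)
		pthread_mutex_init(&vqs[i].mutex, NULL);

	struct mem_table *t = malloc(sizeof(*t));
	t->nregions = 1;
	atomic_store(&memory, t);
	handle_vq(&vqs[0]);

	struct mem_table *t2 = malloc(sizeof(*t2));
	t2->nregions = 2;
	set_memory(t2);
	handle_vq(&vqs[1]);

	free(t2);
	return 0;
}

The writer's cost then scales with the number of virtqueues rather than with
an RCU grace period, which is presumably what makes repeated
VHOST_SET_MEM_TABLE calls, as seen during live migration, cheaper.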