Subject: Re: [PATCH V4 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
From: Jason Wang
To: Jason Gunthorpe
Cc: mst@redhat.com, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Wed, 7 Aug 2019 22:02:12 +0800
In-Reply-To: <20190807120738.GB1557@ziepe.ca>
References: <20190807070617.23716-1-jasowang@redhat.com> <20190807070617.23716-8-jasowang@redhat.com> <20190807120738.GB1557@ziepe.ca>
X-Mailing-List: netdev@vger.kernel.org

On 2019/8/7 8:07 PM, Jason Gunthorpe wrote:
> On Wed, Aug 07, 2019 at 03:06:15AM -0400, Jason Wang wrote:
>> We used to use RCU to synchronize the MMU notifier with the worker. This
>> leads to calling synchronize_rcu() in invalidate_range_start(). But on a
>> busy system, many factors may slow down synchronize_rcu(), which makes it
>> unsuitable to be called in an MMU notifier.
>>
>> So this patch switches to using a seqlock counter to track whether or not
>> the map is in use. The counter is incremented when the vq starts or
>> finishes using the map. This means that when it is even, we're sure there
>> are no readers and the MMU notifier is synchronized. When it is odd, there
>> is a reader, and we need to wait for it to become even again; then we are
>> synchronized. Considering that the read critical section is pretty small,
>> the synchronization should be done very fast.
>>
>> Reported-by: Michael S. Tsirkin
>> Fixes: 7f466032dc9e ("vhost: access vq metadata through kernel virtual address")
>> Signed-off-by: Jason Wang
>> ---
>>  drivers/vhost/vhost.c | 141 ++++++++++++++++++++++++++----------------
>>  drivers/vhost/vhost.h |   7 ++-
>>  2 files changed, 90 insertions(+), 58 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index cfc11f9ed9c9..57bfbb60d960 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -324,17 +324,16 @@ static void vhost_uninit_vq_maps(struct vhost_virtqueue *vq)
>>
>>  	spin_lock(&vq->mmu_lock);
>>  	for (i = 0; i < VHOST_NUM_ADDRS; i++) {
>> -		map[i] = rcu_dereference_protected(vq->maps[i],
>> -				  lockdep_is_held(&vq->mmu_lock));
>> +		map[i] = vq->maps[i];
>>  		if (map[i]) {
>>  			vhost_set_map_dirty(vq, map[i], i);
>> -			rcu_assign_pointer(vq->maps[i], NULL);
>> +			vq->maps[i] = NULL;
>>  		}
>>  	}
>>  	spin_unlock(&vq->mmu_lock);
>>
>> -	/* No need for synchronize_rcu() or kfree_rcu() since we are
>> -	 * serialized with memory accessors (e.g vq mutex held).
>> +	/* No need for synchronization since we are serialized with
>> +	 * memory accessors (e.g vq mutex held).
>>  	 */
>>
>>  	for (i = 0; i < VHOST_NUM_ADDRS; i++)
>> @@ -362,6 +361,40 @@ static bool vhost_map_range_overlap(struct vhost_uaddr *uaddr,
>>  	return !(end < uaddr->uaddr || start > uaddr->uaddr - 1 + uaddr->size);
>>  }
>>
>> +static void inline vhost_vq_access_map_begin(struct vhost_virtqueue *vq)
>> +{
>> +	write_seqcount_begin(&vq->seq);
>> +}
>> +
>> +static void inline vhost_vq_access_map_end(struct vhost_virtqueue *vq)
>> +{
>> +	write_seqcount_end(&vq->seq);
>> +}
> The write side of a seqlock only provides write barriers. Access to
>
> 	map = vq->maps[VHOST_ADDR_USED];
>
> Still needs a read side barrier, and then I think this will be no
> better than a normal spinlock.
>
> It also doesn't seem like this algorithm even needs a seqlock, as this
> is just a one bit flag


Right, so I tend to use a spinlock first, for correctness.
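
To make the spinlock-first option concrete, here is a minimal userspace sketch, assuming C11 atomics in place of the kernel's spin_lock()/spin_unlock(). All demo_* names are hypothetical and not from the patch; the atomic_flag lock only stands in for vq->mmu_lock, and the single int pointer stands in for a vq->maps[] entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a virtqueue with one map slot. */
struct demo_locked_vq {
	atomic_flag lock;	/* plays the role of vq->mmu_lock */
	int *map;		/* plays the role of vq->maps[i] */
};

static void demo_locked_init(struct demo_locked_vq *vq, int *map)
{
	atomic_flag_clear(&vq->lock);
	vq->map = map;
}

static void demo_lock(struct demo_locked_vq *vq)
{
	/* Spin until this caller flips the flag from clear to set. */
	while (atomic_flag_test_and_set_explicit(&vq->lock,
						 memory_order_acquire))
		;
}

static void demo_unlock(struct demo_locked_vq *vq)
{
	atomic_flag_clear_explicit(&vq->lock, memory_order_release);
}

/* Worker side: read through the map only while holding the lock. */
static int demo_locked_read(struct demo_locked_vq *vq, int fallback)
{
	int val = fallback;

	assert(vq != NULL);
	demo_lock(vq);
	if (vq->map)
		val = *vq->map;
	demo_unlock(vq);
	return val;
}

/* Invalidation side: clear the map under the same lock and hand the
 * old pointer back so the caller can release it.
 */
static int *demo_locked_invalidate(struct demo_locked_vq *vq)
{
	int *old;

	assert(vq != NULL);
	demo_lock(vq);
	old = vq->map;
	vq->map = NULL;
	demo_unlock(vq);
	return old;
}
```

Once demo_locked_invalidate() returns, no worker can still be inside its critical section with the old pointer, which is the guarantee the MMU notifier needs.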
>
> atomic_set_bit(using map)
> smp_mb__after_atomic()
> .. maps [...]
> atomic_clear_bit(using map)
>
>
> map = NULL;
> smp_mb__before_atomic();
> while (atomic_read_bit(using map))
>    relax()
>
> Again, not clear this could be faster than a spinlock when the
> barriers are correct...


Yes, for the next release we may want to use an idea like the one from Michael to mitigate the impact of mb.

https://lwn.net/Articles/775871/

Thanks


>
> Jason
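
The flag-plus-barrier handshake quoted above can be sketched in userspace C11 as follows. This is an illustration under assumptions, not the vhost code: demo_vq and the demo_* helpers are hypothetical names, and atomic_thread_fence(memory_order_seq_cst) is used as a rough analogue of smp_mb__after_atomic() and smp_mb__before_atomic().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a virtqueue with one map slot. */
struct demo_vq {
	_Atomic(int *) map;	/* plays the role of vq->maps[i] */
	atomic_bool using_map;	/* the "one bit flag" */
};

/* Worker side: set the flag, full barrier, then look at the map. */
static int demo_worker_read(struct demo_vq *vq, int fallback)
{
	int val = fallback;
	int *map;

	assert(vq != NULL);
	atomic_store_explicit(&vq->using_map, true, memory_order_relaxed);
	/* Analogue of smp_mb__after_atomic(): order the flag store
	 * before the map load below.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	map = atomic_load_explicit(&vq->map, memory_order_relaxed);
	if (map)
		val = *map;

	/* Release: accesses through the map complete before the flag
	 * is seen as clear.
	 */
	atomic_store_explicit(&vq->using_map, false, memory_order_release);
	return val;
}

/* Invalidation side: clear the map, full barrier, wait for the flag. */
static int *demo_invalidate(struct demo_vq *vq)
{
	int *old;

	assert(vq != NULL);
	old = atomic_exchange_explicit(&vq->map, (int *)NULL,
				       memory_order_relaxed);
	/* Analogue of smp_mb__before_atomic(): order the map store
	 * before the flag load below.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	while (atomic_load_explicit(&vq->using_map, memory_order_acquire))
		;	/* a cpu_relax()-style pause would go here */
	return old;
}
```

The two full fences ensure that at least one side observes the other's store: either the worker sees the cleared map, or the invalidation side sees the flag and waits. Both paths pay for a full barrier, which is the cost being weighed against a plain spinlock in this thread.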