From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yuanhan Liu
Subject: Re: [PATCH 21/21] vhost: iotlb: reduce iotlb read lock usage
Date: Mon, 11 Sep 2017 17:39:52 +0800
Message-ID: <20170911093951.GY9736@yliu-home>
References: <20170831095023.21037-1-maxime.coquelin@redhat.com> <20170831095023.21037-22-maxime.coquelin@redhat.com> <20170911041821.GI9736@yliu-home>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: dev@dpdk.org, jfreiman@redhat.com, tiwei.bie@intel.com, mst@redhat.com, vkaplans@redhat.com, jasowang@redhat.com
To: Maxime Coquelin
Content-Disposition: inline
In-Reply-To:

On Mon, Sep 11, 2017 at 09:34:30AM +0200, Maxime Coquelin wrote:
> Hi Yuanhan,
> 
> On 09/11/2017 06:18 AM, Yuanhan Liu wrote:
> >On Thu, Aug 31, 2017 at 11:50:23AM +0200, Maxime Coquelin wrote:
> >>Prior to this patch, iotlb cache's read/write lock was
> >>read-locked at every guest IOVA to app VA translation,
> >>i.e. at least once per packet with indirect off and twice
> >>with indirect on.
> >>
> >>The problem is that rte_rwlock_read_lock() makes use of an atomic
> >>operation, which is costly.
> >>
> >>This patch introduces iotlb lock helpers, so that a full burst
> >>can be protected with taking the lock once, which reduces the
> >>number of atomic operations by up to 64 with indirect
> >>descriptors.
> >
> >You were assuming there is not a single miss during a burst. If a miss
> >happens, it requires 2 locks: one for _pending_miss and another one
> >for _pending_insert. From this point of view, it's actually more
> >expensive.
> 
> It's actually more expensive only when a miss happens.

Yes, that's what I meant.

> And in that case,
> the cost of taking the lock is negligible compared to the miss itself.

Yes, I'm aware of it.

> >However, I won't call it a bad assumption (for the case of the virtio
> >PMD). And if you take this assumption, why not just delete the
> >pending list and move the lock outside the _iotlb_find() function,
> >like what you did in this patch?
> 
> Because we need the pending list. When there is no matching entry in the
> IOTLB cache, we have to send a miss request through the slave channel.

You don't need the pending list to send a MISS.

> On miss request reception, Qemu performs the translation, and in case of
> success, sends it through the main channel using an update request.
> 
> While all this is done, the backend could wait for it, blocking
> processing on the PMD thread. But it would be really inefficient in case
> other queues are being processed on the same lcore. Moreover, if the
> iova is invalid, no update request is sent, so the lcore would be blocked
> forever.
> 
> To overcome this, what is done is that in case of a miss, it exits the
> burst and tries again later, giving other virtqueues a chance to be
> processed while the update arrives.

You can also quit earlier without the pending list.

> And here comes the pending list. On the next try, the update may not
> have arrived yet, so we need to check whether a miss has already been
> sent for the same address & perm. Else, we would flood Qemu with miss
> requests for the same address.

Okay, that's the reason we need a pending list: to record the misses we
have already sent.
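As a side note, a minimal sketch of that pending-list check could look
like the code below. It only illustrates the idea discussed above, not
the actual DPDK code: every name in it (iotlb_pending_entry,
iotlb_pending_miss, handle_iotlb_miss, send_iotlb_miss_to_master) is
hypothetical.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

struct iotlb_pending_entry {
	TAILQ_ENTRY(iotlb_pending_entry) next;
	uint64_t iova;
	uint8_t perm;
};

TAILQ_HEAD(iotlb_pending_list, iotlb_pending_entry);

/* Hypothetical, defined elsewhere: sends the IOTLB miss request to the
 * master over the slave channel. */
void send_iotlb_miss_to_master(uint64_t iova, uint8_t perm);

/* Returns true if a miss for this (iova, perm) was already sent. */
static bool
iotlb_pending_miss(struct iotlb_pending_list *pending, uint64_t iova,
		   uint8_t perm)
{
	struct iotlb_pending_entry *e;

	TAILQ_FOREACH(e, pending, next) {
		if (e->iova == iova && e->perm == perm)
			return true;
	}
	return false;
}

/* Called when an IOVA->VA lookup misses during a burst. */
static void
handle_iotlb_miss(struct iotlb_pending_list *pending, uint64_t iova,
		  uint8_t perm)
{
	struct iotlb_pending_entry *e;

	/*
	 * Send the miss request only once per (iova, perm). Without this
	 * check, every retried burst would send yet another identical
	 * miss request while waiting for QEMU's update on the main
	 * channel.
	 */
	if (iotlb_pending_miss(pending, iova, perm))
		return;

	e = malloc(sizeof(*e));
	if (e == NULL)
		return;
	e->iova = iova;
	e->perm = perm;
	TAILQ_INSERT_TAIL(pending, e, next);

	send_iotlb_miss_to_master(iova, perm);
	/* The caller then exits the burst and retries the ring later. */
}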
> >I don't really see the point of introducing the pending list.
> 
> Hope the above clarifies.

Thanks, it indeed helps!

> I will see if I can improve the pending list protection, but honestly,
> its cost is negligible.

That's not my point :)

	--yliu
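For reference, the per-burst locking that the patch's commit message
(quoted above) describes, taking the IOTLB read lock once per burst
instead of once per translation, could look roughly like the sketch
below. It is not the actual DPDK implementation: a plain pthread rwlock
stands in for rte_rwlock, and the struct and function names are assumed
for illustration only.

#include <stdint.h>
#include <pthread.h>

struct virtio_net {
	pthread_rwlock_t iotlb_lock;
	/* ... */
};

/* Hypothetical, defined elsewhere: translates and processes one
 * descriptor; returns 0 on an IOTLB cache miss. */
int translate_and_process_desc(struct virtio_net *dev, uint16_t idx);

static inline void
vhost_iotlb_rd_lock(struct virtio_net *dev)
{
	pthread_rwlock_rdlock(&dev->iotlb_lock);
}

static inline void
vhost_iotlb_rd_unlock(struct virtio_net *dev)
{
	pthread_rwlock_unlock(&dev->iotlb_lock);
}

uint16_t
dequeue_burst(struct virtio_net *dev, uint16_t count)
{
	uint16_t i;

	vhost_iotlb_rd_lock(dev);	/* one lock operation for the whole burst */
	for (i = 0; i < count; i++) {
		/*
		 * Each descriptor needs at least one IOVA->VA translation
		 * (two with indirect descriptors), but the lookup itself no
		 * longer takes the lock. On a miss, exit the burst; the
		 * miss is handled and the burst retried outside the
		 * read-locked section.
		 */
		if (translate_and_process_desc(dev, i) == 0)
			break;
	}
	vhost_iotlb_rd_unlock(dev);

	return i;
}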