Subject: Re: when to use virtio (was Re: [PATCH v4 0/8] Introduce the microvm machine type)
From: David Hildenbrand
Organization: Red Hat GmbH
To: Paolo Bonzini, Sergio Lopez
Cc: Pankaj Gupta, ehabkost@redhat.com, kvm@vger.kernel.org, mst@redhat.com,
    lersek@redhat.com, mtosatti@redhat.com, qemu-devel@nongnu.org,
    kraxel@redhat.com, imammedo@redhat.com, philmd@redhat.com, rth@twiddle.net
Date: Wed, 25 Sep 2019 12:50:52 +0200
References: <20190924124433.96810-1-slp@redhat.com> <87h850ssnb.fsf@redhat.com>
    <231f9f20-ae88-c46b-44da-20b610420e0c@redhat.com>
    <77a157c4-5f43-5c70-981c-20e5a31a4dd1@redhat.com>
In-Reply-To: <77a157c4-5f43-5c70-981c-20e5a31a4dd1@redhat.com>

On 25.09.19 12:19, Paolo Bonzini wrote:
> This is a tangent, but I was a bit too harsh in my previous message (at
> least it made you laugh rather than angry!) so I think I owe you an
> explanation.

It's hard to make me really angry, you'll have to try harder :) However,
after years of working on VMs, VM memory management and Linux MM, I have
learned that things are horribly complicated. None of it is obvious, so I
can't expect everybody to know what I learned.

>
> On 25/09/19 10:44, David Hildenbrand wrote:
>> I consider virtio the silver bullet whenever we want a mature
>> paravirtualized interface across architectures. And you can tell that
>> I'm not the only one by the huge amount of virtio device people are
>> crafting right now.
>
> Given there are hardware implementation of virtio, I would refine that:
> virtio is a silver bullet whenever we want a mature ring buffer
> interface across architectures. Being friendly to virtualization is by
> now only a detail of virtio.
> It is also not exclusive to virtio, for
> example NVMe 1.3 has incorporated some ideas from Xen and virtio and is
> also virtualization-friendly.
>
> In turn, the ring buffer interface is great if you want to have mostly
> asynchronous operation---if not, the ring buffer is just adding
> complexity. Sure, we have the luxury of abstractions and powerful
> computers that hide most of the complexity, but some of it still lurks
> in the form of race conditions.
>
> So the question for virtio-mem is what makes asynchronous operation
> important for memory hotplug? If I understand the virtio-mem driver,
> all interaction with the virtio device happens through a work item,
> meaning that it is strictly synchronous. At this point, you do not need
> a ring buffer, you only need:

So, the main building blocks virtio-mem currently uses from the virtio
infrastructure are the config space and one virtqueue.

a) A way for the host to send requests to the guest. E.g., request a
certain amount of memory to be plugged/unplugged by the guest. Done via
config space updates (e.g., similar to virtio-balloon
inflation/deflation requests).

b) A way for the guest to communicate with the host. E.g., send
plug/unplug requests to plug/unplug separate memory blocks. Done via a
virtqueue. Similar to inflation/deflation of pages in virtio-balloon.

Requests by the host via the config space are processed asynchronously
by the guest (again, similar to, say, virtio-balloon). Guest requests
are currently processed synchronously by the host. Guest: Can I plug
this block? Host: Sorry, no can do.

I can't tell yet whether there will be extensions (if virtio-mem ever
comes to life ;) ) that make use of asynchronous communication. In
particular, there might be asynchronous/multiple guest->host requests at
some point (e.g., "I'm nearly out of memory, please send help").

So yes, currently we could live without the ring buffer.
But the config space and the virtqueue are real life-savers for me
right now :)

>
> - a command register where you write the address of a command buffer.
> The device will do DMA from the command block, do whatever it has to do,
> DMA back the results, and trigger an interrupt.
>
> - an interrupt mechanism. It could be MSI, or it could be an interrupt
> pending/interrupt acknowledge register if all the hardware offers is
> level-triggered interrupts.
>
> I do agree that virtio-mem's command buffer/DMA architecture is better
> than the more traditional "bunch of hardware registers" architecture
> that QEMU uses for its ACPI-based CPU and memory hotplug controllers.
> But that's because command buffer/DMA is what actually defines a good
> paravirtualized interface; virtio is a superset of that that may not be
> always a good solution.
>

I completely agree with what you say here: virtio comes with complexity,
but also with features (e.g., config space, support for multiple queues,
abstraction of transports).

If, say, I only wanted to expose a DIMM to the guest just like via ACPI,
virtio would clearly not be the right choice.

-- 
Thanks,

David / dhildenb
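
The two channels described in a) and b) can be sketched as a toy model.
This is only an illustration, not virtio-mem driver or QEMU code: the
names Host, Guest, requested_mib, handle_request and config_changed, the
4 MiB block size, and the rule that the host refuses plugs beyond the
requested size are all assumptions made up for the example. A plain
method call stands in for the virtqueue round trip.

```python
from dataclasses import dataclass, field

BLOCK_MIB = 4  # hypothetical memory block size, chosen for the example


@dataclass
class Host:
    """Toy host: owns the 'config space' and answers guest requests."""
    requested_mib: int = 0                    # stands in for a config space field
    plugged: set = field(default_factory=set)

    def handle_request(self, op: str, block: int) -> bool:
        """Answer one guest request synchronously (stands in for the virtqueue)."""
        if op == "plug":
            would_be_mib = (len(self.plugged) + 1) * BLOCK_MIB
            if block in self.plugged or would_be_mib > self.requested_mib:
                return False                  # "Sorry, no can do."
            self.plugged.add(block)
            return True
        if op == "unplug":
            if block not in self.plugged:
                return False
            self.plugged.remove(block)
            return True
        return False                          # unknown operation


@dataclass
class Guest:
    """Toy guest: reacts to config updates from a single work item."""
    host: Host
    plugged: list = field(default_factory=list)

    def config_changed(self) -> None:
        """Run after a config update; converge towards the requested size."""
        target = self.host.requested_mib // BLOCK_MIB
        while len(self.plugged) < target:
            block = len(self.plugged)         # plug blocks in ascending order
            if not self.host.handle_request("plug", block):
                break                         # host refused, stop trying
            self.plugged.append(block)
        while len(self.plugged) > target:
            block = self.plugged[-1]          # unplug the highest block first
            if not self.host.handle_request("unplug", block):
                break
            self.plugged.pop()


host = Host()
guest = Guest(host)
host.requested_mib = 12                       # host asks for 12 MiB
guest.config_changed()                        # guest plugs blocks one by one
```

Each guest request here waits for its answer before the next one is
sent, which is the "strictly synchronous" pattern Paolo points out. The
asynchronous part is only that the guest picks up the config change
whenever its work item runs.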