From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paolo Bonzini
Subject: Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings
Date: Thu, 05 Mar 2015 18:43:36 +0100
Message-ID: <54F895C8.2070306@redhat.com>
References: <20150304141212.GA5352@hawk.usersys.redhat.com>
 <20150304142943.GU28951@e104818-lin.cambridge.arm.com>
 <54F73ACF.1090605@redhat.com>
 <20150304172855.GA28951@e104818-lin.cambridge.arm.com>
 <54F82C06.2000701@redhat.com>
 <20150305110415.GA7712@e104818-lin.cambridge.arm.com>
 <20150305120348.GD7712@e104818-lin.cambridge.arm.com>
 <54F84B7F.5010007@redhat.com>
 <20150305145831.GA11447@e104818-lin.cambridge.arm.com>
In-Reply-To: <20150305145831.GA11447@e104818-lin.cambridge.arm.com>
To: Catalin Marinas
Cc: KVM devel mailing list, Ard Biesheuvel, Marc Zyngier, Laszlo Ersek,
 "kvmarm@lists.cs.columbia.edu", "linux-arm-kernel@lists.infradead.org"
List-Id: kvmarm@lists.cs.columbia.edu

On 05/03/2015 15:58, Catalin Marinas wrote:
>> It would especially suck if the user has a cluster with different
>> machines, some of them coherent and others non-coherent, and then has
>> to debug why the same configuration works on some machines and not on
>> others.
>
> That's a problem indeed, especially with guest migration. But I don't
> think we have any sane solution here for the bus master DMA.

I do not oppose doing cache management in QEMU for bus master DMA
(though if the solution you outline below works, it would be even
better).

> ARM can override them as well, but only by making them stricter.
> Otherwise, on a weakly ordered architecture, it's not always safe
> (let's say the guest thinks it accesses Strongly Ordered memory and
> avoids barriers for flag updates, but the host "upgrades" it to
> Cacheable, which breaks the memory order).

The same can happen on x86, though it's rarer: you still need a barrier
between stores and loads.

> If we want the host to enforce guest memory mapping attributes via
> stage 2, we could do it the other way around: get the guests to always
> assume full cache coherency, generating Normal Cacheable mappings, but
> use the stage 2 attribute restrictions in the host to make such
> mappings non-cacheable when needed (it works this way on ARM, but not
> in the other direction to relax the attributes).

That sounds like a plan for device assignment. But it still would not
solve the problem of the MMIO framebuffer, right?

>> The problem arises with MMIO areas that the guest can reasonably
>> expect to be uncacheable, but that are optimized by the host so that
>> they end up backed by cacheable RAM. It's perfectly reasonable that
>> the same device needs a cacheable mapping with one userspace, and
>> works with an uncacheable mapping with another userspace that doesn't
>> optimize the MMIO area to RAM.
>
> Unless the guest allocates the framebuffer itself (e.g. via
> dma_alloc_coherent), we can't control the cacheability via
> "dma-coherent" properties, as those refer to bus master DMA.

Okay, it's good to rule that out; one less thing to think about. :)
Same for _DSD.
> So for MMIO with the buffer allocated by the host (QEMU), the only
> solution I see on ARM is for the host to ensure coherency, either via
> explicit cache maintenance (a new KVM API) or by changing the memory
> attributes that QEMU uses to access such virtual MMIO.
>
> Basically, QEMU is acting as a bus master when reading the framebuffer
> it allocated, but the guest considers it a slave access, and we have
> no way to tell the guest that such accesses should be cacheable, nor
> can we upgrade them via architecture features.

Yes, that's a way to put it.

>> In practice, the VGA framebuffer has an optimization that uses dirty
>> page tracking, so we could piggyback on the ioctls that return which
>> pages are dirty. It turns out that piggybacking on those ioctls
>> should also fix the case of migrating a guest while the MMU is
>> disabled.
>
> Yes, QEMU would need to invalidate the cache before reading a dirty
> framebuffer page.
>
> As I said above, an API that allows non-cacheable mappings for the VGA
> framebuffer in QEMU would also solve the problem. I'm not sure what
> KVM provides here (or whether we can add such an API).

Nothing for now; other architectures simply do not have the issue. As
long as it's just VGA, we can quirk it: there are only a couple of
vendor/device IDs to catch, and the guest can then use a cacheable
mapping.

For a more generic solution, the API would be something like
madvise(MADV_DONTCACHE). It would be easy for QEMU to use, but I am not
too optimistic about convincing the mm folks to accept it. We can try.

Paolo