From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexey Kardashevskiy
Subject: Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows
Date: Thu, 05 Jun 2014 23:04:01 +1000
Message-ID: <53906AC1.6000404@ozlabs.ru>
References: <1401953144-19186-1-git-send-email-aik@ozlabs.ru>
 <1401953144-19186-4-git-send-email-aik@ozlabs.ru>
 <1401953908.3247.121.camel@pasglop>
 <539037DB.5080706@ozlabs.ru>
 <1401964037.3247.129.camel@pasglop>
 <53905ADB.8000100@suse.de>
 <1401971411.3247.132.camel@pasglop>
Mime-Version: 1.0
Content-Type: text/plain; charset=KOI8-R
Content-Transfer-Encoding: 7bit
Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras, Gleb Natapov,
 Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 kvm-ppc@vger.kernel.org
To: Benjamin Herrenschmidt, Alexander Graf
Return-path:
In-Reply-To: <1401971411.3247.132.camel@pasglop>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 06/05/2014 10:30 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-06-05 at 13:56 +0200, Alexander Graf wrote:
>> What if we ask user space to give us a pointer to user space allocated
>> memory along with the TCE registration? We would still ask user space to
>> only use the returned fd for TCE modifications, but would have some
>> nicely swappable memory we can store the TCE entries in.
>
> That isn't going to work terribly well for VFIO :-) But yes, for
> emulated devices, we could improve things a bit, including for
> the 32-bit TCE tables.
>
> For emulated, the real mode path could walk the page tables and fallback
> to virtual mode & get_user if the page isn't present, thus operating
> directly on qemu memory TCE tables instead of the current pinned stuff.
>
> However that has a cost in performance, but since that's really only
> used for emulated devices and PAPR VIOs, it might not be a huge issue.
>
> But for VFIO we don't have much choice, we need to create something the
> HW can access.

You are confusing things here.

There are 2 tables:
1. guest-visible TCE table, this is what is allocated for VIO or emulated PCI;
2. real HW DMA window, one exists already for DMA32 and one I will allocate
for a huge window.

I have just #2 for VFIO now but we will need both in order to implement
H_GET_TCE correctly, and this is the table I will allocate by this new ioctl.

>> In fact, the code as is today can allocate an arbitrary amount of pinned
>> kernel memory from within user space without any checks.
>
> Right. We should at least account it in the locked limit.

Yup. And (probably) this thing will keep a counter of how many windows were
created per KVM instance to avoid having multiple copies of the same table.

--
Alexey