From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Daney
Subject: Re: Q: Why not use struct mm_struct to manage guest physical addresses in new port?
Date: Fri, 08 Feb 2013 15:08:54 -0800
Message-ID: <51158586.3060308@gmail.com>
References: <51115748.2090203@gmail.com> <20130208221151.GA27012@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: KVM devel mailing list , Ralf Baechle
To: Marcelo Tosatti
Return-path:
Received: from mail-ie0-f180.google.com ([209.85.223.180]:33359 "EHLO
	mail-ie0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1947176Ab3BHXJC (ORCPT );
	Fri, 8 Feb 2013 18:09:02 -0500
Received: by mail-ie0-f180.google.com with SMTP id bn7so5584671ieb.39
	for ; Fri, 08 Feb 2013 15:09:01 -0800 (PST)
In-Reply-To: <20130208221151.GA27012@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 02/08/2013 02:11 PM, Marcelo Tosatti wrote:
> On Tue, Feb 05, 2013 at 11:02:32AM -0800, David Daney wrote:
>> Hi,
>>
>> I am starting to work on a port of KVM to an architecture that
>> has a dual TLB. The Guest Virtual Addresses (GVA) are translated to
>> Guest Physical Addresses (GPA) by the first TLB, then a second TLB
>> translates the GPA to a Root Physical Address (RPA). For the sake
>> of this question, we will ignore the GVA->GPA TLB and consider only
>> the GPA->RPA TLB.
>>
>> It seems that most existing ports have a bunch of custom code that
>> manages the GPA->RPA TLB and page tables.
>>
>> Here is what I would like to try: Create an mm for the GPA->RPA
>> mappings; each vma would have a fault handler that calls gfn_to_pfn()
>> to look up the proper page. In kvm_arch_vcpu_ioctl_run() we would
>> call switch_mm() to this new 'gva_mm'.
>
> gfn_to_pfn uses the address space of the controlling process. GPA->RPA
> translation does:
>
> 1) Find 'memory slot' (indexed by gfn)
> 2) From 'memory slot', find virtual address (relative to controlling
> process).
> 3) Walk the pagetable of the controlling process and retrieve the
> physical address.

Actually, it kind of works. Here is the vm_operations_struct for the
VMAs in the guest mm using this technique:

static int kvm_mipsvz_host_fault(struct vm_area_struct *vma,
				 struct vm_fault *vmf)
{
	struct page *page[1];
	unsigned long addr;
	int npages;
	struct kvm *kvm = vma->vm_private_data;
	gfn_t gfn = vmf->pgoff + (vma->vm_start >> PAGE_SHIFT);

	addr = gfn_to_hva(kvm, gfn);
	if (kvm_is_error_hva(addr))
		return VM_FAULT_SIGBUS;

	npages = get_user_pages(current, kvm->arch.host_mm, addr, 1, 1, 0,
				page, NULL);
	if (unlikely(npages != 1))
		return VM_FAULT_SIGBUS;

	vmf->page = page[0];
	return 0;
}

static const struct vm_operations_struct kvm_mipsvz_host_ops = {
	.fault = kvm_mipsvz_host_fault
};

Most likely this screws up the page reference counts in a manner that
will leak pages. But the existing mm infrastructure is managing the
page tables so that the pages show up in the proper place in the
guest.

That said, I think I will switch to a more conventional approach where
the guest page tables are managed outside of the kernel's struct
mm_struct framework. What I did works for memory, but I think it will
be very difficult to implement trap-and-emulate on memory references
this way.

>
>> Upon exiting guest mode we
>> would switch back to the original mm of the controlling process.
>> For me the benefit of this approach is that all the code that
>> manages the TLB is already implemented and works well for struct
>> mm_struct. The only thing I need to do is write a vma fault
>> handler. That is a lot easier and less error prone than maintaining
>> a parallel TLB management framework and making sure it interacts
>> properly with the existing TLB code for 'normal' processes.
>>
>>
>> Q1: Am I crazy for wanting to try this?
>
> You need the mm_struct of the controlling process to be active when
> doing GPA->RPA translations.
>
>> Q2: Have others tried this and rejected it? What were the reasons?
>
> I think you'll have to switch_mm back to the controlling process mm on
> every page fault (and then back to gva_mm).
>
>>
>> Thanks in advance,
>> David Daney
>> Cavium, Inc.
>
> 'vma' is a process

>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html