From mboxrd@z Thu Jan 1 00:00:00 1970
From: john cooper
Subject: Re: KVM Test result, kernel ff5bdac.., userspace eb2fd67.. -- One New Issue
Date: Tue, 10 Jun 2008 15:54:11 -0400
Message-ID: <484EDBE3.5060304@redhat.com>
References: <4848AAD0.1070108@intel.com> <20080609212310.GA6324@dmt.cnet>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------090501000303080603010005"
Cc: Yunfeng Zhao, Chris Wright, kvm@vger.kernel.org, Avi Kivity,
	john.cooper@redhat.com
To: Marcelo Tosatti
Return-path:
Received: from mx1.redhat.com ([66.187.233.31]:46564 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753922AbYFJT6k (ORCPT );
	Tue, 10 Jun 2008 15:58:40 -0400
In-Reply-To: <20080609212310.GA6324@dmt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

This is a multi-part message in MIME format.
--------------090501000303080603010005
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Marcelo Tosatti wrote:
>
> This is a get_user_pages() with hugetlb-vma's bug, not KVM's problem,
> fixed by:
>
> commit 5b23dbe8173c212d6a326e35347b038705603d39
> Author: Adam Litke
> Date:   Wed Nov 14 16:59:33 2007 -0800

I'd say so.  Just to close the loop here, attached is a trivial patch
gleaned from 2.6.25, relative to 2.6.23.9 fc8 where I'd reproduced the
issue.  It resolves the BIOS identify problem and gets the guest through
a full userland init (without the earlier bandaid of mapping the first
2MB of physmem with 4KB pages).
-john

-- 
john.cooper@redhat.com

--------------090501000303080603010005
Content-Type: text/x-patch;
 name="kvm-1941302.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="kvm-1941302.diff"

 include/linux/hugetlb.h |    4 ++--
 mm/hugetlb.c            |    7 ++++---
 mm/memory.c             |    2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

=================================================================
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -19,7 +19,7 @@ static inline int is_vm_hugetlb_page(str
 int hugetlb_sysctl_handler(struct ctl_table *, int, struct file *, void __user *, size_t *, loff_t *);
 int hugetlb_treat_movable_handler(struct ctl_table *, int, struct file *, void __user *, size_t *, loff_t *);
 int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
-int follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, struct page **, struct vm_area_struct **, unsigned long *, int *, int);
+int follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, struct page **, struct vm_area_struct **, unsigned long *, int *, int, int);
 void unmap_hugepage_range(struct vm_area_struct *, unsigned long, unsigned long);
 void __unmap_hugepage_range(struct vm_area_struct *, unsigned long, unsigned long);
 int hugetlb_prefault(struct address_space *, struct vm_area_struct *);
@@ -105,7 +105,7 @@ static inline unsigned long hugetlb_tota
 	return 0;
 }
 
-#define follow_hugetlb_page(m,v,p,vs,a,b,i)	({ BUG(); 0; })
+#define follow_hugetlb_page(m,v,p,vs,a,b,i,w)	({ BUG(); 0; })
 #define follow_huge_addr(mm, addr, write)	ERR_PTR(-EINVAL)
 #define copy_hugetlb_page_range(src, dst, vma)	({ BUG(); 0; })
 #define hugetlb_prefault(mapping, vma)		({ BUG(); 0; })
=================================================================
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1039,7 +1039,7 @@ int get_user_pages(struct task_struct *t
 		if (is_vm_hugetlb_page(vma)) {
 			i = follow_hugetlb_page(mm, vma, pages, vmas,
-						&start, &len, i);
+						&start, &len, i, write);
 			continue;
 		}
=================================================================
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -621,7 +621,8 @@ int hugetlb_fault(struct mm_struct *mm,
 int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			struct page **pages, struct vm_area_struct **vmas,
-			unsigned long *position, int *length, int i)
+			unsigned long *position, int *length, int i,
+			int write)
 {
 	unsigned long pfn_offset;
 	unsigned long vaddr = *position;
@@ -639,11 +640,11 @@ int follow_hugetlb_page(struct mm_struct
 		 */
 		pte = huge_pte_offset(mm, vaddr & HPAGE_MASK);
 
-		if (!pte || pte_none(*pte)) {
+		if (!pte || pte_none(*pte) || (write && !pte_write(*pte))) {
 			int ret;
 			spin_unlock(&mm->page_table_lock);
-			ret = hugetlb_fault(mm, vma, vaddr, 0);
+			ret = hugetlb_fault(mm, vma, vaddr, write);
 			spin_lock(&mm->page_table_lock);
 			if (!(ret & VM_FAULT_ERROR))
 				continue;
--------------090501000303080603010005--