From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 3/4] Swapping
Date: Sun, 14 Oct 2007 08:02:41 +0200
Message-ID: <4711B101.7070305@qumranet.com>
References: <47102919.6070802@qumranet.com> <471124D4.3090901@codemonkey.ws> <471126D9.4030204@qumranet.com> <47112D66.4020500@qumranet.com> <47115207.3090909@codemonkey.ws> <4711542F.2060306@qumranet.com> <4711563D.8020000@codemonkey.ws>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
To: Anthony Liguori
Return-path:
In-Reply-To: <4711563D.8020000-rdkfGonbjUSkNkDKm+mE6A@public.gmane.org>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

Anthony Liguori wrote:
> Izik Eidus wrote:
>
>> Anthony Liguori wrote:
>>
>>> Izik Eidus wrote:
>>>
>>>> Izik Eidus wrote:
>>>>
>>>>> Anthony Liguori wrote:
>>>>>
>>>>>> Izik Eidus wrote:
>>>>>>
>>>>>>> @@ -1058,8 +1038,27 @@ struct page *gfn_to_page(struct kvm *kvm,
>>>>>>> gfn_t gfn)
>>>>>>>
>>>>>>>      gfn = unalias_gfn(kvm, gfn);
>>>>>>>      slot = __gfn_to_memslot(kvm, gfn);
>>>>>>> -    if (!slot)
>>>>>>> +    if (!slot) {
>>>>>>> +        get_page(bad_page);
>>>>>>>          return bad_page;
>>>>>>> +    }
>>>>>>> +    if (slot->user_alloc) {
>>>>>>> +        struct page *page[1];
>>>>>>> +        int npages;
>>>>>>> +
>>>>>>> +        down_read(&current->mm->mmap_sem);
>>>>>>> +        npages = get_user_pages(current, current->mm,
>>>>>>> +                    slot->userspace_addr
>>>>>>> +                    + (gfn - slot->base_gfn) * PAGE_SIZE, 1,
>>>>>>> +                    1, 0, page, NULL);
>>>>>>> +        up_read(&current->mm->mmap_sem);
>>>>>>> +        if (npages != 1) {
>>>>>>> +            get_page(bad_page);
>>>>>>> +            return bad_page;
>>>>>>> +        }
>>>>>>> +        return page[0];
>>>>>>>
>>>>>>>
>>>>>> Wouldn't it be necessary to assign page[0] to slot->phys_mem[gfn -
>>>>>> slot->base_gfn]?
>>>>>>
>>>> sorry, it seems i misunderstood you in the answer i gave you.
>>>> it wouldn't be necessary to assign page[0] to slot->phys_mem[gfn -
>>>> slot->base_gfn], because phys_mem won't have any memory allocated by
>>>> this time.
>>>>
>>>> with this patch, we no longer hold (when using userspace
>>>> allocation) an array of all the memory in phys_mem.
>>>> because now that the pages are swappable, the physical address
>>>> pointed to by the virtual address changes all the time (for example
>>>> when swapping happens)
>>>> so no one promises us that slot->phys_mem[gfn - slot->base_gfn] will
>>>> really point to the page holding the gfn page.
>>>>
>>>> so what we did is throw away the phys_mem array (also nice because
>>>> it wastes less ram), and at runtime we get the pages by using
>>>> the virtual address.
>>>> because the reference count of the page gets increased, we are
>>>> promised that until we release it, it points to the gfn (we release
>>>> it by doing put_page)
>>>>
>>>> hope i was more clear this time :)
>>>>
>>> Yes, that makes sense!
>>>
>>> I wonder if there's a more elegant way of dealing with older
>>> userspaces. For instance, is there any reason why we can't allocate a
>>> userspace memory region on behalf of userspace? That way swap would
>>> even work with older userspaces.
>>>
>> if we can do that, yes, swap will work on older userspace.
>>
>
> I think it's just a matter of calling do_mmap() with the appropriate
> parameters. It looks like some drivers call do_mmap() directly.
>
>

This will halve the maximum size of virtual machines on i386 since
userspace will also mmap() the memory, and the virtual address space is
restricted to 3GB.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
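
For readers following the quoted hunk: the address handed to get_user_pages() is the slot's userspace base plus the gfn's offset into the slot, scaled by the page size. Below is a minimal user-space sketch of only that arithmetic. The names sketch_slot, sketch_gfn_to_hva and SKETCH_PAGE_SIZE are hypothetical and are not the kvm definitions, the 4 KiB page size is an assumption, and the range check is an addition for the sketch (in the patch, __gfn_to_memslot() has already picked the slot).

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumption: 4 KiB pages, as on i386 */

/* Hypothetical stand-in for a memory slot; not the real kvm structure. */
struct sketch_slot {
    uint64_t base_gfn;        /* first guest frame number covered by the slot */
    uint64_t npages;          /* number of guest pages in the slot */
    uint64_t userspace_addr;  /* host virtual address where the slot starts */
};

/*
 * Return the host virtual address backing gfn, or 0 if gfn lies outside
 * the slot. Mirrors the expression in the quoted hunk:
 *   slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE
 */
uint64_t sketch_gfn_to_hva(const struct sketch_slot *slot, uint64_t gfn)
{
    if (gfn < slot->base_gfn || gfn - slot->base_gfn >= slot->npages)
        return 0;
    return slot->userspace_addr + (gfn - slot->base_gfn) * SKETCH_PAGE_SIZE;
}
```

With base_gfn 0x100 and userspace_addr 0x40000000, gfn 0x103 maps to 0x40003000. Nothing is cached here: the physical page behind that virtual address can change while it is swapped, which is why the patch looks it up on each call and relies on the page reference to pin it until put_page().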