From: Avi Kivity
Subject: Re: [RFC 7/8] KVM: swap out guest pages
Date: Tue, 24 Jul 2007 17:55:04 +0300
Message-ID: <46A612C8.6090804@qumranet.com>
In-Reply-To: <1185173505.2645.71.camel@sli10-conroe.sh.intel.com>
References: <1185173505.2645.71.camel@sli10-conroe.sh.intel.com>
To: Shaohua Li
Cc: kvm-devel, lkml
List-Id: kvm.vger.kernel.org

Shaohua Li wrote:
> Make KVM guest pages be allocated dynamically and able to be swapped out.
>
> One issue: all inodes returned from anon_inode_getfd are shared;
> if one module changes a field of the inode, other modules might break.
> Should we introduce a new API that does not share the inode?
>
> Signed-off-by: Shaohua Li
> ---
>
> +static int kvm_set_page_dirty(struct page *page)
> +{
> +	if (!PageDirty(page))
> +		SetPageDirty(page);
> +	return 0;
> +}
> +
> +static int kvm_writepage(struct page *page, struct writeback_control *wbc)
> +{
> +	struct address_space *mapping = page->mapping;
> +	struct kvm *kvm = address_space_to_kvm(mapping);
> +	int ret = 0;
> +
> +	/*
> +	 * gfn_to_page is called with kvm->lock held, which might invoke page
> +	 * reclaim.  So .writepage must check whether we already hold the
> +	 * lock, to avoid a deadlock.
> +	 */
> +	if (!mutex_trylock(&kvm->lock)) {
> +		set_page_dirty(page);
> +		return AOP_WRITEPAGE_ACTIVATE;
> +	}
> +
> +	/*
> +	 * We just zap vcpu 0's page table.  For an SMP guest, we should zap
> +	 * all vcpus'.  It would be better if the shadow page table were
> +	 * per-VM.
> +	 */
> +	if (PagePrivate(page))
> +		kvm_mmu_zap_pagetbl(&kvm->vcpus[0], page->index);
> +
> +	ret = kvm_move_to_swap(page);
> +	if (ret) {
> +		set_page_dirty(page);
> +		goto out;
> +	}
> +	unlock_page(page);
> +out:
> +	mutex_unlock(&kvm->lock);
> +
> +	return ret;
> +}
> +

Perhaps we can use this as a base for userspace-allocated memory.  We still
have a kvm inode and address_space, but instead of calling
kvm_move_to_swap(), we use the memory slot and virtual address offset to
locate the underlying address_space and call that ->writepage().  So:

  - kvm_writepage() removes any shadow page table references
  - the underlying ->writepage() does the work of paging to the
    underlying store

We need to figure out how to keep the underlying ->writepage() from being
called outside the context of kvm_writepage().  Maybe have a page flag
signifying layered address spaces?  [it probably violates fifteen different
mm assumptions; I need to study that code]

An alternative would be to have kvm set a page flag signifying that it has
references to the page when it installs it in a shadow pte.  The mm would
notice the flag and call kvm to clear it before proceeding with the normal
->writepage().

-- 
error compiling committee.c: too many arguments to function