From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [RFC][Patch v9 0/6] KVM: Guest Free Page Hinting
Date: Thu, 7 Mar 2019 21:24:02 -0500
Message-ID: <20190307212253-mutt-send-email-mst@kernel.org>
References: <20190306155048.12868-1-nitesh@redhat.com> <1d5e27dc-aade-1be7-2076-b7710fa513b6@redhat.com> <2269c59c-968c-bbff-34c4-1041a2b1898a@redhat.com> <20190307134744-mutt-send-email-mst@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Alexander Duyck , Nitesh Narayan Lal , kvm list , LKML , linux-mm , Paolo Bonzini , lcapitulino@redhat.com, pagupta@redhat.com, wei.w.wang@intel.com, Yang Zhang , Rik van Riel , dodgen@google.com, Konrad Rzeszutek Wilk , dhildenb@redhat.com, Andrea Arcangeli
To: David Hildenbrand
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Thu, Mar 07, 2019 at 08:27:32PM +0100, David Hildenbrand wrote:
> On 07.03.19 19:53, Michael S. Tsirkin wrote:
> > On Thu, Mar 07, 2019 at 10:45:58AM -0800, Alexander Duyck wrote:
> >> To that end what I think we may want to do is instead just walk the LRU
> >> list for a given zone/order in reverse order so that we can try to
> >> identify the pages that are most likely to be cold and unused and
> >> those are the first ones we want to be hinting on rather than the ones
> >> that were just freed. If we can look at doing something like adding a
> >> jiffies value to the page indicating when it was last freed we could
> >> even have a good point for determining when we should stop processing
> >> pages in a given zone/order list.
> >>
> >> In reality the approach wouldn't be too different from what you are
> >> doing now, the only real difference would be that we would just want
> >> to walk the LRU list for the given zone/order rather than pulling
> >> hints on what to free from the calls to free_one_page. In addition we
> >> would need to add a couple bits to indicate if the page has been
> >> hinted on, is in the middle of getting hinted on, and something such
> >> as the jiffies value I mentioned which we could use to determine how
> >> old the page is.
> >
> > Do we really need bits in the page?
> > Would it be bad to just have a separate hint list?
> >
> > If you run out of free memory you can check the hint
> > list, if you find stuff there you can spin
> > or kick the hypervisor to hurry up.
> >
> > Core mm/ changes, so nothing's easy, I know.
>
> We evaluated the idea of busy spinning on some bit/list entry a while
> ago. While it sounds interesting, it is usually not what we want and has
> other negative performance impacts.
>
> Talking about "marking" pages, what we actually would want is to rework
> the buddy to skip over these "marked" pages and only really spin in case
> there are no other pages left. Allocation paths should only ever be
> blocked if OOM, not if just some hinting activity is going on on another
> VCPU.
>
> However as you correctly say: "core mm changes". New page flag?
> Basically impossible.

Well, not exactly. Page bits are at a premium, but only for
*allocated* pages. Pages in the buddy are free, and there are
some unused bits for these.

> Reuse another one? Can easily get horribly
> confusing and can easily get rejected upstream. What about the buddy
> wanting to merge pages that are marked (assuming we also want something
> < MAX_ORDER - 1)? This smells like possibly heavy core mm changes.
>
> Lesson learned: Avoid such heavy changes. Especially in the first shot.
>
> The interesting thing about Nitesh's approach right now is that we can
> easily rework these details later on. The host->guest interface will
> stay the same. Instead of temporarily taking pages out of the buddy, we
> could e.g. mark them and make the buddy or other users skip over them.
>
> --
>
> Thanks,
>
> David / dhildenb
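
To make the "mark free pages and have the allocator skip over them"
idea from the thread concrete, here is a toy user-space model in
Python. It is only a sketch under stated assumptions, not kernel code
and not what the patch set does: ToyFreeList, FreePage, the hinting
flag, mark_for_hinting(), hinting_done() and alloc() are all
hypothetical names made up for illustration. The list head is assumed
to hold the most recently freed page and the tail the coldest one.

```python
from dataclasses import dataclass


@dataclass
class FreePage:
    pfn: int
    # Hypothetical mark standing in for a spare bit that is only
    # meaningful while the page sits on the free list.
    hinting: bool = False


class ToyFreeList:
    """Toy stand-in for one zone/order free list (head = hottest)."""

    def __init__(self, pfns):
        self.pages = [FreePage(pfn) for pfn in pfns]

    def mark_for_hinting(self, count):
        """Mark up to `count` of the coldest unmarked pages (from the tail)."""
        marked = []
        for page in reversed(self.pages):
            if len(marked) == count:
                break
            if not page.hinting:
                page.hinting = True
                marked.append(page.pfn)
        return marked

    def hinting_done(self, pfns):
        """Clear the mark once the (imaginary) host has processed the hint."""
        done = set(pfns)
        for page in self.pages:
            if page.pfn in done:
                page.hinting = False

    def alloc(self):
        """Take the first unmarked page, skipping pages being hinted.

        Returns None when only marked pages remain; a real allocator
        would have to wait or kick the hypervisor at that point.
        """
        for index, page in enumerate(self.pages):
            if not page.hinting:
                del self.pages[index]
                return page.pfn
        return None


if __name__ == "__main__":
    free_list = ToyFreeList([1, 2, 3, 4])
    in_flight = free_list.mark_for_hinting(2)
    print("being hinted:", in_flight)
    print("allocated:", free_list.alloc(), free_list.alloc())
    print("only marked pages left:", free_list.alloc())
    free_list.hinting_done(in_flight)
    print("after hinting completes:", free_list.alloc())
```

In this model an allocation only comes back empty when every remaining
page is in the middle of being hinted, which is the behaviour David
asks for above. Merging of marked buddies, the hard part he points
out, is deliberately left out.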