From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 17 Sep 2018 04:19:50 -0600
From: Tycho Andersen
Subject: Re: Redoing eXclusive Page Frame Ownership (XPFO) with isolated CPUs in mind (for KVM to isolate its guests per CPU)
Message-ID: <20180917101950.GG4672@cisco>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
To: Julian Stecklina
Cc: Juerg Haefliger, Linus Torvalds, David Woodhouse, Konrad Rzeszutek Wilk,
 deepa.srinivasan@oracle.com, Jim Mattson, Andrew Cooper,
 Linux Kernel Mailing List, Boris Ostrovsky, linux-mm, Thomas Gleixner,
 joao.m.martins@oracle.com, pradeep.vincent@oracle.com, Andi Kleen,
 Khalid Aziz, kanth.ghatraju@oracle.com, Liran Alon, Kees Cook,
 Kernel Hardening, chris.hyser@oracle.com, Tyler Hicks, John Haxby,
 Jon Masters
List-ID:

On Mon, Sep 17, 2018 at 12:01:02PM +0200, Julian Stecklina wrote:
> Juerg Haefliger writes:
>
> >> I've updated my XPFO branch[1] to make some of the debugging optional
> >> and also integrated the XPFO bookkeeping with struct page, instead of
> >> requiring CONFIG_PAGE_EXTENSION, which removes some checks in the hot
> >> path.
> >
> > FWIW, that was my original design but there was some resistance to
> > adding more to the page struct and page extension was suggested
> > instead.
>
> From looking at both versions, I have to say that having the metadata in
> struct page makes the code easier to understand and removes some special
> cases and bookkeeping.
>
> > I'm wondering how much performance we're losing by having to split
> > hugepages. Any chance this can be quantified somehow? Maybe we can
> > have a pool of some sorts reserved for userpages and group allocations
> > so that we can track the XPFO state at the hugepage level instead of
> > at the 4k level to prevent/reduce page splitting. Not sure if that
> > causes issues or has any unwanted side effects though...
>
> Optimizing the allocation/deallocation path might be worthwhile, because
> that's where most of the overhead goes. I haven't looked into how to do
> this yet. I'd appreciate if someone has pointers to code that tries to
> achieve similar functionality to get me started.
>
> That being said, I'm wondering whether we have unrealistic expectations
> about the overhead here and whether it's worth turning this patch into
> something far more complicated. Opinions?

I think that implementing Dave Hansen's suggestions of not doing
flushes/other work on every map/unmap, but only when pages are added to
the various free lists, will probably help out a lot. That's where I got
stuck last time when I was trying to do it, though :)

Cheers,

Tycho
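
The idea of doing the expensive work when a page goes back to a free
list, rather than on every map/unmap, can be sketched as a user-space
toy model. This is only a counting illustration under loud assumptions:
it is not kernel code and not the XPFO patch set, every name in it
(toy_xpfo, toy_kmap, toy_kunmap, toy_free, toy_run) is hypothetical, a
"flush" is just a counter increment, and the model says nothing about
whether deferring the flush preserves the protection XPFO is meant to
give.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical per-page state; not the real struct page layout. */
struct toy_page {
    bool user_owned;    /* page is currently handed out to userspace */
    bool kernel_mapped; /* kernel has temporarily mapped the page */
    bool stale_tlb;     /* a kernel mapping may still be cached */
};

struct toy_xpfo {
    struct toy_page pages[NPAGES];
    unsigned long flushes; /* number of simulated flushes so far */
    bool defer;            /* false: flush on unmap, true: flush on free */
};

/* Simulated flush: clear the stale marker and count the event. */
static void toy_flush(struct toy_xpfo *x, size_t idx)
{
    x->pages[idx].stale_tlb = false;
    x->flushes++;
}

static void toy_alloc_user(struct toy_xpfo *x, size_t idx)
{
    assert(idx < NPAGES);
    x->pages[idx].user_owned = true;
    x->pages[idx].kernel_mapped = false;
}

/* Temporary kernel mapping of a user page; leaves a cached mapping. */
static void toy_kmap(struct toy_xpfo *x, size_t idx)
{
    assert(idx < NPAGES);
    x->pages[idx].kernel_mapped = true;
    x->pages[idx].stale_tlb = true;
}

/* Eager mode pays for a flush here; deferred mode postpones it. */
static void toy_kunmap(struct toy_xpfo *x, size_t idx)
{
    assert(idx < NPAGES);
    x->pages[idx].kernel_mapped = false;
    if (!x->defer)
        toy_flush(x, idx);
}

/* Page returns to the free list: do any postponed flush exactly once. */
static void toy_free(struct toy_xpfo *x, size_t idx)
{
    assert(idx < NPAGES);
    if (x->pages[idx].stale_tlb)
        toy_flush(x, idx);
    x->pages[idx].user_owned = false;
}

/* Allocate page 0, map/unmap it `cycles` times, free it, report flushes. */
static unsigned long toy_run(struct toy_xpfo *x, unsigned int cycles)
{
    toy_alloc_user(x, 0);
    for (unsigned int i = 0; i < cycles; i++) {
        toy_kmap(x, 0);
        toy_kunmap(x, 0);
    }
    toy_free(x, 0);
    return x->flushes;
}
```

Calling toy_run() on a zero-initialized struct toy_xpfo with defer set
to false counts one flush per map/unmap cycle, while the same call with
defer set to true counts a single flush at free time. The price in the
deferred case is the window in which stale_tlb stays set, which is the
part a real implementation would have to reason about.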