From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tim Deegan
Subject: Re: PoD code killing domain before it really gets started
Date: Tue, 7 Aug 2012 11:00:29 +0100
Message-ID: <20120807100029.GC84051@ocelot.phlegethon.org>
References: <501173540200007800090A04@nat28.tlf.novell.com> <50116CD9.6000503@eu.citrix.com> <501FE96E0200007800092E25@nat28.tlf.novell.com> <501FED030200007800092E35@nat28.tlf.novell.com> <5020E13F02000078000931DA@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <5020E13F02000078000931DA@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: George Dunlap , Ian Jackson , Ian Campbell , xen-devel
List-Id: xen-devel@lists.xenproject.org

Hi,

At 08:34 +0100 on 07 Aug (1344328495), Jan Beulich wrote:
> >>> On 06.08.12 at 18:03, George Dunlap wrote:
> > I guess there are two problems with that:
> > * As you've seen, apparently dom0 may access these pages before any
> > faults happen.
> > * If it happens that reclaim_single is below the only zeroed page, the
> > guest will crash even when there is reclaim-able memory available.
> >
> > Two ways we could fix this:
> > 1. Remove dom0 accesses (what on earth could be looking at a
> > not-yet-created VM?)
>
> I'm told it's a monitoring daemon, and yes, they are intending to
> adjust it to first query the GFN's type (and don't do the access
> when it's not populated, yet). But wait, I didn't check the code
> when I recommended this - XEN_DOMCTL_getpageframeinfo{2,3}
> also call get_page_from_gfn() with P2M_ALLOC, so would also
> trigger the PoD code (in -unstable at least) - Tim, was that really
> a correct adjustment in 25355:974ad81bb68b? It looks to be a
> 1:1 translation, but is that really necessary?

AFAICT 25355:974ad81bb68b doesn't change anything. Back in 4.1-testing
the lookup was done with gmfn_to_mfn(), which boils down to a lookup
with p2m_alloc.

> If one wanted to find out whether a page is PoD to avoid getting it
> populated, how would that be done from outside the hypervisor? Would
> we need XEN_DOMCTL_getpageframeinfo4 for this?

We'd certainly need _some_ change to the hypercall interface, as there's
no XEN_DOMCTL_PFINFO_ rune for 'PoD', and presumably you'd want to know
the difference between PoD and not-present.

> > 2. Allocate the PoD cache before populating the p2m table

Any reason not to do this?

Tim.
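
For illustration only, the difference between a query-only lookup and a lookup with alloc semantics can be sketched as a toy model in Python. This is not Xen code: ToyP2M, EntryType, OutOfPodMemory and every method name below are hypothetical, and the model assumes nothing more than what the thread describes (an alloc-style lookup populates a PoD entry from the PoD cache, and an empty cache is fatal).

```python
from enum import Enum


class EntryType(Enum):
    # Hypothetical type names; a caller outside the hypervisor would need
    # some way to tell these apart, which the thread says is missing for PoD.
    NOT_PRESENT = "not-present"
    POD = "populate-on-demand"
    RAM = "ram"


class OutOfPodMemory(Exception):
    """Toy stand-in for the domain being killed when the PoD cache is empty."""


class ToyP2M:
    """Toy model of a guest-frame table with populate-on-demand entries."""

    def __init__(self, pod_cache_pages):
        self.entries = {}                       # gfn -> EntryType
        self.pod_cache_pages = pod_cache_pages  # pages available to populate

    def mark_pod(self, gfn):
        """Record gfn as populate-on-demand (no backing page yet)."""
        self.entries[gfn] = EntryType.POD

    def query(self, gfn):
        """Query-only lookup: report the entry type, never populate."""
        return self.entries.get(gfn, EntryType.NOT_PRESENT)

    def lookup_alloc(self, gfn):
        """Alloc-style lookup: a PoD entry is populated before returning."""
        if self.query(gfn) is EntryType.POD:
            if self.pod_cache_pages == 0:
                # The failure mode discussed in the thread: nothing left
                # in the cache to back the page.
                raise OutOfPodMemory(f"no cached page for gfn {gfn:#x}")
            self.pod_cache_pages -= 1
            self.entries[gfn] = EntryType.RAM
        return self.query(gfn)
```

In this model a monitoring tool that only calls query() leaves the cache untouched, while one that goes through lookup_alloc() consumes a cached page per PoD entry it touches, and fails outright if the table was marked PoD before the cache was filled.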