From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Andres Lagar-Cavilla"
Subject: Re: PoD code killing domain before it really gets started
Date: Tue, 7 Aug 2012 07:40:09 -0700
Message-ID:
References:
Reply-To: andres@lagarcavilla.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: xen-devel@lists.xen.org
Cc: george.dunlap@eu.citrix.com, Ian.Jackson@eu.citrix.com, tim@xen.org, JBeulich@suse.com
List-Id: xen-devel@lists.xenproject.org

>>>> On 06.08.12 at 18:03, George Dunlap wrote:
>> I guess there are two problems with that:
>> * As you've seen, apparently dom0 may access these pages before any
>> faults happen.
>> * If it happens that reclaim_single is below the only zeroed page, the
>> guest will crash even when there is reclaim-able memory available.
>>
>> Two ways we could fix this:
>> 1. Remove dom0 accesses (what on earth could be looking at a
>> not-yet-created VM?)
>
> I'm told it's a monitoring daemon, and yes, they are intending to
> adjust it to first query the GFN's type (and don't do the access
> when it's not populated, yet). But wait, I didn't check the code
> when I recommended this - XEN_DOMCTL_getpageframeinfo{2,3}
> also call get_page_from_gfn() with P2M_ALLOC, so would also
> trigger the PoD code (in -unstable at least) - Tim, was that really
> a correct adjustment in 25355:974ad81bb68b? It looks to be a
> 1:1 translation, but is that really necessary? If one wanted to
> find out whether a page is PoD to avoid getting it populated,
> how would that be done from outside the hypervisor? Would
> we need XEN_DOMCTL_getpageframeinfo4 for this?
>
>> 2. Allocate the PoD cache before populating the p2m table
>> 3. Make it so that some accesses fail w/o crashing the guest? I don't
>> see how that's really practical.
>
> What's wrong with telling control tools that a certain page is
> unpopulated (from which they will be able to imply that it's all
> clear from the guest's pov)? Even outside of the current problem,
> I would think that's more efficient than allocating the page. Of
> course, the control tools need to be able to cope with that. And
> it may also be necessary to distinguish between read and
> read/write mappings being established (and for r/w ones the
> option of populating at access time rather than at creation time
> would need to be explored).

I wouldn't be opposed to some form of getpageframeinfo4. It's not just
PoD we are talking about here. Is the page paged out? Is the page
shared? Right now we have global per-domain queries (domaininfo), or
individual per-gfn debug memctls. A batched interface with richer
information would be a blessing for debugging or diagnosis purposes.

The first order of business is exposing the type. Do we really want to
expose the whole range of p2m_* types, or just the "really useful" ones
like is_shared, is_pod, is_paged, is_normal? An argument for the former
is that the mem event interface already pumps the p2m_* type up the
stack.

The other useful bit of information I can think of is the shared ref
count.

My two cents
Andres

>
>> 4. Change the sweep routine so that the lower 2MiB gets swept
>>
>> #2 would require us to use all PoD entries when building the p2m
>> table, thus addressing the mail you mentioned from 25 July*. Given
>> that you don't want #1, it seems like #2 is the best option.
>>
>> No matter what we do, the sweep routine for 4.2 should be re-written
>> to search all of memory at least once (maybe with a timeout for
>> watchdogs), since it's only called in an actual emergency.
>>
>> Let me take a look...
>
> Thanks!
>
> Jan
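
PS: to make the "really useful types" option above a bit more concrete,
here is a rough sketch of what a per-gfn record in a batched reply might
carry, and how a monitoring tool could use it to decide which pages are
safe to touch. Everything here is hypothetical: enum gfn_class, struct
gfn_info, gfn_is_backed() and count_backed() are illustration-only names,
not existing Xen definitions, and treating only normal and shared pages
as "backed" is an assumption on my part.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coarse classes, one per is_* predicate named above.
 * These are NOT existing Xen definitions. */
enum gfn_class {
    GFN_CLASS_NORMAL,
    GFN_CLASS_POD,
    GFN_CLASS_PAGED,
    GFN_CLASS_SHARED,
    GFN_CLASS_OTHER     /* anything the coarse view does not name */
};

/* Hypothetical per-gfn record that a batched query could fill in. */
struct gfn_info {
    uint64_t gfn;
    enum gfn_class cls;
    uint32_t shared_refs;   /* only meaningful for GFN_CLASS_SHARED */
};

/* Assumption: only normal and shared pages are backed by memory right
 * now. PoD, paged-out and unknown pages are treated as not backed, so a
 * tool that skips them never triggers population or page-in. */
static int gfn_is_backed(enum gfn_class cls)
{
    return cls == GFN_CLASS_NORMAL || cls == GFN_CLASS_SHARED;
}

/* Count the entries in a batch that a tool could map without forcing
 * the hypervisor to allocate anything. */
static size_t count_backed(const struct gfn_info *batch, size_t n)
{
    size_t i, backed = 0;

    for (i = 0; i < n; i++)
        if (gfn_is_backed(batch[i].cls))
            backed++;
    return backed;
}
```

Exposing the full p2m_* range instead would replace the small enum with
the raw type value and leave the classification to the caller.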