From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mukesh Rathor
Subject: Re: [RFC 0 PATCH 3/3] PVH dom0: construct_dom0 changes
Date: Mon, 7 Oct 2013 17:58:52 -0700
Message-ID: <20131007175852.6583ea0d@mantra.us.oracle.com>
References: <1380142988-9487-1-git-send-email-mukesh.rathor@oracle.com>
 <1380142988-9487-4-git-send-email-mukesh.rathor@oracle.com>
 <5244064102000078000F69AF@nat28.tlf.novell.com>
 <20130926171737.071f118f@mantra.us.oracle.com>
 <524547CF02000078000F73F7@nat28.tlf.novell.com>
 <20131002175358.5f31579c@mantra.us.oracle.com>
 <524E820002000078000F8C16@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1VTLdd-00082f-4n for xen-devel@lists.xenproject.org;
 Tue, 08 Oct 2013 00:59:01 +0000
In-Reply-To: <524E820002000078000F8C16@nat28.tlf.novell.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: xen-devel , keir.xen@gmail.com
List-Id: xen-devel@lists.xenproject.org

On Fri, 04 Oct 2013 07:53:20 +0100
"Jan Beulich" wrote:

> >>> On 03.10.13 at 02:53, Mukesh Rathor wrote:
> > On Fri, 27 Sep 2013 07:54:39 +0100
> > "Jan Beulich" wrote:
> >
> >> >>> On 27.09.13 at 02:17, Mukesh Rathor wrote:
> >> > On Thu, 26 Sep 2013 09:02:41 +0100 "Jan Beulich" wrote:
> >> >> >>> On 25.09.13 at 23:03, Mukesh Rathor wrote:
> >> >> > +/*
> >> >> > + * Set the 1:1 map for all non-RAM regions for dom 0. Thus, dom0 will have
> >> >> > + * the entire io region mapped in the EPT/NPT.
> >> >> > + *
> >> >> > + * PVH FIXME: The following doesn't map MMIO ranges when they sit above the
> >> >> > + * highest E820 covered address.
> >> >>
> >> >> This absolutely needs fixing before this can go in.
> >> >
> >> > Any suggestions on how to fix it? Mapping all the way to end
> >> > could result in a huge hap table.
> >>
> >> You'll probably need a call down from Dom0 telling you where it
> >> finds/puts MMIO resources. Or perhaps that could be mapped
> >> in on demand from the EPT fault handler (since these regions
> >> shouldn't be subject to DMA, and hence IOMMU faults shouldn't
> >> occur - perhaps that's even a reason to not share page tables
> >> at least in dom0-strict mode)?
> >
> > Thinking about mapping in on demand from the EPT fault handler, how
> > would I know if the access beyond last e820 entry is genuine and
> > not a faulty pte in a buggy guest? Could I consult the mmconfig
> > table (?) or the ACPI table in xen? Any pointers would be
> > helpful... my knowledge runs out quickly here.
>
> You'd have to inspect all the BARs of the devices the domain owns.
> Hence the thought of having Dom0 tell you about those resource
> assignments.
>
> > FWIW, at present pv-ops linux doesn't allow any mmio access beyond
> > the last e820 entry. So, we'd need a fix there too. In my very orig
> > patch, I was updating all IO mappings on demand by putting hook
> > in linux native_pte_update if it was _PAGE_BIT_IOMAP. Another
> > possibility would be do that for any mappings above the last
> > e820 entry. What do you think?
>
> Special casing IOMAP page table creation might be an option, but
> has the downside of allowing kernel bugs to propagate into Xen's
> view of the world.
>
> > For testing purposes, do you have reference for hardware? I don't
> > see any here with such configuration.
>
> Nothing specific, but I know that SR-IOV virtual functions easily
> cause kernels to run out of MMIO space below 4G (namely when
> the hole is only around 1Gb or even less), and Intel must have
> knowledge of graphics cards having so huge a frame buffer that
> it can only be mapped above 4G.

In that case, I don't see why this is a MUST for the patch.
Combined with the fact that at present pv-ops doesn't even allow
mapping above the last e820 entry, I think this can be left as a
FIXME and fixed in the near future. An expert in that area could do
that relatively quickly if learning it takes me a long time.

thanks
Mukesh
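
For illustration only, here is a minimal C sketch of the two ideas discussed in this thread: listing the non-RAM ranges below the highest E820-covered address (what the quoted comment says gets a 1:1 map), and checking whether an address above that point lies inside a BAR of a device the domain owns (the validation Jan describes). The types `e820_entry` and `range` and the functions `emit`, `non_ram_ranges` and `addr_in_owned_bar` are all hypothetical, simplified stand-ins. They are not Xen's structures or the patch's code, and the sketch assumes the BAR list is already available, for example reported by Dom0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types for illustration; not Xen's own. */
struct e820_entry { uint64_t addr, size; bool is_ram; };
struct range { uint64_t start, end; };  /* half-open: [start, end) */

/* Append [start, end) to out[] if non-empty and there is room. */
static size_t emit(struct range *out, size_t n, size_t max,
                   uint64_t start, uint64_t end)
{
    if (start < end && n < max) {
        out[n].start = start;
        out[n].end = end;
        n++;
    }
    return n;
}

/*
 * List every non-RAM range below the highest E820-covered address:
 * gaps between entries plus entries that are not RAM. The map is
 * assumed sorted by address and non-overlapping. Nothing above the
 * last entry is reported, which is the limitation the FIXME names.
 */
static size_t non_ram_ranges(const struct e820_entry *map, size_t nr,
                             struct range *out, size_t max)
{
    uint64_t cursor = 0;
    size_t n = 0;

    for (size_t i = 0; i < nr; i++) {
        uint64_t end = map[i].addr + map[i].size;

        n = emit(out, n, max, cursor, map[i].addr);   /* gap before entry */
        if (!map[i].is_ram)
            n = emit(out, n, max, map[i].addr, end);  /* non-RAM entry */
        cursor = end;
    }
    return n;
}

/* True if addr lies inside a BAR range of a device the domain owns. */
static bool addr_in_owned_bar(const struct range *bars, size_t nr,
                              uint64_t addr)
{
    for (size_t i = 0; i < nr; i++)
        if (addr >= bars[i].start && addr < bars[i].end)
            return true;
    return false;
}
```

Under these assumptions, an on-demand fault handler could call something like `addr_in_owned_bar()` for a faulting address above the last E820 entry and map it only when the check passes, so a faulty pte in a buggy guest would not be turned into a mapping.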