From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wei Wang Subject: Re: [PATCH] dump_p2m_table: For IOMMU Date: Fri, 10 Aug 2012 12:50:48 +0200 Message-ID: <5024E788.80300@amd.com> References: <8deb7c7a25c4a3bc50d7.1344446255@REDBLD-XS.ad.xensource.com> <5023822E0200007800093D2A@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5023822E0200007800093D2A@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: tim@xen.org, xiantao.zhang@intel.com, Santosh Jodh , xen-devel List-Id: xen-devel@lists.xenproject.org On 08/09/2012 09:26 AM, Jan Beulich wrote: > Wei - here I'm particularly worried about the use of "level - 1" > instead of "next_level", which would similarly apply to the > original function. If the way this is currently done is okay, then > why is next_level being computed in the first place? I think that recalculation is to guarantee that this recursive function returns. It should run at most "paging_mode" times no matter what "next_level" says. But if we could assume that next level field in every pde is correct, then using next level is fine to me. (And similar > to the issue Santosh has already fixed here - the original > function pointlessly maps/unmaps the page when "level<= 1". > Furthermore, iommu_map.c has nice helper functions > iommu_next_level() and amd_iommu_is_pte_present() - why > aren't they in a header instead, so they could be used here, > avoiding the open coding of them?) Maybe those helps appears after the original function. I could sent a patch to clean up these: * do not map/unmap if level <= 1 * move amd_iommu_is_pte_present() and iommu_next_level() to a header file. and use them in deallocate_next_page_table. * Using next_level instead of recalculation (if requested) Thanks, Wei >> + } >> + >> + if ( present ) >> + { >> + printk("gfn: %016"PRIx64" mfn: %016"PRIx64"\n", >> + address>> PAGE_SHIFT, next_table_maddr>> PAGE_SHIFT); > > I'd prefer you to use PFN_DOWN() here. > > Also, depth first, as requested by Tim, to me doesn't mean > recursing before printing. I think you really want to print first, > then recurse. Otherwise how would be output be made sense > of? > >> + } >> + } >> + >> + unmap_domain_page(table_vaddr); >> +} >> ... >> --- a/xen/drivers/passthrough/iommu.c Tue Aug 07 18:37:31 2012 +0100 >> +++ b/xen/drivers/passthrough/iommu.c Wed Aug 08 09:56:50 2012 -0700 >> @@ -54,6 +55,8 @@ bool_t __read_mostly amd_iommu_perdev_in >> >> DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb); >> >> +void setup_iommu_dump(void); >> + > > This is completely bogus. If the function was called from another > source file, the declaration would belong into a header file. Since > it's only used here, it ought to be static. > >> static void __init parse_iommu_param(char *s) >> { >> char *ss; >> @@ -119,6 +122,7 @@ void __init iommu_dom0_init(struct domai >> if ( !iommu_enabled ) >> return; >> >> + setup_iommu_dump(); >> d->need_iommu = !!iommu_dom0_strict; >> if ( need_iommu(d) ) >> { >> ... >> +void __init setup_iommu_dump(void) >> +{ >> + register_keyhandler('o',&iommu_p2m_table); >> +} > > Furthermore, there's no real need for a separate function here > anyway. Just call register_key_handler() directly. Or > alternatively this ought to match other code doing the same - > using an initcall. > >> --- a/xen/drivers/passthrough/vtd/iommu.c Tue Aug 07 18:37:31 2012 +0100 >> +++ b/xen/drivers/passthrough/vtd/iommu.c Wed Aug 08 09:56:50 2012 -0700 >> +static void vtd_dump_p2m_table_level(u64 pt_maddr, int level, u64 gpa) >> +{ >> + u64 address; > > Again, both gpa and address ought to be paddr_t, and the format > specifiers should match. > >> + int i; >> + struct dma_pte *pt_vaddr, *pte; >> + int next_level; >> + >> + if ( pt_maddr == 0 ) >> + return; >> + >> + pt_vaddr = (struct dma_pte *)map_vtd_domain_page(pt_maddr); > > Pointless cast. > >> + if ( pt_vaddr == NULL ) >> + { >> + printk("Failed to map VT-D domain page %016"PRIx64"\n", pt_maddr); >> + return; >> + } >> + >> + next_level = level - 1; >> + for ( i = 0; i< PTE_NUM; i++ ) >> + { >> + if ( !(i % 2) ) >> + process_pending_softirqs(); >> + >> + pte =&pt_vaddr[i]; >> + if ( !dma_pte_present(*pte) ) >> + continue; >> + >> + address = gpa + offset_level_address(i, level); >> + if ( next_level>= 1 ) >> + vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level, address); >> + >> + if ( level == 1 ) >> + printk("gfn: %016"PRIx64" mfn: %016"PRIx64" superpage=%d\n", >> + address>> PAGE_SHIFT_4K, pte->val>> PAGE_SHIFT_4K, dma_pte_superpage(*pte)? 1 : 0); > > Why do you print leaf (level 1) tables here only? > > And the last line certainly is above 80 chars, so needs breaking up. > > (Also, just to avoid you needing to do another iteration: Don't > switch to PFN_DOWN() here.) > > I further wonder whether "superpage" alone is enough - don't we > have both 2M and 1G pages? Of course, that would become mute > if higher levels got also dumped (as then this knowledge is implicit). > > Which reminds me to ask that both here and in the AMD code the > recursion level should probably be reflected by indenting the > printed strings. > > Jan >