From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15])
	(using TLSv1 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by lists.ozlabs.org (Postfix) with ESMTPS id 796EF1A0178
	for ; Thu, 10 Jul 2014 23:17:19 +1000 (EST)
Message-ID: <53BE925C.2030401@suse.de>
Date: Thu, 10 Jul 2014 15:17:16 +0200
From: Alexander Graf 
MIME-Version: 1.0
To: Mel Gorman 
Subject: Re: [PATCH v2] Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8
References: <1404437035-4336-1-git-send-email-stewart@linux.vnet.ibm.com>
	<1404795988-9892-1-git-send-email-stewart@linux.vnet.ibm.com>
	<53BBCAC7.90904@suse.de> <53BE738B.1010100@suse.de>
	<20140710130716.GQ25275@novell.com>
In-Reply-To: <20140710130716.GQ25275@novell.com>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Cc: Stewart Smith , paulus@samba.org, linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,

On 10.07.14 15:07, Mel Gorman wrote:
> On Thu, Jul 10, 2014 at 01:05:47PM +0200, Alexander Graf wrote:
>> On 09.07.14 00:59, Stewart Smith wrote:
>>> Hi!
>>>
>>> Thanks for review, much appreciated!
>>>
>>> Alexander Graf writes:
>>>> On 08.07.14 07:06, Stewart Smith wrote:
>>>>> @@ -1528,6 +1535,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
>>>>>  	int i, need_vpa_update;
>>>>>  	int srcu_idx;
>>>>>  	struct kvm_vcpu *vcpus_to_update[threads_per_core];
>>>>> +	phys_addr_t phy_addr, tmp;
>>>> Please put the variable declarations into the if () branch so that the
>>>> compiler can catch potential leaks :)
>>> ack. will fix.
>>>
>>>>> @@ -1590,9 +1598,48 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
>>>>>  	srcu_idx = srcu_read_lock(&vc->kvm->srcu);
>>>>> +	/* If we have a saved list of L2/L3, restore it */
>>>>> +	if (cpu_has_feature(CPU_FTR_ARCH_207S) && vc->mpp_buffer) {
>>>>> +		phy_addr = virt_to_phys((void *)vc->mpp_buffer);
>>>>> +#if defined(CONFIG_PPC_4K_PAGES)
>>>>> +		phy_addr = (phy_addr + 8*4096) & ~(8*4096);
>>>> get_free_pages() is automatically aligned to the order, no?
>>> That's what Paul reckoned too, and then we've attempted to find anywhere
>>> that documents that behaviour. Happen to be able to point to docs/source
>>> that say this is part of the API?
>> Phew - it's probably buried somewhere. I could only find this
>> document saying that we always get order-aligned allocations:
>>
>> http://www.thehackademy.net/madchat/ebooks/Mem_virtuelle/linux-mm/zonealloc.html
>>
>> Mel, do you happen to have any pointer to something that explicitly
>> (or even properly implicitly) says that get_free_pages() returns
>> order-aligned memory?
>>
> I did not read the whole thread so I lack context and will just answer
> this part.
>
> There is no guarantee that pages are returned in PFN order for multiple
> requests to the page allocator. This is the relevant comment in
> rmqueue_bulk:
>
> 	/*
> 	 * Split buddy pages returned by expand() are received here
> 	 * in physical page order. The page is added to the caller's
> 	 * list and the list head then moves forward. From the caller's
> 	 * perspective, the linked list is ordered by page number in
> 	 * some conditions. This is useful for IO devices that can
> 	 * merge IO requests if the physical pages are ordered
> 	 * properly.
> 	 */
>
> It will probably be true early in the lifetime of the system but the mileage
> will vary on systems with a lot of uptime. If you depend on this behaviour
> for correctness then you will have a bad day.
>
> High-order page requests to the page allocator are guaranteed to be in physical
> order.
> However, this does not apply to vmalloc() where allocations are
> only guaranteed to be virtually contiguous.

Hrm, ok, to be very concrete: does __get_free_pages(..., 4) on a 4k page
size system give me a 64k-aligned pointer? :)


Alex