From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Chen, Tiejun" Subject: Re: [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges Date: Tue, 23 Jun 2015 17:46:28 +0800 Message-ID: <55892AF4.8040803@intel.com> References: <1433985325-16676-1-git-send-email-tiejun.chen@intel.com> <1433985325-16676-8-git-send-email-tiejun.chen@intel.com> <557A9000.600@intel.com> <557FEC6A.1080003@intel.com> <55800B21020000780008561E@mail.emea.novell.com> <55811D50.2070002@intel.com> <55813BA60200007800085D93@mail.emea.novell.com> <558127B7.5060800@intel.com> <5581466C0200007800085DEC@mail.emea.novell.com> <55812F2D.2030504@intel.com> <558153D30200007800085E77@mail.emea.novell.com> <55813B61.6020206@intel.com> <558158D30200007800085EEB@mail.emea.novell.com> <5582627E.3010909@intel.com> <558281590200007800086672@mail.emea.novell.com> <55826CD7.3070502@intel.com> <558297D6020 000780008671D@mail.emea.novell.com> <5583781B.6020101@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5583781B.6020101@intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Kevin Tian , "wei.liu2@citrix.com" , "ian.campbell@citrix.com" , "tim@xen.org" , "Ian.Jackson@eu.citrix.com" , "xen-devel@lists.xen.org" , "stefano.stabellini@citrix.com" , "andrew.cooper3@citrix.com" , Yang Z Zhang List-Id: xen-devel@lists.xenproject.org On 2015/6/19 10:02, Chen, Tiejun wrote: > On 2015/6/18 16:05, Jan Beulich wrote: >>>>> On 18.06.15 at 09:01, wrote: >>> On 2015/6/18 14:29, Jan Beulich wrote: >>>>>>> On 18.06.15 at 08:17, wrote: >>>>> On 2015/6/17 17:24, Jan Beulich wrote: >>>>>>>>> On 17.06.15 at 11:18, wrote: >>>>>>> On 2015/6/17 17:02, Jan Beulich wrote: >>>>>>>>>>> On 17.06.15 at 10:26, wrote: >>>>>>>>> Something hits me to generate another idea, >>>>>>>>> >>>>>>>>> #1. Still allocate all devices as before. >>>>>>>>> #2. Lookup all actual bars to check if they're conflicting RMRR >>>>>>>>> >>>>>>>>> We can skip these bars to keep zero. Then later it would make >>>>>>>>> lookup easily. >>>>>>>>> >>>>>>>>> #3. Need to reallocate these conflicting bars. >>>>>>>>> #3.1 Trying to reallocate them with the remaining resources >>>>>>>>> #3.2 If the remaining resources aren't enough, we need to >>>>>>>>> allocate them >>>>>>>>> from high_mem_resource. >>>>>>>> >>>>>>>> That's possible onyl for 64-bit BARs. >>>>>>> >>>>>>> You're right so this means its not proper to adjust mmio_total to >>>>>>> include conflicting reserved ranges and finally moved all >>>>>>> conflicting >>>>>>> bars to high_mem_resource as Kevin suggested previously, so i high >>>>>>> level, we still need to decrease pci_mem_start to populate more >>>>>>> RAM to >>>>>>> compensate them as I did, right? >>>>>> >>>>>> You probably should do both: Prefer moving things beyond 4Gb, >>>>>> but if not possible increase the MMIO hole. >>>>>> >>>>> >>>>> I'm trying to figure out a better solution. Perhaps we can allocate >>>>> 32-bit bars and 64-bit bars orderly. This may help us bypass those >>>>> complicated corner cases. >>>> >>>> Dealing with 32- and 64-bit BARs separately won't help at all, as >>> >>> More precisely I'm saying to deal with them orderly. >>> >>>> there may only be 32-bit ones, or the set of 32-bit ones may >>>> already require you to do re-arrangements. Plus, for compatibility >>> >>> Yes but I don't understand they are specific cases to my idea. >> >> Perhaps the problem is that you don't say what "orderly" is supposed >> to mean here? > > You're right. Here when "separately" vs "orderly", I should definitely > use "orderly" to make more understandable. > >> >>>> reasons (just like physical machines' BIOSes do) avoiding to place >>>> MMIO above 4Gb where possible is still a goal. >>> >>> So are you sure you see my idea completely? I don't intend to expand pci >>> memory above 4GB. >>> >>> Let me clear this simply, >>> >>> #1. I'm still trying to allocate all 32bit bars from >>> [pci_mem_start,pci_mem_end] as before. >>> >>> #2. But [pci_mem_start,pci_mem_end] mightn't enough cover all 32bit-bars >>> again because of RMRR, right? So I will populate RAM to push downward at >>> cur_pci_mem_start ( = pci_mem_start - reserved device memory), then >>> allocate the remaining 32bit-bars from [cur_pci_mem_start , >>> pci_mem_start] >>> >>> #3. Then I'm still trying to allocate 64bit-bars from >>> [pci_mem_start,pci_mem_end], unless its not enough. This is just going >>> to follow the original. >>> >>> So anything is breaking that goal? >> >> Maybe not, from the above. >> >>> And overall, its same as the original. >> >> If the model follows the original, what's the point of outlining >> supposed changes to the model? All I'm trying to understand is how > > Its not same completely, or let me change this statement, "same" -> > "similar". > >> you want to change the current code to accommodate the not >> aligned reserved memory regions. If everything is the same as >> before, this can't have been taken care of. If something is different >> from the original, that's what needs spelling out (and nothing else, >> as that would only clutter the picture). >> >>>> Doesn't look like the right approach to me. As said before, I think >>> >>> Could you see what I'm saying again? I just feel you don't understand >>> what you mean. If you still think I'm wrong let me know. >> >> I think I understand what _I_ mean, but I'm indeed unsure I see >> what _you_ mean. Part of the problem is that you toggle between >> sending (incomplete) patches, code fragments, and discussing the >> approach verbally. I'd much prefer if either you started with a clear >> picture of what you intend to implement, or with an implementation >> that at least attempts to take care of all the corner cases (showing >> that you understand what the corner cases are, which so far I'm >> getting the - perhaps false - impression that you don't). >> > > Based on my previous recognition and our recent discussion, my current > understanding can be summarized as follows; > > #1. Goal > > MMIO region should exclude all reserved device memory > > #2. Requirements > > #2.1 Still need to make sure MMIO region is fit all pci devices as before > > #2.2 Accommodate the not aligned reserved memory regions > > If I'm missing something let me know. > > #3. How to > > #3.1 Address #2.1 > > We need to either of populating more RAM, or of expanding more highmem. > But we should know just 64bit-bar can work with highmem, and as you > mentioned we also should avoid expanding highmem as possible. > > So my implementation is to allocate 32bit-bar and 64bit-bar orderly. > > 1>. The first allocation round just to 32bit-bar > > If we can finish allocating all 32bit-bar, we just go to allocate > 64bit-bar with all remaining resources including low pci memory. > > If not, we need to calculate how much RAM should be populated to > allocate the remaining 32bit-bars, then populate sufficient RAM as > exp_mem_resource to go to the second allocation round 2>. > > 2>. The second allocation round to the remaining 32bit-bar > > We should can finish allocating all 32bit-bar in theory, then go to the > third allocation round 3>. > > 3>. The third allocation round to 64bit-bar > > We'll try to first allocate from the remaining low memory resource. If > that isn't enough, we try to expand highmem to allocate for 64bit-bar. > This process should be same as the original. > > I think my pasted patch should represent this framework above. > > #3.2 Address #2.2 > > As you said, we need to accommodate the not aligned reserved memory > regions. I didn't consider this more previously. > > Now I'm trying to rewrite that chunk of code fragment to address this > point like this, > > - base = (resource->base + bar_sz - 1) & ~(uint64_t)(bar_sz - 1); > + reallocate_bar: > + base = (resource->base + bar_sz - 1) & ~(uint64_t)(bar_sz - 1); > bar_data |= (uint32_t)base; > bar_data_upper = (uint32_t)(base >> 32); > + /* > + * We should skip all reserved device memory, but we also need > + * to check if other smaller bars can be allocated if a mmio hole > + * exists between resource->base and reserved device memory. > + */ > + for ( j = 0; j < memory_map.nr_map ; j++ ) > + { > + if ( memory_map.map[j].type != E820_RAM ) > + { > + reserved_start = memory_map.map[i].addr; > + reserved_size = memory_map.map[i].size; > + reserved_end = reserved_start + reserved_size; > + if ( check_overlap(base, bar_sz, > + reserved_start, reserved_size) ) > + { > + /* > + * If a hole exists between base and reserved device > + * memory, lets go out simply to try allocate for next > + * bar since all bars are in descending order of size. > + */ > + if ( resource->base < reserved_start ) > + continue; > + /* > + * If not, we need to move resource->base to > + * reserved_end just to reallocate this bar. > + */ > + else > + { > + resource->base = reserved_end; > + goto reallocate_bar; > + } > + } > + } > + } > base += bar_sz; > > Just let me include this into next series to seek further review or discussion. Thanks Tiejun