From mboxrd@z Thu Jan 1 00:00:00 1970
From: PUCCETTI Armand
Subject: Re: Xen-devel Digest, Vol 25, Issue 93
Date: Mon, 12 Mar 2007 17:10:45 +0100
Message-ID: <45F57B85.30800@cea.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

>> When the system boots, the processor is normally in "real-mode", and
>> it's definitely not got paging enabled. So we have to "make the guest
>> OS believe this is the case". But at the same time, the guest OS is
>> most likely not loaded at address zero in memory, so we need paging
>> enabled to remap the GUEST PHYSICAL address to match the machine
>> physical address. So we have a "linear map" to translate the "address
>> zero" to the "start of guest memory", and so on for every page of
>> memory in the guest.
>>
>> This is not hard to do, since the AMD-V/VT feature of the processor
>> expects the paging-bit to be different between what the guest "thinks"
>> and the actual case. In AMD-V, there's even support to run real-mode
>> with paging enabled, so all the BIOS-code and such will be running in
>> this mode. VT has to do a bunch of tricky stuff to work around that
>> problem.
>>
>> Ok fine, does this argument hold true even for non-VT and
>> non-Pacifica enabled processors?
>> I doubt it.
>>
>
> Not precisely. I'm talking only about HVM mode, which is "full
> virtualization". PV-mode uses a different paging interface, which, for
> the most part, consists of changing the whole area of code in the
> kernel that updates the page-tables, by adding code that is aware of the
> THREE types of address (guest-virtual, guest-physical and
> machine-physical).
> This means that there's no real need for the "read-only page-tables"
> and "shadow-mode" - the page-table just contains the right value for
> the machine-physical address. [That's not to say that read-only
> page-tables can't be used in a PV system too - I'm not 100% sure how
> the page-table management works in the PV mode].
>

That is very interesting info on the paging system. Mats, could you
please explain a bit how PV paging works?

How do the guest+host page tables work together? What does the guest
page table point to, i.e. how+when is it mapped onto the host page
table?

I have seen in the code that there are different cases of guest+host
page-table heights. Why?

Thanks.

Armand

>>> I hope I made myself clear.
>>> Please enlighten me :-).
>>>
>>> When paging is enabled, we use a shadow page-table, which is
>>> essentially that the GUEST sees one page-table, and the processor
>>> another (thanks to the fact that the hypervisor intercepts the CR3
>>> read/write operations, and when CR3 is read back by the guest, we
>>> don't send back the value it's ACTUALLY POINTING TO IN THE
>>> PROCESSOR, but the value that was set by the guest). So there are
>>> two page-tables.
>>>
>>> Got this well, thanks Mats :).
>>>
>>> To make the page-table updates by the guest visible to the
>>> hypervisor, all of the guest-page-tables are made read-only (by
>>> scanning the new CR3 value whenever one is set).
>>>
>>> I didn't get this well either :(
>>> Sorry, but do you mean CR3 for the guest or for the processor? I
>>> hope you mean guest?
>>>
>> Yes, scan the guest-CR3 to see where it placed the page-tables.
>>
>>> Whenever a page-fault happens, the hypervisor has "first look", and
>>> determines if the update is for a page-table or not.
>>> If it is a page-table update, the guest operation is emulated (in
>>> x86_emulate.c), and the result is written to the shadow-page-table
>>> AND the
>>>
>>> Why do we need emulation? Is there some peculiar reason for
>>> emulating? Do you mean to say that if I am running a 32 bit domU on
>>> top of a 64 bit processor, the guest operation for updating the
>>> page table is emulated by the hypervisor? Am I right?
>>>
>> No, it's simply because we need to see the result of the instruction
>> and write it to two places (with some modification in one of those
>> places). So if the code is doing, for example: "*pte |= 1;" (set a
>> page-table-entry to "present"), we need to mark both the
>> guest-page-table-entry "present", and mark our shadow-entry "present"
>> (and perhaps do some other work too, but that's the minimum work
>> needed).
>>
>> This brings one more question to my mind. Why do we use pinning then?
>>
>
> I believe there are two types of pinning! Page-pinning, which is
> blocking a page from being accessed in an incorrect way [again, I'm not
> 100% sure how this works, or exactly what it does - just that it's a
> term used in the general way I described in the previous sentence].
>
>> As I see it: to avoid shadow page tables being swapped out before the
>> page tables they actually point to are swapped. Am I right?
>>
>> But according to the interface manual, we use pinning to bind a vcpu
>> to a specific CPU in an SMP environment. These two look like pretty
>> orthogonal statements to me, which means I may be wrong :(.
>> Can somebody help me in this regard?
>>
>
> CPU pinning is to tie a VCPU to a (set of) processor(s). For example,
> you may want to pin Dom0 to run only on CPU0, and pin a DomU to run on
> CPUs 1, 2 and 3. That way, Dom0 is ALWAYS able to run on its own CPU,
> and it's never in contention about which CPU to use, and DomU can run on
> three CPUs as much as it likes. You could have another DomU pinned to
> CPU 3 if you wish.
> That means that CPUs 1 and 2 are exclusively for the first DomU,
> whilst the second DomU shares CPU 3 with the first DomU (so they both
> get half the performance of that one CPU - on average over a
> reasonable amount of time).
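
For anyone following the thread, here is a minimal C sketch of the emulated page-table write discussed above (the "*pte |= 1;" example), combined with the "linear map" from guest-physical to machine-physical frames. It is a toy model under stated assumptions, not Xen code: the names gfn_to_mfn, shadow_from_guest and emulate_pte_or, the 16-page guest, the base machine frame 0x4000, and the simplified entry layout (frame number above bit 12, flags in the low 12 bits) are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PTE_PRESENT  0x1ULL
#define PTE_FLAGS    0xFFFULL   /* toy layout: low 12 bits are flags */
#define GUEST_PAGES  16         /* hypothetical guest size, in pages */

/* Hypothetical linear map: guest frame N lives at machine frame base + N. */
static const uint64_t machine_base_mfn = 0x4000;

static uint64_t gfn_to_mfn(uint64_t gfn)
{
    assert(gfn < GUEST_PAGES);  /* the guest may only name its own frames */
    return machine_base_mfn + gfn;
}

/*
 * Build the shadow entry for a guest entry: same flags, but the frame
 * number is translated from guest-physical to machine-physical.
 * A not-present guest entry gets an empty shadow entry.
 */
static uint64_t shadow_from_guest(uint64_t guest_pte)
{
    if (!(guest_pte & PTE_PRESENT))
        return 0;

    uint64_t gfn   = guest_pte >> PAGE_SHIFT;
    uint64_t flags = guest_pte & PTE_FLAGS;
    return (gfn_to_mfn(gfn) << PAGE_SHIFT) | flags;
}

/*
 * Emulate the faulting guest instruction "*pte |= bits": the result is
 * written to the guest's own page-table entry unchanged, and to the
 * shadow entry with the frame number modified.
 */
static void emulate_pte_or(uint64_t *guest_pte, uint64_t *shadow_pte,
                           uint64_t bits)
{
    *guest_pte |= bits;
    *shadow_pte = shadow_from_guest(*guest_pte);
}
```

With a guest entry naming guest frame 3, calling emulate_pte_or(&guest, &shadow, PTE_PRESENT) leaves the guest entry still naming frame 3 and the shadow entry naming machine frame 0x4003, both marked present. That is the "write it to two places, with some modification in one of those places" step in its smallest form.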