Message-ID: <492CC1F9.5050408@gmx.net>
Date: Wed, 26 Nov 2008 04:26:49 +0100
From: Carl-Daniel Hailfinger
Subject: Re: [Qemu-devel] Modeling x86 early initialization accurately
References: <492C80BF.4010103@gmx.net> <492CAEC8.4010306@codemonkey.ws>
In-Reply-To: <492CAEC8.4010306@codemonkey.ws>
Reply-To: qemu-devel@nongnu.org
To: qemu-devel@nongnu.org

On 26.11.2008 03:04, Anthony Liguori wrote:
> Carl-Daniel Hailfinger wrote:
>> current svn HEAD of QEMU assumes all RAM is available directly at x86
>> CPU startup. The ability to lock processor caches to function as RAM
>> (Cache-as-RAM) is unimplemented as well.
>> While that does make it easier for the shipped BIOS to set up working
>> RAM (i.e. it does nothing about that right now), that simplification
>> reduces the ability to run alternative firmwares for x86 in QEMU.
>> coreboot (a free x86 firmware/BIOS replacement) is unable to use
>> standard x86 early initialization because the MSRs for cache control
>> (MTRRs) are completely unimplemented and ignored.
>> Modeling ACPI S3 (Suspend-to-RAM) suffers from similar issues.
>>
>> Things which need to be changed to model x86 better:
>> - Start up with all RAM being readonly. Writes should be discarded,
>> reads will usually return 0xff or be undefined. The "undefined" variant
>> would allow the code to allocate RAM once and just switch write access
>> on/off.
>>
>
> This is pretty reasonable.

So this would be my first patch, together with a patch to change the
allocation to read/write once a special MSR is written.
Is it possible to change the type of allocation from readonly to
read/write if the backing store has been allocated with
qemu_ram_alloc()? Can I simply call cpu_register_physical_memory() again
for the same target region, and will the newer registration take
precedence?
Is the "special MSR" solution acceptable? If yes, which number should I
pick? Or is that my choice?

>> - Support MTRRs.
>> -- Mention MTRR support in CPUID.
>> -- I sent a patch to dump unknown MSR accesses in general and MTRR
>> reads/writes in particular. The subject was "[Qemu-devel] [PATCH] x86
>> MTRR access dumping".
>>
>
> Yes, I saw this patch but since it's just debugging code, it's not
> interesting for inclusion.

Quite a few x86 processors reset themselves if they encounter an unknown
MSR write. Should we do the same? If not, would spewing a loud debug
message be appropriate?

>> -- It is not really needed to completely implement L1/L2 caches, but the
>> ability to lock the cache with the help of MTRRs should be available.
>> Areas with active locked cache do not send writes down to the RAM which
>> is still readonly. The cache locking is done on a per-page basis (or
>> even larger granularity), so it should be easier than having to handle
>> single cache lines.
>>
>
> I'm concerned that modeling this could have a non negligible overhead
> and could be very difficult in something like KVM. Can you describe
> exactly what coreboot is expecting that we are not implementing?
> How is it relying on cache locking?

Since there is no RAM before RAM initialization, we have no way to keep
a stack. That rules out implementing RAM init in C (which is fond of
using a stack for local variables, parameters and call return addresses)
unless you either can fake some RAM or have a C compiler which needs no
stack. Faking some RAM is way easier.

Basically, we use MTRRs to declare everything uncached except one small
area (4-64 KiB, with page granularity) in the CPU address space which
has cache type writeback. That area is called the CAR (Cache-as-RAM)
area.
Reads in that area will allocate a cache line and subsequent reads will
hit the cache directly.
Writes in that area will allocate a cache line if none already exists
for the given address. Writes to the area will never be passed to RAM.
Reads and writes outside the CAR area bypass the cache and go directly
to RAM/ROM. Since RAM is not yet writable at that point, such writes are
discarded.
Since everything besides the CAR area is declared as uncached and any
access outside the CAR area won't cause cacheline evictions, the cache
is effectively locked.

From a firmware perspective, the following implementation is good
enough:
1. CAR enable: Copy the contents of the address area designated for CAR
from the underlying (readonly RAM/ROM) backing store to a new "CAR"
read/write backing store mapped to the same CPU physical address area.
2. CAR usage: All reads/writes to the CAR area hit the "CAR" read/write
backing store. All other reads outside the CAR area hit the normal
backing store. All writes outside the CAR area are discarded if they
would have ended up in RAM. Writes to MMIO regions are still honored.
3. RAM enabling: The backing store for RAM outside the CAR area now
accepts writes.
4. CAR disabling: The "CAR" backing store is either discarded (INVD
instruction) or written to RAM (WBINVD instruction).

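
In case it helps, here is a minimal sketch of the four steps above as a
toy Python model. This is not QEMU code: ToyMachine and all of its
method names are made up for illustration, MMIO is not modelled, and
having not-yet-writable RAM read back as 0xff is just one of the two
options mentioned earlier. Dropping the WBINVD write-back while RAM is
still readonly is likewise only an assumption.

```python
class ToyMachine:
    """Hypothetical toy model of the CAR steps; not QEMU's memory API."""

    def __init__(self, size):
        self.ram = bytearray(b"\xff" * size)  # normal backing store
        self.ram_writable = False             # RAM starts out readonly
        self.car = None                       # (base, store) while CAR is on

    def _in_car(self, addr):
        if self.car is None:
            return False
        base, store = self.car
        return base <= addr < base + len(store)

    def car_enable(self, base, length):
        # Step 1: copy the underlying contents into a separate
        # read/write store covering the same physical addresses.
        self.car = (base, bytearray(self.ram[base:base + length]))

    def read(self, addr):
        # Step 2: reads inside the CAR area hit the CAR store,
        # all other reads hit the normal backing store.
        if self._in_car(addr):
            base, store = self.car
            return store[addr - base]
        return self.ram[addr]

    def write(self, addr, value):
        # Step 2: writes inside the CAR area never reach RAM.
        if self._in_car(addr):
            base, store = self.car
            store[addr - base] = value
        elif self.ram_writable:
            self.ram[addr] = value
        # Otherwise the write is silently discarded.

    def ram_enable(self):
        # Step 3: RAM outside the CAR area now accepts writes.
        self.ram_writable = True

    def car_disable(self, write_back):
        # Step 4: write_back=True models WBINVD, False models INVD.
        base, store = self.car
        if write_back and self.ram_writable:  # assumption, see above
            self.ram[base:base + len(store)] = store
        self.car = None


# Usage: early stack in CAR, then RAM init, then leaving CAR.
m = ToyMachine(0x10000)
m.car_enable(0x8000, 0x1000)
m.write(0x8004, 0xAB)   # lands in the CAR store only
m.write(0x0100, 0x12)   # discarded, RAM is still readonly
m.ram_enable()
m.car_disable(write_back=True)
```

Note that read() and write() only look at state which changes in
car_enable(), ram_enable() and car_disable(). In a real implementation
those three would correspond to remapping memory regions on the relevant
MSR writes rather than to a check on every access.
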
The runtime performance hit of this implementation should be negligible
because there is no need to check for CAR on each memory access. Only
the relevant MSR writes need to be handled to change allocation type.
Once CAR is disabled, the memory allocation and mapping should match
exactly what the current code does. That means any performance hit would
only matter during the time CAR is active. That's probably a few hundred
instructions after power-on until RAM is enabled.

>> - Decide what to do for RAM initialization. Do we switch RAM into
>> read-write mode by a simple QEMU-specific MSR write? Do we want to
>> implement all memory initialization hardware instead?
>> - Adapt the currently shipped BIOS to these tasks and/or switch to
>> coreboot+SeaBIOS.
>>
>
> BTW, I'd love to switch to something like coreboot but the legacy BIOS
> support payload is too incomplete. SeaBIOS is a good option too but
> it needs some heavy regression testing first.

SeaBIOS is now the official coreboot payload for any legacy BIOS needs.
The SeaBIOS maintainer is pretty responsive to bug reports, so I think
it will be working well. After all, SeaBIOS as a coreboot payload can
boot Windows XP in QEMU and on some real hardware (the number of testers
is rather limited).

>> I'm willing to do most of the work if I know that this won't be rejected
>> outright.
>>
>
> In general, better modeling of processor modes, provided that there
> isn't a regression in performance, is a good thing. Dividing the
> effort into incremental bits that are posted early for inclusion is
> also a good thing.

Thanks, I'll heed your advice.

I hope the explanations I gave were precise and understandable enough.
If anything seems unclear or incomplete, feel free to ask. x86 CAR is
difficult to explain and understand and there are almost no public docs
about it.

Regards,
Carl-Daniel

> Regards,
>
> Anthony Liguori

-- 
http://www.hailfinger.org/