From: Avi Kivity
Date: Mon, 08 Feb 2010 10:43:18 +0200
Subject: Re: [Qemu-devel] [Patch] Support translating Guest physical address to Host virtual address.
To: Anthony Liguori
Cc: "lmr@redhat.com", "Li, Haicheng", Max Asbock, "qemu-devel@nongnu.org", "Zheng, Jiajia", "You, Yongkang", "Kleen, Andi"

On 02/08/2010 12:09 AM, Anthony Liguori wrote:
> On 02/07/2010 10:31 AM, Avi Kivity wrote:
>>> Only insofar as you don't have to deal with getting at the VM fd.
>>> You can avoid the problem by having the kvm ioctl interface take a
>>> pid or something.
>>
>> That's a racy interface.
>
> The mechanism itself is racy.  That said, pids don't recycle very
> quickly, so the chances of running into a practical issue are quite
> small.

While a low probability of a race is acceptable for a test tool, it
isn't for a kernel interface.

>> Well, we need to provide a reasonable alternative.
>
> I think this is the sort of thing that really needs to be a utility
> that lives outside of qemu.  I'm absolutely in favor of exposing
> enough internals to let people do interesting things, provided it's
> reasonably correct.

I agree that's desirable.  However, in light of the changeable
gpa->hva->hpa mappings, this may not be feasible.

>> One alternative might be to use -mem-path (which is hacky by itself,
>> but so far we have no alternative) and use an external tool on the
>> memory object to poison it.  An advantage is that you can use it
>> independently of kvm.
>
> It would help if the actual requirements were spelled out a bit more.
> What exactly needs validating?  Do we need to validate that poisoning
> a host physical address results in a very particular guest page
> getting poisoned?
>
> Is it not enough to just choose a random anonymous memory area within
> the qemu process, generate an MCE at that location, and see whether
> qemu gets a SIGBUS?  If it doesn't, validate that an MCE has been
> received in the guest?

/proc/<pid>/pagemap may help, though that's racy too.  If you pick the
largest vma (or use -mem-path) you're pretty much guaranteed to hit the
guest memory area.
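For illustration, here's a rough, untested sketch of the pagemap
lookup being discussed (entry layout per Documentation/vm/pagemap.txt:
bit 63 = page present, bits 0-54 = page frame number); the pid and
address would be whatever you dug out of /proc/<pid>/maps:

/* pagemap-lookup.c: translate a virtual address in another process to
 * a host physical frame via /proc/<pid>/pagemap.  Inherently racy:
 * the mapping can change the moment after we read it. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <pid> <hex-vaddr>\n", argv[0]);
        return 1;
    }

    unsigned long vaddr = strtoul(argv[2], NULL, 16);
    long psize = sysconf(_SC_PAGESIZE);
    char path[64];
    uint64_t entry;

    snprintf(path, sizeof(path), "/proc/%s/pagemap", argv[1]);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* one 64-bit entry per virtual page */
    if (pread(fd, &entry, sizeof(entry),
              (off_t)(vaddr / psize) * sizeof(entry)) != sizeof(entry)) {
        perror("pread");
        return 1;
    }
    close(fd);

    if (entry & (1ULL << 63)) {
        uint64_t pfn = entry & ((1ULL << 55) - 1);
        printf("0x%lx -> pfn 0x%llx\n", vaddr, (unsigned long long)pfn);
    } else {
        printf("0x%lx not present\n", vaddr);
    }
    return 0;
}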
> But FWIW, I think a set of per-VM directories in sysfs could be very
> useful for this sort of debugging.
>
> Maybe we should consider having the equivalent of a QMP-for-debugging
> session.  This would be a special QMP session for which we provide no
> compatibility or even sanity guarantees, one that is there
> specifically for debugging.  I would expect it to be disabled in any
> production build (perhaps even by default in the general build).

We have 'info cpus', which shows the vcpu->thread mappings, allowing
management to pin vcpus.  Why not have an 'info memory' that shows
guest numa nodes and host virtual addresses?  The migrate_pages()
syscall takes a pid, so it can be used by qemu's controller to
load-balance a numa machine, and it can also be used by the poisoner
to do its work.
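To make that last point concrete, a minimal, untested sketch of
driving migrate_pages(2) against a qemu pid (node numbers are
illustrative; moving another process's pages needs CAP_SYS_NICE, and
there is no glibc wrapper, so it goes through syscall()):

/* migrate-sketch.c: move a process's pages from one NUMA node to
 * another.  A load balancer would pick nodes from topology; a
 * poisoning test could use the same call to shuffle guest memory. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(int argc, char **argv)
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <pid> <from-node> <to-node>\n",
                argv[0]);
        return 1;
    }

    pid_t pid = atoi(argv[1]);
    unsigned long from = 1UL << atoi(argv[2]);  /* one bit per node */
    unsigned long to   = 1UL << atoi(argv[3]);

    /* maxnode = number of valid bits in the node masks */
    long left = syscall(SYS_migrate_pages, pid,
                        8 * sizeof(unsigned long), &from, &to);
    if (left < 0) {
        perror("migrate_pages");
        return 1;
    }

    /* on success, the return value is the number of pages that
     * could not be moved */
    printf("done, %ld pages not moved\n", left);
    return 0;
}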