From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4D34107B.6020703@siemens.com>
Date: Mon, 17 Jan 2011 10:48:43 +0100
From: Jan Kiszka
MIME-Version: 1.0
References: <1294907685.4596.44.camel@yhuang-dev> <4D2EBF69.9070208@siemens.com> <1294969881.4596.80.camel@yhuang-dev> <4D300B9A.9010508@siemens.com> <1295230088.10748.79.camel@yhuang-dev>
In-Reply-To: <1295230088.10748.79.camel@yhuang-dev>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH uq/master 2/2] MCE, unpoison memory address across reboot
List-Id: qemu-devel.nongnu.org
To: Huang Ying
Cc: "kvm@vger.kernel.org", Dean Nelson, Marcelo Tosatti, "qemu-devel@nongnu.org", Anthony Liguori, Andi Kleen, Avi Kivity, "Wu, Fengguang"

On 2011-01-17 03:08, Huang Ying wrote:
>>>> As indicated, I'm sitting on lots of fixes and refactorings of the MCE
>>>> user space code. How do you test your patches? Any suggestions how to do
>>>> this efficiently would be warmly welcome.
>>>
>>> We use a self-made test script to test. Repository is at:
>>>
>>> git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git
>>>
>>> The kvm test script is in kvm sub-directory.
>>>
>>> The qemu patch attached is need by the test script.
>>>
>>
>> Yeah, I already found this yesterday and started reading. I was just
>> searching for p2v in qemu, but now it's clear where it comes from. Will
>> have a look (if you want to preview my changes:
>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream).
>>
>> I was almost about to use MADV_HWPOISON instead of the injection module.
>> Is there a way to recover the fake corruption afterward? I think that
>> would allow to move some of the test logic into qemu and avoid p2v which
>> - IIRC - was disliked upstream.
>
> I don't know how to fully recover from MADV_HWPOISON. You can recover
> the virtual address space via qemu_ram_remap() introduced in 1/2 of this
> patchset. But you will lose one or several physical pages for each
> testing. I think that may be not a big issue for a testing machine.
>
> Ccing Andi and Fengguang, they know more than me about MADV_HWPOISON.

"page-types -b hwpoison -x" does the trick of unpoisoning for me. It can
be found at linux/Documentation/vm/page-types.c. So it's quite easy to
set up and clean up a test case based on MADV_HWPOISON IMO. Not sure,
though, if that can simulate all of what you currently do via mce-inject.

>
>> Also, is there a way to simulate corrected errors (BUS_MCEERR_AO)?
>
> BUS_MCEERR_AO is recoverable uncorrected error instead of corrected
> error.
>
> The test script is for BUS_MCEERR_AO and BUS_MCEERR_AR. To see the
> effect of pure BUS_MCEERR_AO, just remove the memory accessing loop
> (memset) in tools/simple_process/simple_process.c.

Yeah, that question was based on lacking knowledge about the different
error types. Meanwhile, I was able to trigger BUS_MCEERR_AO via
MADV_HWPOISON - and also BUS_MCEERR_AR by accessing that page. However,
I did not succeed with using mce-inject so far, thus with mce-test. But
I need to check this again.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux