From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: Data corruption in guest using KVM
Date: Sun, 22 Jul 2007 10:52:03 +0300
Message-ID: <46A30CA3.3090100@qumranet.com>
References: <20070721172248.GA1555@hall.aurel32.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
To: Aurelien Jarno
In-Reply-To: <20070721172248.GA1555-OqXK5JiLQY5aJl8KAwiEcA@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

Aurelien Jarno wrote:
> Hi all,
>
> For a long time I am seeing data corruption in guests when using KVM,
> but I am convinced only since today that the problem comes from KVM.
>
> The symptoms are a few bytes that are mangled to 0x00 in a file that has
> been written. For now I have only seen 2 or 4 consecutive bytes mangled,
> but that may due to statistics given the limited samples.
>
> The problem appears very rarely. I am only seeing it when doing huge
> compilations (for example gcc or glibc), and not for every build. Note
> that I am only detecting build failures, so I can miss some corruptions.
>
> Note that I have observed the problem on GNU/Linux, GNU/kFreeBSD and
> plain FreeBSD, for both 32 and 64-bit guests. I always used 64-bit
> hosts, and I have seen the problem on both Core 2 and Athlon 64 CPU
> (always multi-core).
>
> I have never seen such corruptions using QEMU, so I would say the
> problem does not comes from the disk emulation, though it may be due to
> statistics. Note that I have made a lot of compilation in a MIPS QEMU
> guest (a few hundred of hours), without any problem. This platform uses
> the same IDE controller as the one in KVM.
>
> Does anybody have seen the same kind of problem? Without a way to
> reproduce the corruption, I think it will be very difficult to debug
> the problem.

Did you observe anything about the corruption? For example, are the
offsets at a page boundary? Can you provide a corrupted file and the
same, non-corrupted file as a reference?

For the 32-bit case, were the guests PAE, non-PAE, or both?

How would I go about reproducing this? Is a single ./configure; make
clean; make in a loop compiling gcc sufficient?

--
error compiling committee.c: too many arguments to function
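
To check whether the corrupted offsets fall on a page boundary, a corrupted file can be compared byte by byte with its reference copy. The following is only a minimal sketch: the function names diff_runs and describe are hypothetical, and the 4096-byte page size is an assumption that may not match every guest configuration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def diff_runs(good: bytes, bad: bytes):
    """Return (offset, length) for each run of consecutive differing bytes.

    Only the common prefix of the two inputs is compared.
    """
    runs = []
    start = None
    n = min(len(good), len(bad))
    for i in range(n):
        if good[i] != bad[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, n - start))  # run extends to the end of the data
    return runs


def describe(runs, bad: bytes, page_size: int = PAGE_SIZE):
    """Annotate each run with its offset inside the page and whether it is all zeros."""
    return [
        {
            "offset": off,
            "length": length,
            "page_offset": off % page_size,
            "zeroed": bad[off:off + length] == bytes(length),
        }
        for off, length in runs
    ]
```

Reading both files in binary mode and passing their contents to diff_runs, then to describe, gives one entry per mangled run. A page_offset of 0 means the run starts exactly on a page boundary under the assumed page size, and zeroed tells whether the run matches the reported 0x00 pattern.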