From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40643)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1YBCIr-0004vE-WA for qemu-devel@nongnu.org;
	Tue, 13 Jan 2015 19:59:23 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1YBCIm-0000os-0g for qemu-devel@nongnu.org;
	Tue, 13 Jan 2015 19:59:21 -0500
Received: from mx1.redhat.com ([209.132.183.28]:33021)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1YBCIl-0000oE-Os for qemu-devel@nongnu.org;
	Tue, 13 Jan 2015 19:59:15 -0500
Message-ID: <54B5BF5F.9000805@redhat.com>
Date: Tue, 13 Jan 2015 19:59:11 -0500
From: Laine Stump
MIME-Version: 1.0
References: <54AE87C1.2060907@wiesinger.com> <54AEBD43.2060705@redhat.com>
	<54AEC877.9080600@wiesinger.com> <54AECAF3.3060909@redhat.com>
	<54AF047D.8010009@wiesinger.com> <54B3B2F5.1090405@wiesinger.com>
	<54B57C51.7090002@wiesinger.com> <54B584AB.4090303@redhat.com>
	<54B58AC0.5080805@wiesinger.com> <54B58B18.9060205@redhat.com>
	<54B595C7.3080101@wiesinger.com>
In-Reply-To: <54B595C7.3080101@wiesinger.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Fedora FC21 - Bug: 100% CPU and hangs in
	gettimeofday(&tp, NULL); forever
To: Gerhard Wiesinger, Paolo Bonzini, qemu-devel@nongnu.org,
	Cole Robinson, virt@lists.fedoraproject.org

On 01/13/2015 05:01 PM, Gerhard Wiesinger wrote:
> On 13.01.2015 22:16, Paolo Bonzini wrote:
>>
>> On 13/01/2015 22:14, Gerhard Wiesinger wrote:
>>> I also had a look at the kernel code again:
>>> http://lxr.free-electrons.com/source/kernel/time/timekeeping.c?v=3.17#L493
>>>
>>> 499         do {
>>> 500                 seq = read_seqcount_begin(&tk_core.seq);
>>> 501
>>> 502                 ts->tv_sec = tk->xtime_sec;
>>> 503                 nsecs = timekeeping_get_ns(&tk->tkr);
>>> 504
>>> 505         } while (read_seqcount_retry(&tk_core.seq, seq));
>>>
>>> So it looks like the seqcount always changes and therefore loops
>>> forever here (as far as I dug into it, this is the only loop here).
>>>
>>> Might be something wrong with the memory barriers in recent qemu-kvm
>>> releases?
>> No, that's not possible.  Unless you pause/resume or migrate the VM, all
>> of the handling of kvmclock is entirely in the kernel.
>
> Any other possible explanation of the problem?
>
> Had a look at the diff (I guess the right file at least in qemu tree):
> # no critical changes IMHO here
> git diff -u v1.6.2..v2.1.2 ./hw/i386/kvm/clock.c
>
> Trying to reproduce with a loop:
> #include <sys/time.h>
> #include <stdio.h>
>
> int main(int argc, char* argv[])
> {
>   struct timeval tv;
>   int i = 0;
>   for (;;)
>   {
>     gettimeofday(&tv, 0);
>     ++i;
>     if (i >= 10000000)
>     {
>       i = 0;
>       printf("%i\n", (int)tv.tv_sec);
>     }
>   }
>   return 0;
> }
>
> As I wrote this: "First tests seem to run well, so no quick win ....",
> I could reproduce it with a stall in 318s :-)
> (gdb) bt
> #0  0x00007fff6d9fefff in gettimeofday ()
> #1  0x00000000004005ad in main (argc=1, argv=0x7fff6d9b28b8) at
> gettimeofdayloop.c:10
>
> So we have at least a testcase which is quick to reproduce.
>
> So we are digging down into my second finding about a major bug in
> qemu-kvm :-)
>
> Can someone try, too?
>
> Ciao,
> Gerhard
>

Take a look at the following kernel bug. It specifically deals with a
hang in gettimeofday() in a KVM guest:

  https://bugzilla.redhat.com/show_bug.cgi?id=1178975

There is a link to a patched kernel you can try; it fixed my problems (I
was repeatedly getting hangs in python-urlgrabber during yum updates on
F21).