From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40643)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <laine@redhat.com>) id 1YBCIr-0004vE-WA
	for qemu-devel@nongnu.org; Tue, 13 Jan 2015 19:59:23 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <laine@redhat.com>) id 1YBCIm-0000os-0g
	for qemu-devel@nongnu.org; Tue, 13 Jan 2015 19:59:21 -0500
Received: from mx1.redhat.com ([209.132.183.28]:33021)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <laine@redhat.com>) id 1YBCIl-0000oE-Os
	for qemu-devel@nongnu.org; Tue, 13 Jan 2015 19:59:15 -0500
Message-ID: <54B5BF5F.9000805@redhat.com>
Date: Tue, 13 Jan 2015 19:59:11 -0500
From: Laine Stump <laine@redhat.com>
MIME-Version: 1.0
References: <54AE87C1.2060907@wiesinger.com>	<54AEBD43.2060705@redhat.com>	<54AEC877.9080600@wiesinger.com>	<54AECAF3.3060909@redhat.com>	<54AF047D.8010009@wiesinger.com>	<54B3B2F5.1090405@wiesinger.com>	<54B57C51.7090002@wiesinger.com>	<54B584AB.4090303@redhat.com>	<54B58AC0.5080805@wiesinger.com>
	<54B58B18.9060205@redhat.com> <54B595C7.3080101@wiesinger.com>
In-Reply-To: <54B595C7.3080101@wiesinger.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Fedora FC21 - Bug: 100% CPU and hangs in
 gettimeofday(&tp, NULL); forever
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Gerhard Wiesinger <lists@wiesinger.com>, Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org, Cole Robinson <crobinso@redhat.com>, virt@lists.fedoraproject.org

On 01/13/2015 05:01 PM, Gerhard Wiesinger wrote:
> On 13.01.2015 22:16, Paolo Bonzini wrote:
>>
>> On 13/01/2015 22:14, Gerhard Wiesinger wrote:
>>> I also had a look at the kernel code again:
>>> http://lxr.free-electrons.com/source/kernel/time/timekeeping.c?v=3.17#L493
>>>
>>> 499         do {
>>> 500                 seq = read_seqcount_begin(&tk_core.seq);
>>> 501
>>> 502                 ts->tv_sec = tk->xtime_sec;
>>> 503                 nsecs = timekeeping_get_ns(&tk->tkr);
>>> 504
>>> 505         } while (read_seqcount_retry(&tk_core.seq, seq));
>>>
>>> So it looks like the seqcount keeps changing, so this loops forever
>>> (as far as I could dig down, this is the only loop here).
>>>
>>> Might there be something wrong with the memory barriers in recent
>>> qemu-kvm releases?
>> No, that's not possible.  Unless you pause/resume or migrate the VM, all
>> of the handling of kvmclock is entirely in the kernel.
>
> Any other possible explanation of the problem?
>
> I had a look at the diff (of what I guess is the right file in the qemu tree):
> # no critical changes here, IMHO
> git diff -u v1.6.2..v2.1.2 ./hw/i386/kvm/clock.c
>
> Trying to reproduce with a loop:
> #include <sys/time.h>
> #include <stdio.h>
>
> int main(int argc, char* argv[])
> {
>   struct timeval tv;
>   int i = 0;
>   for (;;)
>   {
>     gettimeofday(&tv, 0);
>     ++i;
>     if (i >= 10000000)
>     {
>       i = 0;
>       printf("%i\n", (int)tv.tv_sec);
>     }
>   }
>   return 0;
> }
>
> While I was writing "First tests seem to run well, so no quick win ....",
> I reproduced it: a stall after 318s :-)
> (gdb) bt
> #0  0x00007fff6d9fefff in gettimeofday ()
> #1  0x00000000004005ad in main (argc=1, argv=0x7fff6d9b28b8) at gettimeofdayloop.c:10
>
> So we at least have a test case that reproduces quickly.
>
> So we are digging into my second finding of a major bug in
> qemu-kvm :-)
>
> Can someone try, too?
>
> Ciao,
> Gerhard
>
>
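
One caveat with a test loop like this: when it stalls, nothing is
printed, so you only notice by watching the terminal. Below is a minimal
sketch of a variant that arms alarm() around each batch of calls and
bails out when a batch overruns. probe_gettimeofday(), on_alarm() and
the batch/timeout parameters are names made up for illustration, not
part of your program, and I am assuming the stall is a userspace spin
that SIGALRM can interrupt, which is what your backtrace suggests but
does not prove.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Jump target used to escape from a gettimeofday() call that never returns. */
static sigjmp_buf stall_jmp;

/* Hypothetical watchdog handler: fires when a batch exceeds its time budget. */
static void on_alarm(int sig)
{
    (void)sig;
    siglongjmp(stall_jmp, 1);
}

/*
 * Call gettimeofday() in batches, re-arming a watchdog before each batch.
 * Returns 0 if every batch finished in time, 1 if the watchdog fired.
 */
int probe_gettimeofday(long batches, long calls_per_batch, unsigned timeout_sec)
{
    struct sigaction sa, old;
    struct timeval tv;
    volatile long done = 0;   /* volatile: read again after siglongjmp() */

    assert(batches > 0 && calls_per_batch > 0 && timeout_sec > 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old);

    if (sigsetjmp(stall_jmp, 1) != 0) {
        /* We get here only via on_alarm(): a batch did not finish in time. */
        fprintf(stderr, "stall detected after %ld complete batches\n", done);
        sigaction(SIGALRM, &old, NULL);
        return 1;
    }

    for (long b = 0; b < batches; b++) {
        alarm(timeout_sec);               /* re-arm the watchdog */
        for (long i = 0; i < calls_per_batch; i++)
            gettimeofday(&tv, NULL);
        done = b + 1;
    }

    alarm(0);                             /* disarm the watchdog */
    sigaction(SIGALRM, &old, NULL);
    return 0;
}
```

Calling probe_gettimeofday(1000000, 10000000, 60) from main() would
roughly match your loop, with a 60 second budget per batch; the numbers
are arbitrary and would need tuning for the machine under test.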

Take a look at the following kernel bug. It specifically deals with a
hang in gettimeofday() in a KVM guest:

https://bugzilla.redhat.com/show_bug.cgi?id=1178975

There is a link to a patched kernel you can try; it fixed my problems (I
was repeatedly getting hangs in python-urlgrabber during yum updates on
F21).
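
For anyone reading along in the archive: the loop quoted from
timekeeping.c above is a seqcount read. The reader samples a sequence
counter, copies the data, and retries if the counter changed in
between. Here is a rough userspace sketch of that pattern using C11
atomics. All demo_* names are invented for this example; it is not the
kernel API, it assumes a single writer, and it says nothing about the
actual cause of this bug.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's timekeeper data; not the kernel API. */
struct demo_time {
    atomic_uint seq;          /* even: stable, odd: update in progress */
    _Atomic int64_t sec;
    _Atomic int64_t nsec;
};

/* Mark the start of an update: the sequence becomes odd. Single writer only. */
void demo_write_begin(struct demo_time *t)
{
    unsigned seq = atomic_load_explicit(&t->seq, memory_order_relaxed);
    atomic_store_explicit(&t->seq, seq + 1u, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
}

/* Mark the end of an update: the sequence becomes even again. */
void demo_write_end(struct demo_time *t)
{
    unsigned seq = atomic_load_explicit(&t->seq, memory_order_relaxed);
    assert(seq & 1u);         /* an update must be in progress */
    atomic_store_explicit(&t->seq, seq + 1u, memory_order_release);
}

/* Publish a new time value under the sequence counter. */
void demo_write_time(struct demo_time *t, int64_t sec, int64_t nsec)
{
    demo_write_begin(t);
    atomic_store_explicit(&t->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&t->nsec, nsec, memory_order_relaxed);
    demo_write_end(t);
}

/*
 * Read a consistent (sec, nsec) pair. Unlike the quoted kernel loop, this
 * gives up after max_tries attempts so that a stuck writer is visible.
 * Returns 0 on success, -1 if no stable snapshot was observed.
 */
int demo_read_time(struct demo_time *t, int max_tries,
                   int64_t *sec, int64_t *nsec)
{
    for (int i = 0; i < max_tries; i++) {
        unsigned start = atomic_load_explicit(&t->seq, memory_order_acquire);
        if (start & 1u)
            continue;         /* update in progress, try again */

        int64_t s = atomic_load_explicit(&t->sec, memory_order_relaxed);
        int64_t n = atomic_load_explicit(&t->nsec, memory_order_relaxed);

        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&t->seq, memory_order_relaxed) == start) {
            *sec = s;
            *nsec = n;
            return 0;         /* counter unchanged: snapshot is consistent */
        }
    }
    return -1;
}
```

If a writer never finishes (demo_write_begin() without a matching
demo_write_end()), demo_read_time() returns -1 after max_tries
attempts. The loop as quoted has no such bound, so in that situation a
reader would keep spinning.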