From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1L0F3X-0001Xx-4n for qemu-devel@nongnu.org; Wed, 12 Nov 2008 07:42:47 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1L0F3V-0001Xl-KG for qemu-devel@nongnu.org; Wed, 12 Nov 2008 07:42:45 -0500
Received: from [199.232.76.173] (port=39154 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1L0F3V-0001Xi-Hj for qemu-devel@nongnu.org; Wed, 12 Nov 2008 07:42:45 -0500
Received: from mx2.redhat.com ([66.187.237.31]:38781) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1L0F3V-0003ho-2A for qemu-devel@nongnu.org; Wed, 12 Nov 2008 07:42:45 -0500
Message-ID: <491ACE61.6000009@redhat.com>
Date: Wed, 12 Nov 2008 14:38:57 +0200
From: Dor Laor
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RESEND][PATCH 0/3] Fix guest time drift under heavy load.
References: <20081106145142.GA29861@redhat.com> <20081108083620.GB19381@redhat.com> <491711AB.7000806@codemonkey.ws> <20081110143750.GA20617@redhat.com> <49185247.70905@codemonkey.ws> <20081110152925.GB20617@redhat.com> <49185744.6010500@codemonkey.ws> <20081111144324.GE20617@redhat.com> <4919E86D.8070705@codemonkey.ws> <20081112114229.GG20617@redhat.com> <5d6222a80811120354i5deff074g8892be066b88c8cc@mail.gmail.com>
In-Reply-To: <5d6222a80811120354i5deff074g8892be066b88c8cc@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------060506080103070607010601"
Reply-To: dlaor@redhat.com, qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: qemu-devel@nongnu.org
Cc: Paul Brook

This is a multi-part message in MIME format.
--------------060506080103070607010601
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Glauber Costa wrote:
> On Wed, Nov 12, 2008 at 9:42 AM, Gleb Natapov wrote:
>
>> On Tue, Nov 11, 2008 at 02:17:49PM -0600, Anthony Liguori wrote:
>>
>>> Gleb Natapov wrote:
>>>
>>>> On Mon, Nov 10, 2008 at 09:46:12AM -0600, Anthony Liguori wrote:
>>>> -usbdevice tablet has nothing to do with it. Qemu misses interrupt
>>>> even without this option and with SDL screen it misses them in
>>>> bunches when SDL redraws a screen. In case of vnc qemu misses
>>>> interrupt because of fsync() call in raw_flush(), or so my
>>>> instrumentation shows.
>>>>
>>> Can you give this patch a spin?
>>>
>> Doesn't compile for me. fd_pool_inuse and fd_inuse are used but not
>> defined.
>>
>>> This introduces a bdrv_aio_flush() which will wait for all existing AIO
>>> operations to complete before indicating completion. It also fixes up
>>> IDE. Fixing up SCSI will be a little more tricky but not much. Since
>>> we now use O_DSYNC, it's unnecessary to do an fsync (or an fdatasync).
>>>
>>> Assuming you're using IDE, this should eliminate any delays from fsync.
>>>
>> I am using IDE.
>>
>>> SDL delays are unavoidable because it's going to come down to SDL doing
>>> synchronous updates to the X server. The proper long term solution here
>>> would be to put SDL in its own thread but I'm not too worried about
>>>
>> And probably time-keeping deserves its own thread. And CPU execution
>> too.
>>
>
> It might well be a stupid idea, (would have to benchmark it), but the
> other day it occurred to me that we could keep timekeeping in a
> separate _process_, with a shm area, doing timekeeping for all
> running guests.
>

The problem is that the right vcpu has to be preempted before we can
inject the timer irq into it, and that introduces latency (signal/IPI).
This is why the in-kernel PIT performs better and is more accurate.

What we can do is add a kernel interface that schedules a timer on a
specific cpu in order to preempt the vcpu. The problem is that such an
interface is a bit awkward, and Avi feels it is becoming too complex.

So either we fix it using Gleb's set_irq ack method, or we just fix the
RTC using Andrzej's suggestion for the RTC irq status bit.
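
To make the ack-based option a bit more concrete, here is a minimal toy model in Python. It is only a sketch of the general idea of counting timer ticks that could not be delivered and re-injecting them when the guest acknowledges the previous one. It is not QEMU or KVM code and not the actual patch; `TickSource`, `fire`, `ack` and `run_stalled_guest` are hypothetical names made up for illustration.

```python
class TickSource:
    """Toy model of a periodic timer interrupt line (hypothetical, not QEMU code)."""

    def __init__(self):
        self.irq_pending = False  # raised but not yet acknowledged by the guest
        self.coalesced = 0        # ticks that fired while one was still pending
        self.delivered = 0        # ticks the guest actually handled

    def fire(self):
        """Host timer expired: raise the irq, or record the tick as coalesced."""
        if self.irq_pending:
            self.coalesced += 1   # without compensation this tick would be lost
        else:
            self.irq_pending = True

    def ack(self):
        """Guest acknowledged the irq: re-inject at once if ticks are still owed."""
        if not self.irq_pending:
            return
        self.delivered += 1
        if self.coalesced > 0:
            self.coalesced -= 1   # the line stays raised for the owed tick
        else:
            self.irq_pending = False


def run_stalled_guest(ticks):
    """Fire `ticks` timer events while the guest cannot run, then let it catch up."""
    src = TickSource()
    for _ in range(ticks):
        src.fire()
    while src.irq_pending:
        src.ack()
    return src
```

In this model a guest that was stalled for several ticks still ends up handling every one of them once it runs again, which is the property that keeps guest time from drifting. A real implementation would also have to decide how fast to replay the backlog.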
--------------060506080103070607010601--