From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <51E56A1A.50502@redhat.com>
Date: Tue, 16 Jul 2013 17:43:22 +0200
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] [RFC] aio/async: Add timed bottom-halves
To: Alex Bligh
Cc: Kevin Wolf, Anthony Liguori, qemu-devel@nongnu.org, Stefan Hajnoczi, rth@twiddle.net

On 16/07/2013 17:29, Alex Bligh wrote:
> Paolo,
>
> --On 16 July 2013 09:34:20 +0200 Paolo Bonzini wrote:
>
>>>> You did.  But aio_wait() ignores the timeout.  It is only used by the
>>>> main loop.
>>>
>>> OK, well that seems worth fixing in any case, as even without timed BHs
>>> that means no BH can be executed for an indeterminate time.  I'll have
>>> a look at that.
>>
>> No, BHs work because they do aio_notify().  Idle BHs can be skipped for
>> an indeterminate time, but that's fine because idle BHs are a hack that
>> we should not need at all.
>
> OK, so a bit more code reading later, I think I now understand.
>
> 1. Non-idle BHs call aio_notify() at schedule time, which will cause
>    any poll to exit immediately because at least one FD will be ready.
>
> 2. Idle BHs do not do aio_notify(), because we don't care whether
>    they get stuck in aio_poll(), and they are a hack [actually I think
>    we could do better here].
>
> 3. aio_poll() calls aio_bh_poll().  If this returns true, this indicates
>    at least one non-idle BH exists, which causes aio_poll() not to
>    block.

No, this indicates that at least one scheduled non-idle BH exist*ed*,
which causes aio_poll() not to block (because some progress has been
made).  There could be non-idle BHs scheduled during aio_poll() but not
visited by aio_bh_poll().  These rely on aio_notify() (it could be an
idle BH that scheduled this non-idle BH, so aio_bh_poll() might return
0).

>    Question 1: it then calls aio_dispatch - if this itself
>    generates non-idle BHs, this would seem not to clear the blocking
>    flag.  Does this rely on aio_notify()?

Yes.

>    Question 2: if we're already telling aio_poll() not to block
>    by the presence of non-idle BHs as detected in aio_bh_poll(),
>    why do we need to use aio_notify() too?  I.e. why do we have both
>    the blocking= logic AND the aio_notify() logic?

See above (newly scheduled BHs are always handled with aio_notify();
the blocking=false logic is for previously scheduled BHs).

> 4. aio_poll() then calls g_poll() (POSIX) or WaitForMultipleObjects()
>    (Windows).  However, the timeout is either 0 or infinite.
>    Both functions take a millisecond (yuck) timeout, but that
>    is not used.

I agree with the yuck. :)  But Linux has the nanosecond-resolution
ppoll(), too.

> So, the first thing I don't understand is why aio_poll() needs the
> return value of aio_bh_poll() at all.
> Firstly, after sampling it,
> it then causes aio_dispatch, and that can presumably set its own
> bottom half callbacks; if this happens 'int blocking' won't be
> cleared, and it will still enter g_poll with an infinite timeout.
> Secondly, there seems to be an entirely separate mechanism
> (aio_notify) in any case.  If a non-idle BH has been scheduled,
> this will cause g_poll to exit immediately as a read will be
> ready.  I believe this is cleared by the BH being used.

I hope the above answers this.

> The second thing I don't understand is why we aren't using
> the timeout on g_poll / WaitForMultipleObjects.

Because so far it wasn't needed (insert rant about idle BHs being a
hack).  This is a good occasion to start using it.  But I wouldn't
introduce a new one-off concept (almost as much of a hack as idle BHs);
I would rather reuse as much code as possible from QEMUTimer/QEMUClock.
I must admit I don't have a clear idea of what the API would look like.

> It would
> seem to be reasonably easy to make aio_poll call aio_ctx_prepare
> or something that does the same calculation.  This would fix
> idle BHs to be more reliable (we know it's safe to call them
> within aio_poll anyway; it's just a question of whether
> we turn an infinite wait into a 10ms wait).

Idle BHs could be changed to timers as well, and then they would
disappear.

Paolo

> Perhaps these two are related.
>
> I /think/ fixing the second (and removing the aio_notify
> from qemu_bh_schedule_at) is sufficient, provided it checks
> for scheduled BHs immediately prior to the poll.  This assumes
> other threads cannot schedule BHs.  This would seem to be less
> intrusive than a TimedEventNotifier approach which (as far as I
> can see) requires another thread.