From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:54829) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TEHNQ-0004EY-BM for qemu-devel@nongnu.org; Wed, 19 Sep 2012 06:19:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TEHNL-00034A-JZ for qemu-devel@nongnu.org; Wed, 19 Sep 2012 06:19:28 -0400 Received: from mx1.redhat.com ([209.132.183.28]:26213) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TEHNL-00033L-9y for qemu-devel@nongnu.org; Wed, 19 Sep 2012 06:19:23 -0400 Message-ID: <50599C25.1020904@redhat.com> Date: Wed, 19 Sep 2012 12:19:17 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <50597D1F.3070607@redhat.com> <50598B58.4010704@redhat.com> <50598D0D.2090608@redhat.com> <50598EB2.5040101@redhat.com> <505995B7.4010709@redhat.com> <5059993C.1070506@redhat.com> In-Reply-To: <5059993C.1070506@redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [big lock] Discussion about the convention of device's DMA each other after breaking down biglock List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Avi Kivity Cc: Jan Kiszka , Marcelo Tosatti , liu ping fan , Anthony Liguori , qemu-devel@nongnu.org Il 19/09/2012 12:06, Avi Kivity ha scritto: >> > (The hunch is that ts could be deleted exactly at the moment the >> > callback is unlocked. This can be solved with ref/unref on the opaque >> > value, as you mention below). > Are you saying that this works as is or not? It does seem broken wrt > deletion; after qemu_del_timer() completes the caller expects the > callback not to be called. Ouch, then I think the only solution is to remove this invariant if you add fine-grained locking and re-check the enable conditions in the timer callback. If you allow calling del_timer to be called with the device lock held, you cannot fixing without deadlocking. If you don't, the caller of del_timer cannot set any expectation because the timer could be run in the window between dropping the device lock and calling del_timer Paolo > This isn't trivial to guarantee, we need something like > > dispatch_timer(): > pending += 1 > timer.ref() > drop lock > timer.cb() > take lock > timer.unref() > pending -= 1 > notify > > del_timer(): > take lock > timer.unlink() > while pending: > wait for notification > drop lock > > but, if del_timer is called with the device lock held, we deadlock. ugh. Wait for notification should be a cond_wait that drops the lock. Alternatively: dispatch_timer(): drop lock if timer has expired: timer.cb() take lock notify del_timer(): take lock timer.unlink() drop lock