From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55340) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UrmRo-0002GI-8i for qemu-devel@nongnu.org; Wed, 26 Jun 2013 05:55:33 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UrmRk-0000jS-5R for qemu-devel@nongnu.org; Wed, 26 Jun 2013 05:55:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:55373) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UrmRj-0000iX-TU for qemu-devel@nongnu.org; Wed, 26 Jun 2013 05:55:28 -0400 Message-ID: <51CABA88.7040809@redhat.com> Date: Wed, 26 Jun 2013 11:55:20 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <1372181916-4676-1-git-send-email-pingfank@linux.vnet.ibm.com> <51C9379C.80605@redhat.com> <51C94C90.8000104@redhat.com> <51CA8B63.6000501@redhat.com> <51CAA8A2.6060609@redhat.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 0/3] make bh safe with hot-unplug List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: liu ping fan Cc: qemu-devel@nongnu.org, Stefan Hajnoczi Il 26/06/2013 11:44, liu ping fan ha scritto: > On Wed, Jun 26, 2013 at 4:38 PM, Paolo Bonzini wrote: >> Il 26/06/2013 10:20, liu ping fan ha scritto: >>>>> On the other hand, pushing _delete() out of finalization path is not >>>>> easy, since we do not what time the DeviceState has done with its bh. >>>> >>>> See above: >>>> >>>> - if the BH will run in the iothread, the BH is definitely not running >>>> (because the BH runs under the BQL, and the reclaimer owns it) >>> >>> It works for the case in which the DeviceState's reclaimer calls >>> _bh_delete(). But in another case(also exist in current code), where >>> we call _bh_delete() in callback, the bh will be scheduled, and oops! >> >> But you may know that the BH is not scheduled after removal, too. >> > No, the removal can run in parallel with the other mmio-dispatch which > can resort to schedule a bh. But then behavior is more or less undefined. Of course the device must ensure that something sane happens, but that's the responsibility of the device, not of the generic layer. > I.e, the removal can not call > _bh_delete(), so it do not know whether bh will be scheduled or not. It can call qemu_bh_delete(), then all concurrent qemu_bh_schedule() will be no-ops. >> If you look at the memory work, for example, the owner patches happen to >> be useful for BQL-less dispatch too, but they are solving existing >> hot-unplug bugs. >> > Oh, I will read it again, since I had thought the owner is only used > for the purpose of refcnt. It is. But it goes beyond BQL-less dispatch. Anything that gives away the BQL between a map and unmap (including asynchronous I/O) needs an owner. We have ignored that until now because we do not have memory unplug. >>>> What we need is separation of removal and reclamation. Without that, >>>> any effort to make things unplug-safe is going to be way way more >>>> complicated than necessary. >>>> >>> Agree, but when I tried for bh, I found the separation of removal and >>> reclamation are not easy. We can not _bh_delete() in >>> acpi_piix_eject_slot(), because mmio-dispatch can schedule a bh at the >>> same time. >> >> acpi_piix_eject_slot() is removal, not reclamation. The plan there is >> that qdev_free calls the exit callback immediately (which can do >> qemu_bh_delete), and schedules an unref after the next RCU grace period. >> At the next RCU grace period the BH will not be running, thus it is >> safe to finalize the object. >> > I have two question: > 1st, who trigger qdev_free? //Not figuring out the total design, but > I feel it is quite different from kernel's design, where refcnt->0, > then exit is called. qdev_free is triggered by the guest, but free is a misnomer. It is really "make it inaccessible from the guest and management" (the kernel equivalent would be removal of /dev and /sys entries, for example). The actual "free" will happen later. > 2nd, _delete_bh() means even there is any not-severed request, the > request will be canceled. Is it legal, and will not lose data(not > sure, since I do not know who will call qdev_free)? That's something that should be ensured by the device. For example it is not a problem to cancel virtio-net's tx_bh. If it is a problem, the device must do something else. It could even be a bdrv_drain_all() in the worst case. Paolo