From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50028) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YStC5-0000zC-9A for qemu-devel@nongnu.org; Tue, 03 Mar 2015 15:13:30 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YStC2-00083I-1t for qemu-devel@nongnu.org; Tue, 03 Mar 2015 15:13:29 -0500 From: Max Reitz Date: Tue, 3 Mar 2015 15:12:58 -0500 Message-Id: <1425413591-31413-1-git-send-email-mreitz@redhat.com> Subject: [Qemu-devel] [PATCH v5 00/13] block: Rework bdrv_close_all() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: qemu-block@nongnu.org Cc: Kevin Wolf , Fam Zheng , qemu-devel@nongnu.org, Markus Armbruster , Stefan Hajnoczi , Paolo Bonzini , Max Reitz Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend, which can lead to data corruption (see the iotest added in the final patch of this series) and is most certainly very ugly. This series reworks bdrv_close_all() to instead eject the BDS trees from all BlockBackends and then close the monitor-owned BDS trees, which are the only BDSs without a BB. In effect, all BDSs are closed just by getting closed automatically due to their reference count becoming 0. The benefit over the approach taken in v1 and v2 is that in device models we often cannot simply drop the reference to a BB because there may be some user which we forgot about. By ejecting the BDS trees from the BB, the BB itself becomes unusable, but in a clean way (it will return errors when accessed, but nothing will crash). Also, it is much simpler (no reference tracking necessary). The only disadvantage (I can see) is that the BBs are leaked; but this does not matter because the qemu process is about to exit anyway. This series depends on v2 of my series "blockdev: BlockBackend and media" (or any later version), and on v2 of my patch "virtio-scsi: Allocate op blocker reason before blocking". (actually, I rebased it on my local v3 of the "BB and media" series, but I did not send that one out yet, so I'm writing "v2" for the time being) v5: - Patch 5: Because of above mentioned virtio-scsi patch, the helper functions for setting up and destroying op blockers on a virtio-scsi device are no longer necessary because s->blocker is allocated all the time. - Patch 9: See justification below. [Kevin] - Patch 10 (was patch 9 in v4): Dropped the modification of bdrv_close_all() (moved to patch 12) - Patch 11 (split off patch 10 from v4): Moved the modification of bdrv_close_all() to patch 12 (just like for patch 10) - Patch 12 (Heavy modification and extension of a part of patch 10 from v4): If we don't force-bdrv_close() all BDSs anymore, there may be block jobs running; so we have to cancel all block jobs manually. However, in order to do that effectively and in a simple way, I want the list of BDSs to be completely empty afterwards. For that to happen, however, we need to eject all BDSs from the BBs and drop the monitor references to bare BDSs first. Therefore, all these changes have to be in a single patch (because they depend on each other), and that patch is this patch. [Kevin] Justification for patch 9: Adding yet another list of BlockDriverStates is ugly, I know. But we need something to track all the block jobs, and these are the alternatives: A. Have a list of all block jobs. Feasible, but this would involve a bit more code changes. B. Only consider the BDSs block jobs may run on; these are the root BDSs (bdrv_states) and the named BDSs (graph_bdrv_states). Is a bit uglier in that we need to iterate through both lists, but the benefit of this approach is that we don't have to add yet another BDS list. Also, this approach is a bit faster because we don't have to iterate through all the BDS left open, but only through the ones which might have block jobs running on them. However, the way patch 12 is implemented, bdrv_close_all() (which isn't performance-critical anyway...) only iterates through the list of all BDSs once the references from the BBs and the monitor are gone, so the only BDSs left are the ones on which block jobs are running, and the ones which are referenced by those (recursively). Therefore, we will not have a performance problem here. C. Have a list of all BDSs. This is mostly ugly because it means having yet another list of BDSs, but other than that, it's okay. So all these alternatives seem pretty equal, with A maybe being the most efficient one. Why did I go for C? As Kevin said, we probably want to have a way to assert that all BDSs are closed at the end of bdrv_close_all(). Besides adding a counter for the number of BDSs (which would have been as simple as it is ugly), this is the only way (I see) to do it. So we get that for free. But after this series, there are four lists for BDSs: bdrv_states, graph_bdrv_states, all_bdrv_states, and monitor_bdrv_states. Urgh. But the follow-up series to this series removes bdrv_states because they are covered by the list of BlockBackends. Maybe I could have repurposed that list to be what all_bdrv_states is right now, but I would find that confusing. So we're down to three. And finally, there have been plans for quite a while to give node-names to all BDSs; this would make graph_bdrv_states and all_bdrv_states just the same. So with that, we'd be down to two. So, because using option C allows us to make sure there are no BDSs left open at the end of bdrv_close_all(), because it's just as good as solution A and B from a technical perspective, and because it is not that ugly after all, I chose to go for that. git-backport-diff against v4: Key: [----] : patches are identical [####] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/13:[----] [--] 'iotests: Move _filter_nbd into common.filter' 002/13:[----] [--] 'iotests: Make redirecting qemu's stderr optional' 003/13:[----] [--] 'iotests: Add test for eject under NBD server' 004/13:[----] [--] 'quorum: Fix close path' 005/13:[0032] [FC] 'block: Move BDS close notifiers into BB' 006/13:[----] [-C] 'block: Use blk_remove_bs() in blk_delete()' 007/13:[----] [-C] 'blockdev: Use blk_remove_bs() in do_drive_del()' 008/13:[----] [--] 'block: Make bdrv_close() static' 009/13:[down] 'block: Add list of all BlockDriverStates' 010/13:[0003] [FC] 'blockdev: Keep track of monitor-owned BDS' 011/13:[down] 'block: Add blk_remove_all_bs()' 012/13:[down] 'block: Rewrite bdrv_close_all()' 013/13:[----] [-C] 'iotests: Add test for multiple BB on BDS tree' *** BLURB HERE *** Max Reitz (13): iotests: Move _filter_nbd into common.filter iotests: Make redirecting qemu's stderr optional iotests: Add test for eject under NBD server quorum: Fix close path block: Move BDS close notifiers into BB block: Use blk_remove_bs() in blk_delete() blockdev: Use blk_remove_bs() in do_drive_del() block: Make bdrv_close() static block: Add list of all BlockDriverStates blockdev: Keep track of monitor-owned BDS block: Add blk_remove_all_bs() block: Rewrite bdrv_close_all() iotests: Add test for multiple BB on BDS tree block.c | 49 ++++++++++++------ block/block-backend.c | 41 ++++++++++++---- block/quorum.c | 3 +- blockdev-nbd.c | 36 +------------- blockdev.c | 24 +++++++-- hw/block/dataplane/virtio-blk.c | 77 ++++++++++++++++++++++------- hw/scsi/virtio-scsi.c | 59 ++++++++++++++++++++++ include/block/block.h | 2 - include/block/block_int.h | 8 ++- include/hw/virtio/virtio-scsi.h | 10 ++++ include/sysemu/block-backend.h | 4 +- nbd.c | 13 +++++ stubs/Makefile.objs | 1 + stubs/blockdev-close-all-bdrv-states.c | 5 ++ tests/qemu-iotests/083 | 13 +---- tests/qemu-iotests/083.out | 10 ---- tests/qemu-iotests/096 | 90 ++++++++++++++++++++++++++++++++++ tests/qemu-iotests/096.out | 16 ++++++ tests/qemu-iotests/117 | 86 ++++++++++++++++++++++++++++++++ tests/qemu-iotests/117.out | 14 ++++++ tests/qemu-iotests/common.filter | 12 +++++ tests/qemu-iotests/common.qemu | 12 ++++- tests/qemu-iotests/group | 2 + 23 files changed, 474 insertions(+), 113 deletions(-) create mode 100644 stubs/blockdev-close-all-bdrv-states.c create mode 100755 tests/qemu-iotests/096 create mode 100644 tests/qemu-iotests/096.out create mode 100755 tests/qemu-iotests/117 create mode 100644 tests/qemu-iotests/117.out -- 2.1.0