From: "Denis V. Lunev" <den@openvz.org>
To: Kevin Wolf
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, vsementsov@virtuozzo.com,
    Stefan Hajnoczi, Fam Zheng, Max Reitz, Jeff Cody, Eric Blake,
    pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH 0/9] major rework of drive-mirror
Date: Wed, 15 Jun 2016 13:44:44 +0300
Message-ID: <5761319C.2010705@openvz.org>
In-Reply-To: <20160615102515.GF4566@noname.redhat.com>

On 06/15/2016 01:25 PM, Kevin Wolf wrote:
> On 15.06.2016 at 11:34, Denis V. Lunev wrote:
>> On 06/15/2016 12:06 PM, Kevin Wolf wrote:
>>> The second big thing is that I don't want to see new users of the
>>> notifiers in I/O functions. Let's see whether we can add a filter
>>> BlockDriver instead. Then we'd add an option to set the filter
>>> node-name in the mirror QMP command so that the management tool is
>>> aware of the node and can refer to it.
>> This will be much more difficult to implement, in my experience.
> I agree that it will be more difficult, though I'm not sure whether
> it's much more.
>
>> Can you share more details on why notifiers are bad?
> Basically, notifiers (and other code supporting other features in the
> common I/O path) just don't fit the block layer design very well. They
> are hacked on rather than designed in.
>
> This shows in different ways in different places, so I can't fully
> tell you the problems that using notifiers in the mirror code would
> cause (which is a problem in itself). But for example, the
> copy-on-read code that was added to the core block layer instead of a
> separate filter driver had trouble ignoring its own internal requests
> to the BDS, which was hacked around with a new request flag, making
> the core block layer even more complicated.
>
> An example that I can think of with respect to mirroring is taking a
> snapshot during the operation. Op blockers prevent this from happening
> at the moment, but it seems to be a very reasonable thing for a user
> to want. Let me use the filter node approach to visualise this:
>
> (raw-posix)    (qcow2)
> source_file <- source <- mirror <- virtio-blk
>                             |
> target_file <- target <----+
>
> There are two ways to create a snapshot: Either you put it logically
> between the mirror and virtio-blk, so that only source (now a backing
> file), but no new writes, will be mirrored. This is easy in both
> approaches, but maybe the less commonly wanted thing.
>
> The other option is putting the new overlay between source and mirror:
>
> (raw-posix)    (qcow2)
> source_file <- source <- snap <- mirror <- virtio-blk
>                                     |
> target_file <- target <------------+
>
> With the mirror intercept being its own BDS node, making this change
> is very easy and doesn't involve any code that is specific to
> mirroring.
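
To make the QMP side of this concrete: the mirror command could simply
grow an option that names the filter node. A rough sketch only -- the
"filter-node-name" argument and the node name "mirror-filter0" are made
up for this example; the proposal above only says that some such option
should exist:

  { "execute": "drive-mirror",
    "arguments": { "device": "drive0",
                   "target": "/tmp/target.qcow2",
                   "format": "qcow2",
                   "sync": "full",
                   "filter-node-name": "mirror-filter0" } }

Since the filter is then an ordinary named node, the management tool
can refer to "mirror-filter0" when it later rearranges the graph, e.g.
to insert the snap overlay between source and mirror as in the second
diagram.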
>
> If it doesn't have a filter driver and uses notifiers, however, the
> mirror is directly tied to the source BDS, and now it doesn't just
> work: you need extra code that explicitly moves the notifier from the
> source BDS to the snap BDS. And you need such code not only for
> mirroring, but for everything that potentially hooks into the I/O
> functions.
>
> Maybe these two examples give you an idea why I want to use the
> concepts that were designed into the block layer with much thought
> rather than hacked-on notifiers. Notifiers may be simpler to implement
> in the first place, but they lead to a less consistent and harder to
> maintain block layer in the long run.

OK, great explanation, thank you.

>>> If we don't do this now, we'll have to introduce it later and can't
>>> be sure that the management tool knows about it. This would
>>> complicate things quite a bit because we would have to make sure
>>> that the added node stays invisible to the management tool.
>>>
>>> I think these two things are the big architectural questions. The
>>> rest is hopefully more or less implementation details.
>> I completely agree with you.
>>
>> We have the following choices:
>> 1. implement a parameter to choose 'active'/'passive' mode from the
>>    very beginning
>> 2. switch to 'active' mode upon receiving the "block-job-complete"
>>    command unconditionally
>> 3. switch to 'active' mode upon receiving the "block-job-complete"
>>    command, controlled by a proper parameter
>> 4. switch to 'active' mode after a timeout (I personally do not like
>>    this option)
>>
>> I think that choices 1 and 3 do not contradict each other and could
>> be implemented together.
> I think we definitely want 1. and some way to switch after the fact.
> That you suggest three different conditions for doing the switch
> suggests that this is policy and doesn't belong in qemu, but in the
> management tools. So how about complementing 1. with 5.?
>
> 5. Provide a QMP command to switch between active and passive mode

Reasonable; looks OK to me. (A rough QMP-level sketch of 1. combined
with 5. follows below.)

Den
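
For illustration, 1. combined with 5. might look roughly like this at
the QMP level. Everything here is hypothetical: neither a "copy-mode"
argument nor an "x-block-job-set-mirror-mode" command exists at this
point; the names only sketch the idea.

  # option 1: choose the mode when the job is started
  { "execute": "drive-mirror",
    "arguments": { "device": "drive0",
                   "target": "/tmp/target.qcow2",
                   "sync": "full",
                   "copy-mode": "passive" } }

  # option 5: switch the running job to active mode later, e.g. right
  # before issuing block-job-complete
  { "execute": "x-block-job-set-mirror-mode",
    "arguments": { "device": "drive0",
                   "mode": "active" } }

This keeps the switching policy in the management tool, as Kevin
suggests: qemu only provides the mechanism.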