From: Max Reitz <mreitz@redhat.com>
To: Sandeep Joshi <sanjos100@gmail.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] writing a QEMU block driver
Date: Mon, 20 Oct 2014 09:50:35 +0200
Message-ID: <5444BECB.3040900@redhat.com>
In-Reply-To: <CAEfL3KmQodggd+Z=9U8QzyaCckbmgGvEbQgwjjTOy4dNH9A25A@mail.gmail.com>

On 2014-10-17 at 16:59, Sandeep Joshi wrote:
>
> Hi there,
>
> Do let me know if I am asking these questions on the wrong forum.  I'd 
> like to write a QEMU block driver which forwards IO requests to a 
> custom-built storage cluster.
>
> I have seen Jeff Cody's presentation <http://bugnik.us/kvm2013> and 
> also browsed the source code for sheepdog, nbd and gluster in the 
> "block" directory and had a few questions to confirm or correct my 
> understanding.
>
> 1) What is the difference between bdrv_open and bdrv_file_open 
> function pointers in the BlockDriver ?

I'm not sure, but the main difference should be that bdrv_file_open() is
invoked for protocol block drivers, whereas bdrv_open() is invoked for
format block drivers. A couple of months ago, there was still a
top-level bdrv_file_open() function, which has since been integrated
into bdrv_open(), so we will probably want to remove bdrv_file_open() in
the future as well...

But for now, use bdrv_file_open() for protocol drivers and bdrv_open() 
for format drivers.

> 2) Is it possible to implement only a protocol driver without a format 
> driver (the distinction that Jeff made in his presentation above) ?  
> In other words, can I only set the "protocol_name" and not 
> "format_name" in BlockDriver ?  I'd like to support all image formats 
> (qemu, raw, etc) without having to reimplement the logic for each.

Setting format_name does not make a block driver a format driver. A
block driver is either a protocol driver or a format driver, never both,
and the distinction is probably made (again, I'd have to look it up to
be sure) by protocol drivers setting protocol_name and bdrv_file_open(),
whereas format drivers do not.

So you just need to set protocol_name and bdrv_file_open() (and
format_name as well; see nbd for an example where protocol_name and
format_name differ). qemu then knows that your block driver is a
protocol driver, and any format driver will work on top of it. You
should not set bdrv_open(), however.

Once again, I'm not 100% sure, but it should work that way.
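
To make the rule above concrete, here is a minimal sketch. It assumes a
simplified stand-in for the BlockDriver struct (MockBlockDriver is not
QEMU's real definition, and the real callbacks take more arguments) and
a hypothetical driver named "mycluster". The is_protocol_driver() helper
is also an illustration of the rule as described here, not a function
that exists in QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's BlockDriver. The real struct has many
 * more fields, and its open callbacks take different arguments. */
typedef struct MockBlockDriver {
    const char *format_name;
    const char *protocol_name;
    int (*bdrv_open)(const char *filename);       /* set by format drivers */
    int (*bdrv_file_open)(const char *filename);  /* set by protocol drivers */
} MockBlockDriver;

/* Hypothetical open callback; a real one would connect to the cluster. */
int mycluster_file_open(const char *filename)
{
    return filename != NULL ? 0 : -1;
}

/* A protocol driver: protocol_name and bdrv_file_open are set,
 * bdrv_open is deliberately left unset (NULL). */
const MockBlockDriver bdrv_mycluster = {
    .format_name    = "mycluster",
    .protocol_name  = "mycluster",
    .bdrv_file_open = mycluster_file_open,
};

/* Assumed classification rule, following the description above. */
int is_protocol_driver(const MockBlockDriver *drv)
{
    assert(drv != NULL);
    return drv->protocol_name != NULL && drv->bdrv_file_open != NULL;
}
```

With this layout, a format driver such as qcow2 or raw would sit on top
of the protocol driver without the protocol driver knowing anything
about the image format.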

By the way, I can very well imagine that the distinction between
protocol and format block drivers will disappear (at least in the code)
in the future. But that need not concern you. :-)

> 3) The control flow for creating a file starts with the image format 
> driver and later invokes the protocol driver.
>
> image_driver->bdrv_create()
>     --> bdrv_create_file
>           --> bdrv_find_protocol(filename)
>           --> bdrv_create
>                 ---> Protocol_driver->bdrv_create()
>
> Is this the case for all functions?   Does the read/write first flow 
> through the image format driver before getting passed down to the 
> protocol driver (possibly via some coroutine invoked from the block 
> layer or virtio-blk ) ?  Can someone give me a hint as to how I can 
> trace the control flow ?

Well, you can always use gdb with breakpoints and backtraces. At least
that's what I'd do.

To answer your first question: yes. For each guest device, or rather
each virtual guest device (creating an image is not done through a guest
device, but the only thing missing from a guest device configuration in
that case is the device itself), there is a tree of BlockDriverStates.
Every request runs through that tree. It may not touch every node, but
it starts at the top (which is normally a format BDS) and then proceeds
as far as the block drivers create new requests to their children.

Or, to be more technical: a request is submitted only to the topmost
node in the BDS tree (the root). If need be, that node's block driver
forwards it to its child (normally bs->file, where bs is a pointer to
the BlockDriverState) or children (e.g. bs->backing_hd, the backing
file, or driver-specific ones, such as the children of the quorum block
driver, which are not stored in the BlockDriverState).
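
The forwarding described above can be sketched with a toy two-node
tree. Everything here is a simplified stand-in: MockBDS is not QEMU's
BlockDriverState, the mock_* functions are hypothetical, and real
requests are asynchronous and carry offsets and I/O vectors. The only
thing the sketch keeps from the description is that the caller talks to
the root and each node decides whether to create a request to its child.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct MockBDS MockBDS;

/* Simplified stand-in for a BlockDriverState node. */
struct MockBDS {
    const char *name;
    MockBDS *file;      /* child node, like bs->file; NULL for a leaf */
    int (*read)(MockBDS *bs, char *buf, size_t len);
    int requests_seen;  /* counts requests that reached this node */
};

/* Leaf (protocol-like) node: serves the request itself. */
int mock_protocol_read(MockBDS *bs, char *buf, size_t len)
{
    bs->requests_seen++;
    memset(buf, 'x', len);  /* pretend this data came from storage */
    return 0;
}

/* Format-like node: creates a new request to its child. */
int mock_format_read(MockBDS *bs, char *buf, size_t len)
{
    bs->requests_seen++;
    if (bs->file == NULL) {
        return -1;          /* nothing below to forward to */
    }
    return bs->file->read(bs->file, buf, len);
}

/* The caller only ever submits the request to the root of the tree. */
int mock_submit_read(MockBDS *root, char *buf, size_t len)
{
    assert(root != NULL && root->read != NULL);
    return root->read(root, buf, len);
}
```

Building a format node whose file pointer refers to a protocol node and
calling mock_submit_read() on the format node increments
requests_seen on both nodes, even though the caller never touches the
protocol node directly.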

This doesn't apply so well to bdrv_create(), because that function does
not work on BlockDriverStates, but I hope you see the point.

Shameless self-plug: regarding this whole BDS tree thing, I can
recommend Kevin's and my presentation from this year's KVM Forum:
http://events.linuxfoundation.org/sites/events/files/slides/blockdev.pdf


Max
