All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Baron <jbaron@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, Alexander Graf <agraf@suse.de>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: [Qemu-devel] On block interface types in general, IF_AHCI in particular
Date: Tue, 30 Oct 2012 11:16:31 -0400	[thread overview]
Message-ID: <20121030151631.GA2744@redhat.com> (raw)
In-Reply-To: <87k3u82f7b.fsf@blackfin.pond.sub.org>

On Tue, Oct 30, 2012 at 03:43:20PM +0100, Markus Armbruster wrote:
> The primary purpose of this memo is a brain dump on how block interface
> types are used, and what that means for AHCI.  A secondary purpose is to
> disabuse Alex of the notion that -drive is simple ;)
> 
> 
> BlockInterfaceType really just serves -drive and its monitor cousins
> drive_add and pci_add[*].
> 
> Note: we have a whole zoo of convenience options to define drives, but
> they're all just shorthand for -drive, so let's ignore them here.
> 
> 
> Let's start with -drive.  -drive creates a block backend, and leaves it
> in a place where board code can find it if it cares.
> 
> That place is indexed by a triple (BlockInterfaceType type, int bus, int
> unit), corresponding to -drive options if, bus, unit.
> 
> The actual work is done by drive_init().  Its opts argument comes
> straight from -drive's argument.
> 
> If option if is missing, type is set to a board-specific default.
> 
> If option bus is missing, bus is set to zero.
> 
> If option unit is missing, the system picks the first unused bus, unit
> starting with (bus, 0).
> 
> For type IF_IDE, unit must be 0 or 1.
> 
> For type IF_SCSI, unit must be 0..6.  Note that the range is fine for
> 8-bit SCSI, and inappropriate for anything else.
> 
> Complication: option index.  This is an alternate way to specify bus and
> unit.
> 
> For type IF_IDE, bus = index / 2, unit = index % 2.
> 
> For type IF_SCSI, bus = index / 7, unit = index % 7.  Again, fine for
> 8-bit SCSI, not fine for everything else.
> 
> For any other type, bus = 0, unit = index.  There's no way to specify
> bus > 0 with index.
> 
> This mapping from index to (bus, unit) is ABI.
> 
> (type, bus, unit) also get put into the drive ID when the user doesn't
> specify one (again ABI), but let's ignore that here.
> 
> 
> Now examine what board code does with (type, bus, unit) triples.  In
> general, board code looks up the triples it knows, and it finds a
> backend, it creates a suitable device model connected to the it.
> 
> Same thing, different perspective: a board has a fixed set of onboard
> devices.  They may be associated with a well-known number of (type, bus,
> unit) triples.  The association is ABI.
> 
> Example: "connex" associates (IF_PFLASH, 0, 0) with its CFI parallel
> flash.  It errors out when the triple isn't found.
> 
> Example: "g3beige" associates its PMAC IDE's master with (IF_IDE, 0, 0),
> the slave with (IF_IDE, 0, 1), its CMD646's primary master with (IF_IDE,
> 1, 0), its slave with (IF_IDE, 1, 1).  A suitable IDE device is created
> when a triple is found.  It doesn't associate its secondary master and
> slave with any triple (which means you can't put drives there with
> -drive if=ide).
> 
> Boards never associate (IF_NONE, ...) with anything.  The backends
> behind these tripes are for use with -device.
> 
> Some boards create additional device models when they find certain
> triples.  What gets created when is ABI.
> 
> Example: "pc" finds the maximum bus N that occurs in any (IF_SCSI, ...)
> triple, then creates N+1 lsi53c895a SCSI HBAs, and connects all
> (IF_SCSI, B, U) to a suitable SCSI device model with SCSI ID U on the
> B-th HBA.
> 
> Example: "spapr" does the same, except it creates spapr-vscsi HBAs.
> 
> Boards generally ignore triples they don't associate with anything
> silently, but there are exceptions.
> 
> Example: "sun4m" errors out if it finds (IF_SCSI, B, U) with B>0.
> 
> Ignored triples can still be used with -device.
> 
> Example: "pc" silently ignores the (IF_SD, ...), but these drives are
> still usable with -device ide-cd,drive=sd0 and such.  Filed under
> "stupid stunts".  It's ABI all the same.
> 
> There's a twist for (IF_SCSI, B, U) triples ignored by the board:
> creating the B-th a SCSI HBA with -device also creates a suitable SCSI
> device model with SCSI ID U on that HBA.
> 
> Example: "sun4u" doesn't provide a SCSI HBA, and normally ignores
> (IF_SCSI, 0, 0) silently.  But if you then create a SCSI HBA with
> -device, this also creates a suitable SCSI device model with SCSI ID 0
> on that HBA.
> 
> Boards can associate block interface types with whatever device models
> they choose, even totally different kinds than the name of the block
> interface type suggests.  You may call that abuse.  I call it ABI.
> 
> Example: "s390-virtio" creates virtio-blk-s390 device models for
> (IF_IDE, ...) triples.
> 
> Some boards may do weird shit not covered here.  What weird shit gets
> done when is ABI.
> 
> The important lesson here is that the meaning of -drive parameters if,
> bus, unit and index depends entirely on the board (except for -drive
> if=none, of course), and is part of the board's ABI.
> 
> 
> Next are drive_add and pci_add.  These commands are quite limited in
> what they can do.
> 
> drive_add "" if=none,OPTS... works just like -drive if=none,OPTS...: it
> creates a block backend for use with device_add.
> 
> drive_add ADDR if=scsi,OPTS... resembles -drive if=scsi,OPTS..., except
> it doesn't leave device model creation to the board code: it creates a
> suitable SCSI device connected to the SCSI HBA with PCI address ADDR.
> 
> pci_add ADDR storage if=scsi,OPTS additionally creates an lsi53c895a HBA
> with PCI address ADDR.  Even when the board creates some other HBA for
> -drive if=scsi.
> 
> pci_add ADDR storage if=virtio,OPTS creates a virtio-blk-pci device with
> PCI address ADDR, similar to -drive if=virtio,addr=ADDR,OPTS.
> 
> The only general way to hot plug a block device is drive_add with
> if=none, then device_add.  The other forms cover only special cases.
> Generalizing them involves making them work like -drive: leave device
> model creation to board code.  Which had to be added to every board
> supporting hot plug.  Hardly worth the trouble.
> 
> 
> What does this all mean for AHCI?
> 
> The argument that we must have IF_AHCI because AHCI needs different
> guest OS drivers than IDE doesn't hold water.  Plenty of precedence
> above: IF_IDE can get you an ide-hd connected to some IDE controller, or
> a virtio-blk-s390 device, and IF_SCSI can get you a scsi-hd connected to
> totally different SCSI HBAs.  I can't see why IF_IDE giving you an
> ide-hd connected to ich9-ahci on q35 would be any different.
> 
> There's a related argument that I find more compelling: we may want
> if=ahci to let users choose nicely between IDE and AHCI.  Makes sense
> only if we have boards providing both kind of controllers onboard.  q35
> doesn't.
> 

This was my primary argument for ahci. However, I didn't realize
initially that if=blah was only applied for the default controller. IE
it doesn't spawn controllers if they don't exist. If we are limiting
if=blah to attaching to the default controller, now and in the future, I
think the argument for IF_AHCI becomes less compelling. 

> Another argument for IF_AHCI is about the permissible range of unit and
> the mapping of index to bus and unit, both detemined by the maximum
> permissible unit number.  Right now, the maximum only depends on the
> block interface type, not the board, and IF_IDE's maximum 2 is
> inappropriate for AHCI (ich9-ahci wants 6, assuming we model it as a
> single bus for purposes of -drive).  However, the same problem already
> exists for IF_SCSI: its maximum 7 is inappropriate for anything but
> crusty old 8-bit SCSI.  Creating IF_AHCI bypasses the problem with
> IF_IDE's maximum, but leaves the IF_SCSI problem unsolved.  Hmm.
> 
> How do we plan to handle the next member of the ATA controller family?
> Create yet another block interface type for it?
> 

The way IF_AHCI is implemented, the board controlls mapping of index to
bus and unit. So I don't think we would need to create a new one.

> I'm not saying IF_AHCI is wrong.  I'm just saying some of the arguments
> for it aren't compelling.
> 
> 
> [*] There are two other uses of drive_init(), in hw/pc_sysfw.c and
> hw/usb/dev_storage.  They both create both backend and frontend, thus
> the BlockInterfaceType they use doesn't really matter.

As you know, I'm trying to come to consensus on this in the interest of getting
q35 into 1.3.

I just prototyped a patch that removes IF_AHCI. And models the 6 port
ahci controller as 3 busses with 2 units per bus. This should allow all
existing if=ide usages to continue to work as expected. Is that more
palatable?

Thanks,

-Jason

  reply	other threads:[~2012-10-30 15:16 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-30 14:43 [Qemu-devel] On block interface types in general, IF_AHCI in particular Markus Armbruster
2012-10-30 15:16 ` Jason Baron [this message]
2012-10-30 16:31 ` Paolo Bonzini
2012-10-30 17:04   ` Kevin Wolf
2012-10-30 19:03     ` Anthony Liguori
2012-10-30 21:12     ` Anthony Liguori
2012-10-31 13:58       ` Markus Armbruster
2012-10-31 14:01         ` Peter Maydell
2012-10-31 14:09         ` Paolo Bonzini
2012-10-31 14:20         ` Markus Armbruster
2012-10-31 14:40           ` Paolo Bonzini
2012-10-31 14:43             ` Alexander Graf
2012-10-31 14:46               ` Paolo Bonzini
2012-10-31 16:00               ` Anthony Liguori
2012-10-31 16:05                 ` Alexander Graf
2012-10-31 16:46                   ` Anthony Liguori
2012-10-31 16:57                     ` Alexander Graf
2012-10-31 18:32                       ` Anthony Liguori
2012-10-31 16:48                   ` Paolo Bonzini
2012-10-31 16:21         ` Kevin Wolf
2012-10-31 16:32           ` Paolo Bonzini
2012-10-31 16:35             ` Alexander Graf
2012-10-31 16:50               ` Paolo Bonzini
2012-11-01  8:50   ` Gerd Hoffmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121030151631.GA2744@redhat.com \
    --to=jbaron@redhat.com \
    --cc=agraf@suse.de \
    --cc=anthony@codemonkey.ws \
    --cc=armbru@redhat.com \
    --cc=kraxel@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.