qemu-devel.nongnu.org archive mirror
* [Qemu-devel] virtio-scsi wiki feature page
@ 2011-10-27 10:49 Stefan Hajnoczi
  2011-10-27 11:19 ` Paolo Bonzini
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan Hajnoczi @ 2011-10-27 10:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Anthony Liguori, Badari Pulavarty

Hi,
I have created a virtio-scsi wiki feature page with links to Paolo's
latest draft specification, our KVM Forum presentation, and code
repos:

http://wiki.qemu.org/Features/VirtioSCSI

Paolo: v3 had some comments, is it a good time for a new revision of
the draft specification?

Stefan


* Re: [Qemu-devel] virtio-scsi wiki feature page
  2011-10-27 10:49 [Qemu-devel] virtio-scsi wiki feature page Stefan Hajnoczi
@ 2011-10-27 11:19 ` Paolo Bonzini
  2011-10-27 12:18   ` Stefan Hajnoczi
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Bonzini @ 2011-10-27 11:19 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Anthony Liguori, Badari Pulavarty, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 881 bytes --]

On 10/27/2011 12:49 PM, Stefan Hajnoczi wrote:
> I have created a virtio-scsi wiki feature page with links to Paolo's
> latest draft specification, our KVM Forum presentation, and code
> repos:
>
> http://wiki.qemu.org/Features/VirtioSCSI
>
> Paolo: v3 had some comments, is it a good time for a new revision of
> the draft specification?

Yes.  I was waiting until I actually have an implementation, but anyway 
here it is, attached.  The changes are small:

- additional failure kinds mapping more or less to Linux driver_statuses

- defined the format of the LUN.  Unlike vSCSI, there's no support for 
generic hierarchical LUNs.  A single LUN format is specified, that 
supports 256 targets and 16384 LUNs per target.

- clarified multiqueue semantics

I'm planning to update your LLD code to support these changes, but I'll 
gladly accept that someone else does it. :)

Paolo

[-- Attachment #2: virtio-scsi-v4.txt --]
[-- Type: text/plain, Size: 15976 bytes --]

Virtio SCSI Host Device Spec
============================

The virtio SCSI host device groups together one or more simple virtual
devices (e.g. disks), and allows communicating with these devices using
the SCSI protocol.  An instance of the device represents a SCSI host with
possibly many buses (also known as channels or paths), targets and
LUNs attached.

The virtio SCSI device services two kinds of requests:

- command requests for a logical unit;

- task management functions related to a logical unit, target or
command.

The device is also able to send out notifications about added and removed
logical units.  Together, these capabilities provide a SCSI transport
protocol that uses virtqueues as the transfer medium.  In the transport
protocol, the virtio driver acts as the initiator, while the virtio SCSI
host provides one or more targets that receive and process the requests.

v1:
    First public version

v2:
    Merged all virtqueues into one, removed separate TARGET fields

v3:
    Added configuration information and reworked descriptor structure.
    Added back multiqueue on Avi's request, while still leaving TARGET
    fields out.  Added dummy event and clarified some aspects of the
    event protocol.  First version sent to a wider audience (linux-kernel
    and virtio lists).

v4:
    Clarified multiqueue semantics.  Specified format of LUN field.
    Added more failure codes roughly corresponding to Linux driver_status
    values.

Configuration
-------------

Subsystem Device ID
    TBD

Virtqueues
    0:controlq
    1:eventq
    2..n:request queues

Feature bits
    VIRTIO_SCSI_F_INOUT (0) - Whether a single request can include both
        read-only and write-only data buffers.

Device configuration layout
    struct virtio_scsi_config {
        u32 num_queues;
        u32 event_info_size;
        u32 sense_size;
        u32 cdb_size;
    }

    num_queues is the total number of virtqueues exposed by the
    device.  The driver is free to use only one request queue, or
    it can use more to achieve better performance.

    event_info_size is the maximum size that the device will fill
    for buffers that the driver places in the eventq.  The
    driver should always supply buffers of at least this size.

    sense_size is the maximum size of the sense data that the device
    will write.  The default value is written by the device and
    will always be 96, but the driver can modify it.

    cdb_size is the maximum size of the CDB that the driver
    will write.  The default value is written by the device and
    will always be 32, but the driver can likewise modify it.

Device initialization
---------------------

The initialization routine should first of all discover the device's
virtqueues.

The driver should then place at least one buffer in the eventq.
Buffers returned by the device on the eventq may be referred
to as "events" in the rest of the document.

The driver can immediately issue requests (for example, INQUIRY or
REPORT LUNS) or task management functions (for example, I_T NEXUS RESET).

Device operation: request queues
--------------------------------

The driver queues requests to an arbitrary request queue, and they are
used by the device on that same queue.  In this version of the spec,
commands placed on different queues will be consumed with _no_ order
constraints.

Requests have the following format:

    struct virtio_scsi_req_cmd {
        u8 lun[8];
        u64 id;
        u8 task_attr;
        u8 prio;
        u8 crn;
        char cdb[cdb_size];
        char dataout[];

        u8 sense[sense_size];
        u32 sense_len;
        u32 residual;
        u16 status_qualifier;
        u8 status;
        u8 response;
        char datain[];
    };

    /* command-specific response values */
    #define VIRTIO_SCSI_S_OK                0
    #define VIRTIO_SCSI_S_UNDERRUN          1
    #define VIRTIO_SCSI_S_ABORTED           2
    #define VIRTIO_SCSI_S_BAD_TARGET        3
    #define VIRTIO_SCSI_S_RESET             4
    #define VIRTIO_SCSI_S_TRANSPORT_FAILURE 5
    #define VIRTIO_SCSI_S_TARGET_FAILURE    6
    #define VIRTIO_SCSI_S_NEXUS_FAILURE     7
    #define VIRTIO_SCSI_S_FAILURE           8

    /* task_attr */
    #define VIRTIO_SCSI_S_SIMPLE            0
    #define VIRTIO_SCSI_S_ORDERED           1
    #define VIRTIO_SCSI_S_HEAD              2
    #define VIRTIO_SCSI_S_ACA               3

    The lun field addresses a target and logical unit in the virtio-scsi
    device's SCSI domain.  In this version of the spec, the only supported
    value of the LUN field is: first byte set to 1, second byte set
    to target, third and fourth byte representing a single level LUN
    structure, followed by four zero bytes.  With this representation,
    a virtio-scsi device can serve up to 256 targets and 16384 LUNs
    per target.
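    The LUN layout described above can be sketched as a small encoder.
    This is only an illustration: virtio_scsi_encode_lun is a hypothetical
    helper name, and setting the top two bits of the third byte to 01b
    (0x40) is an assumption taken from the SAM flat space addressing
    method, which is what yields a 14-bit (16384) LUN range.  The draft
    itself only says "single level LUN structure".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the spec: fill the 8-byte lun field
 * for a given target and LUN number.
 */
void virtio_scsi_encode_lun(uint8_t out[8], uint8_t target, uint16_t lun)
{
    assert(lun < 16384);                   /* only 14 bits are available */

    memset(out, 0, 8);                     /* bytes 4..7 stay zero */
    out[0] = 1;                            /* first byte is always 1 */
    out[1] = target;                       /* second byte is the target */
    /* Assumption: SAM flat space addressing, method bits 01b. */
    out[2] = (uint8_t)(0x40 | (lun >> 8));
    out[3] = (uint8_t)(lun & 0xff);
}
```

    For example, target 5 and LUN 0x123 would be encoded under these
    assumptions as the bytes 01 05 41 23 00 00 00 00.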

    The id field is the command identifier.

    The task_attr, prio and crn fields should be set to zero.  Command
    priority is explicitly not supported by this version of the device.
    task_attr defines the task attribute as in the table above, but the
    device may map all task attributes to SIMPLE.  CRN may also be
    provided by clients, but is generally expected to be 0.  The maximum
    CRN value defined by the protocol is 255, since CRN is stored in an
    8-bit integer.

    All of these fields are defined in SAM.  They are always read-only,
    as are the cdb and dataout fields.  sense and subsequent fields are
    always write-only.
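    The split between read-only and write-only fields suggests how a
    driver might size the two fixed-length parts of a request.  The
    sketch below assumes the fields are laid out back to back with no
    padding, which the draft does not state explicitly; both function
    names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read-only (driver-written) part, excluding dataout:
 * lun[8] + id + task_attr + prio + crn + cdb[cdb_size]. */
size_t virtio_scsi_req_out_hdr_size(uint32_t cdb_size)
{
    return 8 + 8 + 1 + 1 + 1 + (size_t)cdb_size;
}

/* Write-only (device-written) part, excluding datain:
 * sense[sense_size] + sense_len + residual + status_qualifier
 * + status + response. */
size_t virtio_scsi_req_in_hdr_size(uint32_t sense_size)
{
    return (size_t)sense_size + 4 + 4 + 2 + 1 + 1;
}
```

    With the default cdb_size of 32 and sense_size of 96, this gives 51
    and 108 bytes respectively under the no-padding assumption.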

    The sense_len field indicates the number of bytes actually written
    to the sense buffer.  The residual field indicates the residual
    size, calculated as data_length - number_of_transferred_bytes, for
    read or write operations.

    The status byte is written by the device to be the SCSI status code.

    The response byte is written by the device to be one of the following:

    - VIRTIO_SCSI_S_OK when the request was completed and the status byte
      is filled with a SCSI status code (not necessarily "GOOD").

    - VIRTIO_SCSI_S_UNDERRUN if the content of the CDB requires transferring
      more data than is available in the data buffers.

    - VIRTIO_SCSI_S_ABORTED if the request was cancelled due to a task
      management function.

    - VIRTIO_SCSI_S_BAD_TARGET if the request was never processed because the
      target indicated by the LUN field does not exist.

    - VIRTIO_SCSI_S_RESET if the request was cancelled due to a bus or device
      reset.

    - VIRTIO_SCSI_S_TRANSPORT_FAILURE if the request failed due to a problem
      in the connection between the host and the target (severed link).

    - VIRTIO_SCSI_S_TARGET_FAILURE if the target is suffering a failure and
      the guest should not retry on other paths.

    - VIRTIO_SCSI_S_NEXUS_FAILURE if the nexus is suffering a failure but
      retrying on other paths might yield a different result.

    - VIRTIO_SCSI_S_FAILURE for other host or guest error.  In particular,
      if neither dataout nor datain is empty, and the VIRTIO_SCSI_F_INOUT
      feature has not been negotiated, the request will be immediately
      returned with a response equal to VIRTIO_SCSI_S_FAILURE.
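    The VIRTIO_SCSI_F_INOUT rule in the last item can be checked on the
    driver side before a request is queued.  A minimal sketch, assuming
    the driver knows the lengths of its dataout and datain buffers;
    virtio_scsi_buffers_acceptable is a hypothetical name, not something
    defined by this spec.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns false when the request would be rejected with
 * VIRTIO_SCSI_S_FAILURE: both dataout and datain are non-empty
 * and VIRTIO_SCSI_F_INOUT has not been negotiated.
 */
bool virtio_scsi_buffers_acceptable(size_t dataout_len, size_t datain_len,
                                    bool inout_negotiated)
{
    bool bidirectional = dataout_len != 0 && datain_len != 0;

    return inout_negotiated || !bidirectional;
}
```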

Device operation: controlq
--------------------------

The controlq is used for other SCSI transport operations.
Requests have the following format:

    struct virtio_scsi_ctrl
    {
        u32 type;
        ...
        u8 response;
    }

    The type identifies the remaining fields.

The following commands are defined:

- Task management function

    #define VIRTIO_SCSI_T_TMF                      0

    #define VIRTIO_SCSI_T_TMF_ABORT_TASK           0
    #define VIRTIO_SCSI_T_TMF_ABORT_TASK_SET       1
    #define VIRTIO_SCSI_T_TMF_CLEAR_ACA            2
    #define VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET       3
    #define VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET      4
    #define VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET   5
    #define VIRTIO_SCSI_T_TMF_QUERY_TASK           6
    #define VIRTIO_SCSI_T_TMF_QUERY_TASK_SET       7

    struct virtio_scsi_ctrl_tmf
    {
        u32 type;
        u32 subtype;
        u8 lun[8];
        u64 id;
        u8 additional[];
        u8 response;
    }

    /* command-specific response values */
    #define VIRTIO_SCSI_S_FUNCTION_COMPLETE        0
    #define VIRTIO_SCSI_S_FAILURE                  3
    #define VIRTIO_SCSI_S_FUNCTION_SUCCEEDED       4
    #define VIRTIO_SCSI_S_FUNCTION_REJECTED        5
    #define VIRTIO_SCSI_S_INCORRECT_LUN            6

    The type is VIRTIO_SCSI_T_TMF.  All fields except response are
    filled in by the driver; the response field is filled in by the
    device.  The id field must match the id of a SCSI command.  Fields
    that are irrelevant for the requested TMF are ignored.

    Note that since ACA is not supported by this version of the spec,
    VIRTIO_SCSI_T_TMF_CLEAR_ACA is always a no-operation.

    The outcome of the task management function is written by the device
    in the response field.  Return values map 1-to-1 with those defined
    in SAM.
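    As an illustration of the TMF layout, the sketch below fills in the
    driver-written part of an ABORT TASK request.  It assumes no
    additional[] bytes are sent and ignores byte order, which this draft
    does not specify; the struct name virtio_scsi_tmf_out and the helper
    name are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define VIRTIO_SCSI_T_TMF             0
#define VIRTIO_SCSI_T_TMF_ABORT_TASK  0

/* Driver-written part of virtio_scsi_ctrl_tmf, without additional[].
 * The response byte lives in a separate device-written buffer. */
struct virtio_scsi_tmf_out {
    uint32_t type;
    uint32_t subtype;
    uint8_t  lun[8];
    uint64_t id;
};

/* Build an ABORT TASK request for the command identified by id. */
struct virtio_scsi_tmf_out virtio_scsi_make_abort_task(const uint8_t lun[8],
                                                       uint64_t id)
{
    struct virtio_scsi_tmf_out req;

    req.type = VIRTIO_SCSI_T_TMF;
    req.subtype = VIRTIO_SCSI_T_TMF_ABORT_TASK;
    memcpy(req.lun, lun, sizeof(req.lun));
    req.id = id;    /* must match the id of the command to abort */
    return req;
}
```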

- Asynchronous notification query

    #define VIRTIO_SCSI_T_AN_QUERY                    1

    struct virtio_scsi_ctrl_an {
        u32 type;
        u8  lun[8];
        u32 event_requested;
        u32 event_actual;
        u8  response;
    }

    #define VIRTIO_SCSI_EVT_ASYNC_OPERATIONAL_CHANGE  2
    #define VIRTIO_SCSI_EVT_ASYNC_POWER_MGMT          4
    #define VIRTIO_SCSI_EVT_ASYNC_EXTERNAL_REQUEST    8
    #define VIRTIO_SCSI_EVT_ASYNC_MEDIA_CHANGE        16
    #define VIRTIO_SCSI_EVT_ASYNC_MULTI_HOST          32
    #define VIRTIO_SCSI_EVT_ASYNC_DEVICE_BUSY         64

    By sending this command, the driver asks the device which events
    the given LUN can report, as described in paragraphs 6.6 and A.6
    of the SCSI MMC specification.  The driver writes the events it is
    interested in into the event_requested; the device responds by
    writing the events that it supports into event_actual.

    The type is VIRTIO_SCSI_T_AN_QUERY.  The lun and event_requested
    fields are written by the driver.  The event_actual and response
    fields are written by the device.

    Valid values of the response byte are VIRTIO_SCSI_S_OK or
    VIRTIO_SCSI_S_FAILURE (with the same meaning as above).

- Asynchronous notification subscription

    #define VIRTIO_SCSI_T_AN_SUBSCRIBE                2

    struct virtio_scsi_ctrl_an {
        u32 type;
        u8  lun[8];
        u32 event_requested;
        u32 event_actual;
        u8  response;
    }

    By sending this command, the driver asks the specified LUN to report
    events for its physical interface, again as described in the SCSI
    MMC specification.  The driver writes the events it is interested in
    into the event_requested; the device responds by writing the events
    that it supports into event_actual.

    Event types are the same as for the asynchronous notification query
    message.

    The type is VIRTIO_SCSI_T_AN_SUBSCRIBE.  The lun and event_requested
    fields are written by the driver.  The event_actual and response
    fields are written by the device.

    Valid values of the response byte are VIRTIO_SCSI_S_OK or
    VIRTIO_SCSI_S_FAILURE (with the same meaning as above).

Device operation: eventq
------------------------

The eventq is used by the device to report information on logical units
that are attached to it.  The driver should always leave a few (?) buffers
ready in the eventq.  The device will end up dropping events if it finds
no buffer ready.

Buffers are placed in the eventq and filled by the device when interesting
events occur.  The buffers should be strictly write-only (device-filled)
and the size of the buffers should be at least the value given in the
device's configuration information.

Events have the following format:

    #define VIRTIO_SCSI_T_EVENTS_MISSED   0x80000000

    struct virtio_scsi_ctrl_recv {
        u32 event;
        ...
    }

If bit 31 is set in the event field, the device failed to report an
event due to missing buffers.  In this case, the driver should poll the
logical units for unit attention conditions, and/or do whatever form of
bus scan is appropriate for the guest operating system.
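A driver could separate the missed-events flag from the event type
along these lines.  This is only a sketch; the two helper names are
hypothetical and not defined by this spec.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_SCSI_T_EVENTS_MISSED   0x80000000u

/* True if bit 31 is set, i.e. the device dropped at least one event. */
bool virtio_scsi_events_missed(uint32_t event)
{
    return (event & VIRTIO_SCSI_T_EVENTS_MISSED) != 0;
}

/* Event type with the missed-events flag masked off. */
uint32_t virtio_scsi_event_type(uint32_t event)
{
    return event & ~VIRTIO_SCSI_T_EVENTS_MISSED;
}
```

The driver would test the flag first and schedule a poll or bus scan,
then dispatch on the masked event type as usual.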

Other data that the device writes to the buffer depends on the contents
of the event field.  The following events are defined:

- No event

    #define VIRTIO_SCSI_T_NO_EVENT         0

    This event is fired in the following cases:

    1) When the device detects in the eventq a buffer that is shorter
    than what is indicated in the configuration field, it will use
    it immediately and put this dummy value in the event field.
    A well-written driver will never observe this situation.

    2) When events are dropped, the device may signal this event as
    soon as the driver makes a buffer available, in order to request
    action from the driver.  In this case, of course, this event will
    be reported with the VIRTIO_SCSI_T_EVENTS_MISSED flag.

- Transport reset

    #define VIRTIO_SCSI_T_TRANSPORT_RESET  1

    struct virtio_scsi_reset {
        u32 event;
        u8  lun[8];
        u32 reason;
    }

    #define VIRTIO_SCSI_EVT_RESET_HARD         0
    #define VIRTIO_SCSI_EVT_RESET_RESCAN       1
    #define VIRTIO_SCSI_EVT_RESET_REMOVED      2

    By sending this event, the device signals that a logical unit
    on a target has been reset, including the case of a new device
    appearing or disappearing on the bus.

    The device fills in all fields.  The event field is set to
    VIRTIO_SCSI_T_TRANSPORT_RESET.  The lun field addresses a bus,
    target and logical unit in the SCSI host.

    The reason value is one of the three #define values appearing above.
    VIRTIO_SCSI_EVT_RESET_REMOVED is used if the target or logical unit
    is no longer able to receive commands.  VIRTIO_SCSI_EVT_RESET_HARD
    is used if the logical unit has been reset, but is still present.
    VIRTIO_SCSI_EVT_RESET_RESCAN is used if a target or logical unit has
    just appeared on the device.

    When VIRTIO_SCSI_EVT_RESET_REMOVED or VIRTIO_SCSI_EVT_RESET_RESCAN
    is sent for LUN 0, the driver should ask the initiator to rescan
    the target, in order to detect the case when an entire target has
    appeared or disappeared.
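    The LUN 0 rule above can be sketched as a predicate over the event.
    The LUN decoding assumes the 14-bit number sits in the low six bits
    of the third byte plus the fourth byte (SAM flat space addressing);
    that detail and the function name are assumptions, not something
    this draft spells out.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_SCSI_EVT_RESET_HARD         0
#define VIRTIO_SCSI_EVT_RESET_RESCAN       1
#define VIRTIO_SCSI_EVT_RESET_REMOVED      2

struct virtio_scsi_reset {
    uint32_t event;
    uint8_t  lun[8];
    uint32_t reason;
};

/* True if the driver should ask the initiator to rescan the whole
 * target: the reason is RESCAN or REMOVED and the event is for LUN 0. */
bool virtio_scsi_reset_needs_target_rescan(const struct virtio_scsi_reset *ev)
{
    /* Assumption: 14-bit LUN in bytes 2..3, top two bits are method bits. */
    uint16_t lun = (uint16_t)(((ev->lun[2] & 0x3f) << 8) | ev->lun[3]);
    bool appeared_or_removed =
        ev->reason == VIRTIO_SCSI_EVT_RESET_RESCAN ||
        ev->reason == VIRTIO_SCSI_EVT_RESET_REMOVED;

    return appeared_or_removed && lun == 0;
}
```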

    Events will also be reported via sense codes (this obviously does
    not apply to newly appeared buses or targets, since the application
    has never discovered them):

    - VIRTIO_SCSI_EVT_RESET_HARD
      sense UNIT ATTENTION
      asc POWER ON, RESET OR BUS DEVICE RESET OCCURRED

    - VIRTIO_SCSI_EVT_RESET_RESCAN
      sense UNIT ATTENTION
      asc REPORTED LUNS DATA HAS CHANGED

    - VIRTIO_SCSI_EVT_RESET_REMOVED
      sense ILLEGAL REQUEST
      asc LOGICAL UNIT NOT SUPPORTED

    The preferred way to detect transport reset is always to use events,
    because sense codes are only seen by the driver when it sends a
    SCSI command to the logical unit or target.  However, in case events
    are dropped, the initiator will still be able to synchronize with the
    actual state of the controller if the driver asks the initiator to
    rescan the SCSI bus.  During the rescan, the initiator will be
    able to observe the above sense codes, and it will process them as
    if the driver had received the equivalent event.

- Asynchronous notification

    #define VIRTIO_SCSI_T_ASYNC_NOTIFY     2

    struct virtio_scsi_an_event {
        u32 event;
        u8  lun[8];
        u32 reason;
    }

    By sending this event, the device signals that an asynchronous
    event was fired from a physical interface.

    All fields are written by the device.  The event field is set to
    VIRTIO_SCSI_T_ASYNC_NOTIFY.  The reason field is a subset of the
    events that the driver has subscribed to via the "Asynchronous
    notification subscription" command.

    When dropped events are reported, the driver should poll for 
    asynchronous events manually using SCSI commands.



* Re: [Qemu-devel] virtio-scsi wiki feature page
  2011-10-27 11:19 ` Paolo Bonzini
@ 2011-10-27 12:18   ` Stefan Hajnoczi
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2011-10-27 12:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Anthony Liguori, Badari Pulavarty, qemu-devel

On Thu, Oct 27, 2011 at 12:19 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 10/27/2011 12:49 PM, Stefan Hajnoczi wrote:
>>
>> I have created a virtio-scsi wiki feature page with links to Paolo's
>> latest draft specification, our KVM Forum presentation, and code
>> repos:
>>
>> http://wiki.qemu.org/Features/VirtioSCSI
>>
>> Paolo: v3 had some comments, is it a good time for a new revision of
>> the draft specification?
>
> Yes.  I was waiting until I actually have an implementation, but anyway here
> it is, attached.  The changes are small:
>
> - additional failure kinds mapping more or less to Linux driver_statuses
>
> - defined the format of the LUN.  Unlike vSCSI, there's no support for
> generic hierarchical LUNs.  A single LUN format is specified, that supports
> 256 targets and 16384 LUNs per target.
>
> - clarified multiqueue semantics
>
> I'm planning to update your LLD code to support these changes, but I'll
> gladly accept that someone else does it. :)

Okay, that sounds great.  As I get back into virtio-scsi I'll let you
know so we don't duplicate work.

Stefan

