qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] virtio scsi host draft specification, v2
Date: Wed, 01 Jun 2011 15:31:42 +0200	[thread overview]
Message-ID: <4DE63F3E.6040805@redhat.com> (raw)
In-Reply-To: <20110601125124.GB12294@redhat.com>

On 06/01/2011 02:51 PM, Michael S. Tsirkin wrote:
>>>>      The cdb, data and sense fields must reside in separate buffers.
>>>>      The cdb field is always read-only.  The data buffers may be either
>>>>      read-only or write-only, depending on the request, with the read-only
>>>>      buffers coming first.  The sense buffer is always write-only.
>>>>
>>>>      The request shall have num_dataout read-only data buffers and
>>>>      num_datain write-only data buffers.  One of these two values must be
>>>>      zero if the VIRTIO_SCSI_F_INOUT has not been negotiated.
>>>
>>> Why do num_datain/num_dataout need to be there?
>>> We can just look at the number of io/out bufs in
>>> virtio descriptors, no?
>>
>> This depends on having a single variable-sized datum per direction.
>> I'd rather avoid this assumption.
>
> I think it's a sane assumption: does scsi ever give you more?

Sense data, and data from the device, are both variable-length.  With 
your suggestion that they may be in a single buffer, we do need size 
information for them.

> If you have many how do you know the size of each?

That's up to the transport protocol to devise.

However, I'll make a few changes:

1) num_dataout/num_datain will become byte counts rather than buffer 
counts.  This fits better if the driver has the liberty has to use 
single buffers.

2) I'll add a sense_size field to give the size of the sense buffer, 
since now it won't necessarily be in a separate buffer (which gave this 
knowledge to the .  The sense_len will tell the driver how many bytes 
were actually written (usually 0).

>>>>      Remaining fields are filled in by the device.  The sense_len field
>>>>      indicates the number of bytes actually written to the sense buffer,
>>>>      while the residual field indicates the residual size, calculated as
>>>>      data_length - number_of_transferred_bytes.
>>>
>>> Again virtio gives you total number of written bytes in the used len
>>> field.  So just one of these fields will be enough.
>>
>> The two fields give completely different information (sense vs. real
>> data), and the math has to be done anyway in either the driver or
>> the device.  The device is going to be written just once and
>> actually it already has the separate information, so I put it in the
>> struct and spared some annoyance to driver writers.
>
> Yes but this way you get duplicate information which means
> it can get out of sync. Go figure who's right then ...

But is it duplicate actually?  The write-only buffer(s) start with 
datain_size bytes of data, and sense_size bytes of sense.  If everything 
is placed in a single buffer, the used len field cannot capture how many 
bytes were written in the data and how many were written in the sense. 
In addition, if everything is placed in a single buffer, the device will 
have to report that it has written the full buffer because other parts 
of the response come after datain and sense.

So, if I am going to give this liberty with buffers to the driver, I 
_have_ to keep the size information.  Otherwise, I agree that it is 
redundant and I will remove it.  What poison do you prefer?

>>> [...] It looks like there's a finite number of possible events.
>>
>> If you keep opening and closing the tray from the guest, you could
>> fire a possibly unbounded number of events.
>
> In that case only the last one is really interesting.

That's true.  But events are rare enough that you should not have any 
problems feeding enough buffers to the device, and even in that case 
there is a way out in case of dropped events (see below).  Assuming I am 
going to implement the device, requiring the device to queue events 
seems a bit masochistic. :)

>>> If this mechanism is unreliable, how is it useful?
>>
>> Events alone are unreliable, but the combination of events+sense is
>> reliable.  And events+sense are still useful because:
>>
>> 1) sense codes only provide information when the driver next
>> accesses the unit or, at best, the target.  Until then, the driver
>> has no clue that the event happened.  Events can be reported at the
>> time they happen.  This is important for example when the host
>> requests a clean hot-unplug of a disk: if the disk is idle in the
>> guest, the driver may never see the event and acknowledge the
>> hot-unplug!
>
> right. But if you then drop this because you don't have a buffer,
> you get the same problem.

First of all, it should be clear that while lack of buffers may happen, 
it is expected to be an exceptional case.  If the driver keeps on not 
providing buffers, so goes life.  PCI hot-unplug requires a cooperating 
operating system too, after all.

So, it may happen that the driver misses an event because it doesn't 
provide a buffer at the right time.  But if the driver works correctly, 
it will provide event buffers regularly to the device, and the device 
will tell the driver that events were dropped through the 
VIRTIO_SCSI_T_EVENTS_MISSED flag.  The driver will then ask the SCSI 
subsystem to rescan the bus, and will see the unit attention condition 
during the rescan.

I guess this answered your next questions too:

>> 2) for this reason, unit attention has no way to signal events on a
>> target that is unknown to the driver (because it has just been
>> hotplugged).
>>
>> Events and sense codes together are reliable because the driver is
>> aware of dropped events.
>
> It is?  How is it notified of dropped events?
>
>> - No event
>>
>>      #define VIRTIO_SCSI_T_NO_EVENT         0
>>
>>      This event is fired in the following cases:
>>
>>      1) When the device detects in the eventq a buffer that is shorter
>>      than what is indicated in the configuration field, it will use
>>      it immediately and put this dummy value in the event field.
>>      A well-written driver will never observe this situation.
>>
>>      2) When events are dropped, the device may signal this event as
>>      soon as the drivers makes a buffer available, in order to request
>>      action from the driver.  In this case, of course, this event will
>>      be reported with the VIRTIO_SCSI_T_EVENTS_MISSED flag.
>
> For 2, you don't have a buffer - so how is this event reported?

It is reported _eventually_: "as soon as the driver makes a buffer 
available".  After it is reported, a bus rescan synchronizes the device 
and the driver.

Paolo

  reply	other threads:[~2011-06-01 13:31 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-20  8:21 [Qemu-devel] virtio scsi host draft specification, v2 Paolo Bonzini
2011-05-28 19:33 ` Stefan Hajnoczi
2011-05-30  9:22   ` Paolo Bonzini
2011-06-01  4:44     ` Stefan Hajnoczi
2011-06-01  8:49 ` Michael S. Tsirkin
2011-06-01 11:59   ` Paolo Bonzini
2011-06-01 12:51     ` Michael S. Tsirkin
2011-06-01 13:31       ` Paolo Bonzini [this message]
2011-06-01 14:36         ` Michael S. Tsirkin
2011-06-01 14:51           ` Avi Kivity
2011-06-02 10:42             ` Michael S. Tsirkin
2011-06-02 11:42               ` Avi Kivity
2011-06-02 11:56                 ` Michael S. Tsirkin
2011-06-02 12:18                   ` Avi Kivity
2011-06-01 15:59           ` Paolo Bonzini
2011-06-02 11:41             ` Michael S. Tsirkin
2011-06-02 12:02               ` Paolo Bonzini
2011-06-02 12:54                 ` Michael S. Tsirkin
2011-06-01 13:46 ` Avi Kivity
2011-06-01 16:25   ` Paolo Bonzini
2011-06-01 16:29     ` Avi Kivity

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DE63F3E.6040805@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).