All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rusty Russell <rusty@rustcorp.com.au>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Subject: Re: Using PCI config space to indicate config location
Date: Fri, 12 Oct 2012 08:59:36 +1030	[thread overview]
Message-ID: <87391k7icv.fsf@rustcorp.com.au> (raw)
In-Reply-To: <20121011102333.GA5552@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Thu, Oct 11, 2012 at 11:48:22AM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> > On Mon, Oct 08, 2012 at 12:51:25PM +1030, Rusty Russell wrote:
>> >> Note before anyone gets confused; we were talking about using the PCI
>> >> config space to indicate what BAR(s) the virtio stuff is in.  An
>> >> alternative would be to simply specify a new layout format in BAR1.
>> >
>> > One problem we are still left with is this: device specific
>> > config accesses are still non atomic.
>> > This is a problem for multibyte fields such as MAC address
>> > where MAC could change while we are accessing it.
>> 
>> It's also a problem for related fields, eg. console width and height, or
>> disk geometry.
>> 
>> > I was thinking about some backwards compatible way to solve this, but if
>> > we are willing to break compatiblity or use some mode switch, how about
>> > we give up on virtio config space completely, and do everything besides
>> > IO and ISR through guest memory?
>> 
>> I think there's still a benefit in the simple publishing of information:
>> I don't really want to add a control queue for the console.
>
> One reason I thought using a vq is handy is because this would
> let us get by with a single MSI vector. Currently we need at least 2
> for config changes + a shared one for vqs.
> But I don't insist.

Hmmm, that is true.

>> Here's a table from a quick audit:
>> 
>> Driver          Config       Device changes    Driver writes... after init?
>> net             Y            Y                 N                N
>> block           Y            Y                 Y                Y
>> console         Y            Y                 N                N
>> rng             N            N                 N                N
>> balloon         Y            Y                 Y                Y
>> scsi            Y            N                 Y                N
>> 9p              Y            N                 N                N
>> 
>> For config space reads, I suggest the driver publish a generation count.
>
> You mean device?

Yes, sorry.

>> For writes, the standard seems to be a commit latch.  We could abuse the
>> generation count for this: the driver writes to it to commit config
>> changes.
>
> I think this will work. There are a couple of things that bother me:
>
> This assumes read accesses have no side effects, and these are sometimes handy.
> Also the semantics for write aren't very clear to me.
> I guess device must buffer data until generation count write?
> This assumes the device has a buffer to store writes,
> and it must track each byte written. I kind of dislike this
> tracking of accessed bytes. Also, device would need to resolve conflicts
> if any in some device specific way.

It should be trivial to implement: you keep a scratch copy of the config
space, and copy it to the master copy when they hit the latch.

Implementation of this will show whether I've missed anything here, I
think.

> Maybe it's a good idea to make the buffer accesses explicit instead?
> IOW require driver to declare intent to read/request write of a specific
> config range.  We could for example do it like this:
> 	__le32 config_offset;
> 	__le32 config_len;
> 	__u8 config_cmd; /* write-only: 0 - read, 1 - write
> 			    config_len bytes at config_offset
> 			    from/to config space to/from device memory */
> But maybe this is over-engineering?

Seems overengineering since the config space is quite small in practice.

>> PS.  Let's make all the virtio-device config LE, too...
>
> We'll need some new API for devices then: currently we pass bytes.

Yes, but driver authors expect that anyway.  We can retro-define
virito-mmio to be LE (since all current users are), so this is
v. tempting.

Cheers,
Rusty.

WARNING: multiple messages have this Message-ID (diff)
From: Rusty Russell <rusty@rustcorp.com.au>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Subject: Re: [Qemu-devel] Using PCI config space to indicate config location
Date: Fri, 12 Oct 2012 08:59:36 +1030	[thread overview]
Message-ID: <87391k7icv.fsf@rustcorp.com.au> (raw)
In-Reply-To: <20121011102333.GA5552@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Thu, Oct 11, 2012 at 11:48:22AM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> > On Mon, Oct 08, 2012 at 12:51:25PM +1030, Rusty Russell wrote:
>> >> Note before anyone gets confused; we were talking about using the PCI
>> >> config space to indicate what BAR(s) the virtio stuff is in.  An
>> >> alternative would be to simply specify a new layout format in BAR1.
>> >
>> > One problem we are still left with is this: device specific
>> > config accesses are still non atomic.
>> > This is a problem for multibyte fields such as MAC address
>> > where MAC could change while we are accessing it.
>> 
>> It's also a problem for related fields, eg. console width and height, or
>> disk geometry.
>> 
>> > I was thinking about some backwards compatible way to solve this, but if
>> > we are willing to break compatiblity or use some mode switch, how about
>> > we give up on virtio config space completely, and do everything besides
>> > IO and ISR through guest memory?
>> 
>> I think there's still a benefit in the simple publishing of information:
>> I don't really want to add a control queue for the console.
>
> One reason I thought using a vq is handy is because this would
> let us get by with a single MSI vector. Currently we need at least 2
> for config changes + a shared one for vqs.
> But I don't insist.

Hmmm, that is true.

>> Here's a table from a quick audit:
>> 
>> Driver          Config       Device changes    Driver writes... after init?
>> net             Y            Y                 N                N
>> block           Y            Y                 Y                Y
>> console         Y            Y                 N                N
>> rng             N            N                 N                N
>> balloon         Y            Y                 Y                Y
>> scsi            Y            N                 Y                N
>> 9p              Y            N                 N                N
>> 
>> For config space reads, I suggest the driver publish a generation count.
>
> You mean device?

Yes, sorry.

>> For writes, the standard seems to be a commit latch.  We could abuse the
>> generation count for this: the driver writes to it to commit config
>> changes.
>
> I think this will work. There are a couple of things that bother me:
>
> This assumes read accesses have no side effects, and these are sometimes handy.
> Also the semantics for write aren't very clear to me.
> I guess device must buffer data until generation count write?
> This assumes the device has a buffer to store writes,
> and it must track each byte written. I kind of dislike this
> tracking of accessed bytes. Also, device would need to resolve conflicts
> if any in some device specific way.

It should be trivial to implement: you keep a scratch copy of the config
space, and copy it to the master copy when they hit the latch.

Implementation of this will show whether I've missed anything here, I
think.

> Maybe it's a good idea to make the buffer accesses explicit instead?
> IOW require driver to declare intent to read/request write of a specific
> config range.  We could for example do it like this:
> 	__le32 config_offset;
> 	__le32 config_len;
> 	__u8 config_cmd; /* write-only: 0 - read, 1 - write
> 			    config_len bytes at config_offset
> 			    from/to config space to/from device memory */
> But maybe this is over-engineering?

Seems overengineering since the config space is quite small in practice.

>> PS.  Let's make all the virtio-device config LE, too...
>
> We'll need some new API for devices then: currently we pass bytes.

Yes, but driver authors expect that anyway.  We can retro-define
virito-mmio to be LE (since all current users are), so this is
v. tempting.

Cheers,
Rusty.

  reply	other threads:[~2012-10-11 22:29 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-27  0:29 Proposal for virtio standardization Rusty Russell
2012-09-27  0:29 ` [Qemu-devel] " Rusty Russell
2012-09-27  0:29 ` Rusty Russell
2012-10-04 18:49 ` [Qemu-devel] " Anthony Liguori
2012-10-04 18:49 ` Anthony Liguori
2012-10-04 18:49   ` Anthony Liguori
2012-10-08  2:21   ` Using PCI config space to indicate config location Rusty Russell
2012-10-08  2:21   ` Rusty Russell
2012-10-08  2:21     ` [Qemu-devel] " Rusty Russell
2012-10-08 13:58     ` Anthony Liguori
2012-10-08 13:58     ` Anthony Liguori
2012-10-08 13:58       ` Anthony Liguori
2012-10-08 14:58       ` Gerd Hoffmann
2012-10-08 14:58         ` Gerd Hoffmann
2012-10-08 15:09         ` Anthony Liguori
2012-10-08 15:09           ` Anthony Liguori
2012-10-08 20:13           ` Gerd Hoffmann
2012-10-08 20:13             ` Gerd Hoffmann
2012-10-08 20:55             ` Anthony Liguori
2012-10-08 20:55               ` Anthony Liguori
2012-10-08 23:56               ` Rusty Russell
2012-10-09  1:51                 ` Anthony Liguori
2012-10-09  3:16                   ` Rusty Russell
2012-10-09  3:16                   ` Rusty Russell
2012-10-09  3:16                     ` Rusty Russell
2012-10-09 10:17                     ` Avi Kivity
2012-10-09 10:17                       ` Avi Kivity
2012-10-09 14:03                       ` Anthony Liguori
2012-10-09 14:03                         ` Anthony Liguori
2012-10-09 13:56                     ` Anthony Liguori
2012-10-09 13:56                       ` Anthony Liguori
2012-10-10  3:44                       ` Rusty Russell
2012-10-10  3:44                         ` Rusty Russell
2012-10-10 11:37                         ` Michael S. Tsirkin
2012-10-10 11:37                           ` Michael S. Tsirkin
2012-10-09 13:56                     ` Anthony Liguori
2012-10-09 21:09                     ` Jamie Lokier
2012-10-09 21:09                     ` Jamie Lokier
2012-10-09 21:09                       ` [Qemu-devel] " Jamie Lokier
2012-10-10  3:44                       ` Rusty Russell
2012-10-10  3:44                         ` Rusty Russell
2012-10-11  0:08                         ` Rusty Russell
2012-10-11  0:08                         ` Rusty Russell
2012-10-11  0:08                           ` Rusty Russell
2012-10-09  6:33                   ` Gerd Hoffmann
2012-10-09  6:33                     ` Gerd Hoffmann
2012-10-09 15:26                     ` Anthony Liguori
2012-10-09 15:26                       ` Anthony Liguori
2012-10-09 20:24                       ` Gerd Hoffmann
2012-10-09 20:24                         ` Gerd Hoffmann
2012-10-10  2:54                         ` Rusty Russell
2012-10-10  2:54                           ` Rusty Russell
2012-10-10 13:36                           ` Anthony Liguori
2012-10-10 13:41                             ` Michael S. Tsirkin
2012-10-10 13:41                               ` Michael S. Tsirkin
2012-10-11  0:43                               ` Rusty Russell
2012-10-11  0:43                               ` Rusty Russell
2012-10-11  0:43                                 ` [Qemu-devel] " Rusty Russell
2012-10-10  2:54                         ` Rusty Russell
2012-10-09 15:26                     ` Anthony Liguori
2012-10-10  8:34                 ` Michael S. Tsirkin
2012-10-10  8:34                   ` Michael S. Tsirkin
2012-10-10  8:30     ` Michael S. Tsirkin
2012-10-10  8:30       ` [Qemu-devel] " Michael S. Tsirkin
2012-10-11  1:18       ` Rusty Russell
2012-10-11  1:18         ` [Qemu-devel] " Rusty Russell
2012-10-11 10:23         ` Michael S. Tsirkin
2012-10-11 10:23           ` [Qemu-devel] " Michael S. Tsirkin
2012-10-11 22:29           ` Rusty Russell [this message]
2012-10-11 22:29             ` Rusty Russell
2012-10-12  9:33             ` Michael S. Tsirkin
2012-10-12  9:33               ` [Qemu-devel] " Michael S. Tsirkin
2012-10-12  9:51               ` Rusty Russell
2012-10-12  9:51                 ` [Qemu-devel] " Rusty Russell
2012-10-12 10:02                 ` Michael S. Tsirkin
2012-10-12 10:02                   ` [Qemu-devel] " Michael S. Tsirkin
2012-10-16 13:15                   ` Rusty Russell
2012-10-16 13:15                   ` Rusty Russell
2012-10-16 13:15                     ` [Qemu-devel] " Rusty Russell
2012-10-16 13:30                     ` Michael S. Tsirkin
2012-10-16 13:30                       ` [Qemu-devel] " Michael S. Tsirkin
2012-10-16 13:52                       ` Rusty Russell
2012-10-16 13:52                         ` [Qemu-devel] " Rusty Russell
2012-10-16 13:52                       ` Rusty Russell
2012-10-11 22:29           ` Rusty Russell
2012-10-09 14:02 ` Proposal for virtio standardization Cornelia Huck
2012-10-09 14:02   ` [Qemu-devel] " Cornelia Huck
2012-10-10  3:46   ` Rusty Russell
2012-10-10  3:46   ` Rusty Russell
2012-10-10  3:46     ` [Qemu-devel] " Rusty Russell
2012-10-09 14:02 ` Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87391k7icv.fsf@rustcorp.com.au \
    --to=rusty@rustcorp.com.au \
    --cc=aliguori@us.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.