Discussion of the VIRTIO specification
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: Frank Yang <lfy@google.com>,
	virtio-comment@lists.oasis-open.org,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Halil Pasic <pasic@linux.ibm.com>
Subject: Re: [virtio-comment] [PATCH 1/3] shared memory: Define shared memory regions
Date: Fri, 15 Feb 2019 11:19:09 +0000
Message-ID: <20190215111908.GC2630@work-vm>
In-Reply-To: <20190215120718.7c7e09cc.cohuck@redhat.com>

* Cornelia Huck (cohuck@redhat.com) wrote:
> On Thu, 14 Feb 2019 09:43:10 -0800
> Frank Yang <lfy@google.com> wrote:
> 
> > On Thu, Feb 14, 2019 at 8:37 AM Dr. David Alan Gilbert <dgilbert@redhat.com>
> > wrote:
> > 
> > > * Cornelia Huck (cohuck@redhat.com) wrote:  
> > > > On Wed, 13 Feb 2019 18:37:56 +0000
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >  
> > > > > * Cornelia Huck (cohuck@redhat.com) wrote:  
> > > > > > On Wed, 16 Jan 2019 20:06:25 +0000
> > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > > >  
> > > > > > > So these are all moving this 1/3 forward - has anyone got
> > > > > > > comments on the transport specific implementations?
> > > > > >
> > > > > > No comment on pci or mmio, but I've hacked something together for
> > > > > > ccw. Basically, one sense-type ccw for discovery and a control-type
> > > > > > ccw for activation of the regions (no idea if we really need the
> > > > > > latter), both available with ccw revision 3.
> > > > > >
> > > > > > No idea whether this will work this way, though...
> > > > >
> > > > > That sounds (from a shm perspective) reasonable; can I ask why the
> > > > > 'activate' is needed?  
> > > >
> > > > The activate interface is actually what I'm most unsure about; maybe
> > > > Halil can chime in.
> > > >
> > > > My basic concern is that we don't have any idea how the guest will use
> > > > the available memory. If the shared memory areas are supposed to be
> > > > mapped into an inconvenient place, the activate interface gives the
> > > > guest a chance to clear up that area before the host starts writing to
> > > > it.  
> > >
> > > I'm expecting the host to map it into an area of GPA that is out of the
> > > way - it doesn't overlap with RAM.
> 
> My issue here is that I'm not sure how to model something like that on
> s390...
> 
> > > Given that, I'm not sure why the guest would have to do any 'clear up' -
> > > it probably wants to make a virtual mapping somewhere, but again that's
> > > up to the guest to do when it feels like it.
> > >
> > >  
> > This is what we do with Vulkan as well.
> > 
> > 
> > > > I'm not really enthusiastic about that interface... for one, I'm not
> > > > sure how this plays out at the device type level, which should not
> > > > really concern itself with transport-specific handling.  
> > >
> > > I'd expect the host side code to give an area of memory to the transport
> > > and tell it to map it somewhere (in the QEMU terminology a MemoryRegion
> > > I think).
> 
> My main issue is the 'somewhere'.
> 
> > >  
> > 
> > I wonder if this could help: the way we're running Vulkan at the moment,
> > what we do is add the concept of a MemoryRegion with no actual backing:
> > 
> > https://android-review.googlesource.com/q/topic:%22qemu-user-controlled-hv-mappings%22+(status:open%20OR%20status:merged)
> > 
> > and it would be connected to the entire PCI address space on the shared
> > memory address space realization. So it's kind of like a sparse or deferred
> > MemoryRegion.
> > 
> > When the guest actually wants to map a subregion associated with the host
> > memory,
> > on the host side, we can call the hypervisor to map the region, based on
> > giving the device implementation the functions KVM_SET_USER_MEMORY_REGION
> > and analogs.
> > 
> > This has the advantage of a smaller contact area between shm and qemu,
> > where the device level stuff can operate at a separate layer from
> > MemoryRegions which is more transport level.
> 
> That sounds like an interesting concept, but I'm not quite sure how it
> would help with my problem. Read on for more explanation below...
> 
> > 
> > 
> > > Similarly in the guest, I'm expecting the driver for the device to
> > > ask for a pointer to a region with a particular ID and that goes
> > > down to the transport code.
> > >
> > > > > Another option would be to map these into a special memory area that
> > > > the guest won't use for its normal operation... the original s390
> > > > (non-ccw) virtio transport mapped everything into two special pages
> > > > above the guest memory, but that was quite painful, and I don't think
> > > > we want to go down that road again.  
> > >
> > > Can you explain why?
> 
> The background here is that s390 traditionally does not have any
> concept of memory-mapped I/O. IOW, you don't just write to or read from
> a special memory area; instead, I/O operations use special instructions.
> 
> The mechanism I'm trying to extend here is channel I/O: the driver
> builds a channel program with commands that point to guest memory areas
> and hands it to the channel subsystem (which means, in our case, the
> host) via a special instruction. The channel subsystem and the device
> (the host, in our case) translate the memory addresses and execute the
> commands. The one place where we write shared memory directly in the
> virtio case is the virtqueues -- which are allocated in guest memory,
> so the guest decides which memory addresses are special. Accessing the
> config space of a virtio device via the ccw transport does not
> read/write a memory location directly, but instead uses a channel
> program that performs the read/write.
> 
> For pci, the memory accesses are mapped to special instructions:
> reading or writing the config space of a pci device does not perform
> reads or writes of a memory location, either; the driver uses special
> instructions to access the config space (which are also
> interpreted/emulated by QEMU, for example.)
> 
> The old s390 (pre-virtio-ccw) virtio transport had to rely on the
> knowledge that there were two pages containing the virtqueues etc.
> right above the normal memory (probed by checking whether accessing
> that memory gave an exception or not). The main problems were that this
> was inflexible (the guest had no easy way to find out how many
> 'special' pages were present, other than trying to access them), and
> that it was different from whatever other mechanisms are common on s390.
> 
> We might be able to come up with another scheme, but I wouldn't hold my
> breath. Would be great if someone else with s390 knowledge could chime
> in here.

What I'm missing here is why the behaviour of the s390's traditional channel program
matters to the design of an entirely emulated device.

As long as s390 allows:
  a) The host to map a region of host virtual address space (HVA) into
     the guest at an arbitrary guest physical address (GPA)
  b) The host not to tell the guest that (a) is RAM
  c) The host to find a non-RAM GPA for (a)
  d) The guest to set up a page table entry pointing to (c)
  e) The guest to discover (c) via the scheme you described

Then that's all that's needed - and I'm not seeing what is different on
s390 about a-d from any other architecture.
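
To make the host side of that list concrete, here is a rough model in
Python. It is only a sketch under assumptions, not QEMU or kernel code:
Range, place_shm, ShmTransport and the 4 KiB alignment are all
hypothetical names and values chosen for illustration. It covers picking
a GPA window that does not overlap RAM (a-c) and looking a region up by
ID (e); the guest page table step (d) is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PAGE = 4096  # assumed alignment; a real transport may need something else


@dataclass(frozen=True)
class Range:
    start: int  # first guest physical address
    size: int   # length in bytes

    @property
    def end(self) -> int:
        return self.start + self.size

    def overlaps(self, other: "Range") -> bool:
        return self.start < other.end and other.start < self.end


def align_up(value: int, alignment: int) -> int:
    return (value + alignment - 1) // alignment * alignment


def place_shm(ram: List[Range], used: List[Range], size: int) -> Range:
    """Pick a GPA above all RAM and all earlier placements."""
    taken = ram + used
    top = max((r.end for r in taken), default=0)
    region = Range(align_up(top, PAGE), align_up(size, PAGE))
    # Sanity check: the window must never be reported or used as RAM.
    if any(region.overlaps(r) for r in taken):
        raise ValueError("shared memory window overlaps an existing range")
    return region


class ShmTransport:
    """Hypothetical transport-side table mapping a region ID to its GPA."""

    def __init__(self, ram: List[Range]) -> None:
        self.ram = list(ram)
        self.regions: Dict[int, Range] = {}

    def add_region(self, shmid: int, size: int) -> Range:
        region = place_shm(self.ram, list(self.regions.values()), size)
        self.regions[shmid] = region
        return region

    def discover(self, shmid: int) -> Optional[Range]:
        # What a guest driver would learn: base GPA and length, or None.
        return self.regions.get(shmid)


# Example: 2 GiB of guest RAM, then a 1 GiB window for region ID 0.
transport = ShmTransport([Range(0, 2 << 30)])
window = transport.add_region(0, 1 << 30)
```

Nothing in this placement logic depends on how the guest learns the
result, which is why I'd expect only the discovery step to differ between
transports.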

Dave



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


