From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Cornelia Huck <cohuck@redhat.com>
Cc: Frank Yang <lfy@google.com>,
virtio-comment@lists.oasis-open.org,
Stefan Hajnoczi <stefanha@redhat.com>,
Halil Pasic <pasic@linux.ibm.com>
Subject: Re: [virtio-comment] [PATCH 1/3] shared memory: Define shared memory regions
Date: Fri, 15 Feb 2019 11:19:09 +0000 [thread overview]
Message-ID: <20190215111908.GC2630@work-vm> (raw)
In-Reply-To: <20190215120718.7c7e09cc.cohuck@redhat.com>
* Cornelia Huck (cohuck@redhat.com) wrote:
> On Thu, 14 Feb 2019 09:43:10 -0800
> Frank Yang <lfy@google.com> wrote:
>
> > On Thu, Feb 14, 2019 at 8:37 AM Dr. David Alan Gilbert <dgilbert@redhat.com>
> > wrote:
> >
> > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > On Wed, 13 Feb 2019 18:37:56 +0000
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >
> > > > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > > > On Wed, 16 Jan 2019 20:06:25 +0000
> > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > > >
> > > > > > > So these are all moving this 1/3 forward - has anyone got comments
> > > on
> > > > > > > the transport specific implementations?
> > > > > >
> > > > > > No comment on pci or mmio, but I've hacked something together for
> > > ccw.
> > > > > > Basically, one sense-type ccw for discovery and a control-type ccw
> > > for
> > > > > > activation of the regions (no idea if we really need the latter),
> > > both
> > > > > > available with ccw revision 3.
> > > > > >
> > > > > > No idea whether this will work this way, though...
> > > > >
> > > > > That sounds (from a shm perspective) reasonable; can I ask why the
> > > > > 'activate' is needed?
> > > >
> > > > The activate interface is actually what I'm most unsure about; maybe
> > > > Halil can chime in.
> > > >
> > > > My basic concern is that we don't have any idea how the guest will use
> > > > the available memory. If the shared memory areas are supposed to be
> > > > mapped into an inconvenient place, the activate interface gives the
> > > > guest a chance to clear up that area before the host starts writing to
> > > > it.
> > >
> > > I'm expecting the host to map it into an area of GPA that is out of the
> > > way - it doesn't overlap with RAM.
>
> My issue here is that I'm not sure how to model something like that on
> s390...
>
> > > Given that, I'm not sure why the guest would have to do any 'clear up' -
> > > it probably wants to make a virtual mapping somewhere, but again that's
> > > upto the guest to do when it feels like it.
> > >
> > >
> > This is what we do with Vulkan as well.
> >
> >
> > > > I'm not really enthusiastic about that interface... for one, I'm not
> > > > sure how this plays out at the device type level, which should not
> > > > really concern itself with transport-specific handling.
> > >
> > > I'd expect the host side code to give an area of memory to the transport
> > > and tell it to map it somewhere (in the QEMU terminology a MemoryRegion
> > > I think).
>
> My main issue is the 'somewhere'.
>
> > >
> >
> > I wonder if this could help: the way we're running Vulkan at the moment,
> > what we do is add a the concept of a MemoryRegion with no actual backing:
> >
> > https://android-review.googlesource.com/q/topic:%22qemu-user-controlled-hv-mappings%22+(status:open%20OR%20status:merged)
> >
> > and it would be connected to the entire PCI address space on the shared
> > memory address space realization. So it's kind of like a sparse or deferred
> > MemoryRegion.
> >
> > When the guest actually wants to map a subregion associated with the host
> > memory,
> > on the host side, we can call the hypervisor to map the region, based on
> > giving the device implementation the functions KVM_SET_USER_MEMORY_REGION
> > and analogs.
> >
> > This has the advantage of a smaller contact area between shm and qemu,
> > where the device level stuff can operate at a separate layer from
> > MemoryRegions which is more transport level.
>
> That sounds like an interesting concept, but I'm not quite sure how it
> would help with my problem. Read on for more explanation below...
>
> >
> >
> > > Similarly in the guest, I'm expecting the driver for the device to
> > > ask for a pointer to a region with a particular ID and that goes
> > > down to the transport code.
> > >
> > > Another option would be to map these into a special memory area that
> > > > the guest won't use for its normal operation... the original s390
> > > > (non-ccw) virtio transport mapped everything into two special pages
> > > > above the guest memory, but that was quite painful, and I don't think
> > > > we want to go down that road again.
> > >
> > > Can you explain why?
>
> The background here is that s390 traditionally does not have any
> concept of memory-mapped I/O. IOW, you don't just write to or read from
> a special memory area; instead, I/O operations use special instructions.
>
> The mechanism I'm trying to extend here is channel I/O: the driver
> builds a channel program with commands that point to guest memory areas
> and hands it to the channel subsystem (which means, in our case, the
> host) via a special instruction. The channel subsystem and the device
> (the host, in our case) translate the memory addresses and execute the
> commands. The one place where we write shared memory directly in the
> virtio case are the virtqueues -- which are allocated in guest memory,
> so the guest decides which memory addresses are special. Accessing the
> config space of a virtio device via the ccw transport does not
> read/write a memory location directly, but instead uses a channel
> program that performs the read/write.
>
> For pci, the memory accesses are mapped to special instructions:
> reading or writing the config space of a pci device does not perform
> reads or writes of a memory location, either; the driver uses special
> instructions to access the config space (which are also
> interpreted/emulated by QEMU, for example.)
>
> The old s390 (pre-virtio-ccw) virtio transport had to rely on the
> knowledge that there were two pages containing the virtqueues etc.
> right above the normal memory (probed by checking whether accessing
> that memory gave an exception or not). The main problems were that this
> was inflexible (the guest had no easy way to find out how many
> 'special' pages were present, other than trying to access them), and
> that it was different from whatever other mechanisms are common on s390.
>
> We might be able to come up with another scheme, but I wouldn't hold my
> breath. Would be great if someone else with s390 knowledge could chime
> in here.
What I'm missing here is why the behaviour of the s390's traditional channel program
matters to the design of an entirely emulated device.
As long as the s390 allows:
a) The host to map a region of HVA into GPA at an arbitrary GPA
address
b) Not tell the guest that (a) is RAM
c) Find a non-RAM GPA for (a)
d) Allow the guest to set up a page table pointing to (c)
e) Discover (c) via the scheme you described
Then that's all that's needed - and I'm not seeing what is different on
s390 about a-d from any other architecture.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
next prev parent reply other threads:[~2019-02-15 11:19 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-11 11:41 [virtio-comment] [PATCH 0/3] Large shared memory regions Dr. David Alan Gilbert (git)
2019-01-11 11:41 ` [virtio-comment] [PATCH 1/3] shared memory: Define " Dr. David Alan Gilbert (git)
2019-01-11 12:15 ` Cornelia Huck
2019-01-11 12:26 ` Dr. David Alan Gilbert
2019-01-15 10:10 ` Cornelia Huck
2019-01-15 11:23 ` Dr. David Alan Gilbert
2019-01-16 10:56 ` Cornelia Huck
2019-01-16 20:06 ` Dr. David Alan Gilbert
2019-02-11 21:52 ` Cornelia Huck
2019-02-13 18:37 ` Dr. David Alan Gilbert
2019-02-14 10:58 ` Cornelia Huck
2019-02-14 16:37 ` Dr. David Alan Gilbert
2019-02-14 17:43 ` Frank Yang
2019-02-15 11:07 ` Cornelia Huck
2019-02-15 11:19 ` Dr. David Alan Gilbert [this message]
2019-02-15 12:31 ` Cornelia Huck
2019-02-18 15:28 ` Halil Pasic
2019-02-15 11:26 ` David Hildenbrand
2019-02-15 12:28 ` Cornelia Huck
2019-02-15 12:33 ` David Hildenbrand
2019-02-15 12:37 ` Cornelia Huck
2019-02-15 12:59 ` David Hildenbrand
2019-02-15 13:50 ` Dr. David Alan Gilbert
2019-02-15 13:56 ` David Hildenbrand
2019-02-15 14:02 ` Dr. David Alan Gilbert
2019-02-15 14:13 ` David Hildenbrand
2019-02-15 15:14 ` Dr. David Alan Gilbert
2019-02-15 21:42 ` Halil Pasic
2019-02-15 22:08 ` David Hildenbrand
2019-02-15 12:51 ` Halil Pasic
2019-02-15 13:33 ` Cornelia Huck
2019-01-23 15:12 ` Michael S. Tsirkin
2019-01-11 15:29 ` Halil Pasic
2019-01-11 16:07 ` Dr. David Alan Gilbert
2019-01-11 17:57 ` Halil Pasic
2019-01-15 9:33 ` Cornelia Huck
2019-02-13 2:25 ` [virtio-comment] " Stefan Hajnoczi
2019-02-13 10:44 ` Dr. David Alan Gilbert
2019-02-14 3:43 ` Stefan Hajnoczi
2019-01-11 11:41 ` [virtio-comment] [PATCH 2/3] shared memory: Define PCI capability Dr. David Alan Gilbert (git)
2019-02-13 2:30 ` [virtio-comment] " Stefan Hajnoczi
2019-01-11 11:42 ` [virtio-comment] [PATCH 3/3] shared memory: Define mmio registers Dr. David Alan Gilbert (git)
2019-02-13 2:33 ` [virtio-comment] " Stefan Hajnoczi
2019-02-13 16:52 ` Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190215111908.GC2630@work-vm \
--to=dgilbert@redhat.com \
--cc=cohuck@redhat.com \
--cc=lfy@google.com \
--cc=pasic@linux.ibm.com \
--cc=stefanha@redhat.com \
--cc=virtio-comment@lists.oasis-open.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox