* VIRTIO - compatibility with different virtualization solutions
@ 2014-02-17 13:23 Daniel Kiper
2014-02-19 0:26 ` Rusty Russell
[not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
0 siblings, 2 replies; 16+ messages in thread
From: Daniel Kiper @ 2014-02-17 13:23 UTC (permalink / raw)
To: xen-devel, virtio
Cc: wei.liu2, ian.campbell, rusty, stefano.stabellini, ian, anthony,
sasha.levin
Hi,
Below you will find a summary of work regarding VIRTIO compatibility with
different virtualization solutions. It was done mainly from the Xen point of view,
but the results are quite generic and can be applied to a wide spectrum
of virtualization platforms.
VIRTIO devices were designed as a set of generic devices for virtual environments.
They work without major issues on many existing virtualization solutions.
However, there is one VIRTIO specification and implementation issue which could hinder
VIRTIO device/driver implementation on new or even existing platforms (e.g. Xen).
The problem is that the specification uses guest physical addresses as pointers
to virtqueues, buffers and other structures. This means that the VIRTIO device
controller (hypervisor/host or a special device domain/process) knows the guest
physical memory layout and simply maps the required regions as needed. However,
this crude mapping mechanism usually assumes that a guest does not impose any
access restrictions on its memory. That situation is not desirable, because
guests often want to restrict access to their memory and grant access only to
the specific regions needed for device operation. Fortunately, many hypervisors
provide more or less advanced memory sharing mechanisms with the relevant access
control built in. However, those mechanisms do not use a guest physical address
as the address/reference for a shared memory region; they use a unique identifier,
which could be called a "handle" here (or any other term which clearly describes
the idea). This means that the specification should use the term "handle" instead
of guest physical address (in a particular case the handle can simply be a guest
physical address). This way any virtualization environment could choose the best
way to access guest memory without compromising security where needed.
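To make this more concrete, below is a minimal sketch of how a handle-based
descriptor and a per-platform translation layer could look. The names and the
split into operations are purely illustrative and are not taken from the
current specification:

#include <linux/types.h>

/*
 * Sketch only: the ring descriptor keeps its current layout, but the
 * address field is reinterpreted as an opaque, platform-defined handle.
 */
struct vring_desc_handle {
	__u64 handle;	/* guest physical address, grant reference, ... */
	__u32 len;
	__u16 flags;
	__u16 next;
};

/*
 * Per-platform translation from a guest buffer to the handle placed on
 * the ring.  On most platforms to_handle() would simply return the guest
 * physical address; on Xen it could return a grant reference instead.
 */
struct virtio_handle_ops {
	__u64 (*to_handle)(void *buf, size_t len, bool device_writes);
	void  (*release)(__u64 handle, size_t len);
};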
The above-mentioned changes to the specification require some changes in the
VIRTIO device and driver implementations.
From the perspective of the Linux VIRTIO drivers, the transition from the old
scheme to the new one should not be very difficult. The Linux kernel itself
provides the DMA API, which should ease the work on the drivers; they should use
this API instead of bypassing it. Additionally, new IOMMU drivers should be
created. Those IOMMU drivers would expose handles to VIRTIO and hide
hypervisor-specific details, so that VIRTIO would not depend so strongly on the
behavior of a specific hypervisor. Another part of VIRTIO is the device side,
which usually does not have access to the DMA API available in the Linux kernel,
and this may present some challenges in the transition to the new implementation.
Similar problems may also appear when implementing drivers on systems which do
not have an equivalent of the Linux kernel DMA API. However, even in that
situation it should not be a big issue, and it should not prevent the transition
to handles.
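As a rough illustration of the driver-side change (a sketch only, assuming the
transport exposes a suitable struct device for DMA; the helper name is made
up), a driver would obtain the address it puts on the ring through the DMA API
instead of computing a guest physical address directly:

#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* Sketch: map one buffer for the device via the DMA API. */
static int virtio_map_buf(struct device *dma_dev, void *buf, size_t len,
			  dma_addr_t *addr_out)
{
	dma_addr_t addr = dma_map_single(dma_dev, buf, len,
					 DMA_BIDIRECTIONAL);

	if (dma_mapping_error(dma_dev, addr))
		return -ENOMEM;

	/*
	 * On x86/KVM this is effectively the guest physical address; a
	 * Xen-specific DMA/IOMMU backend could hand back a grant-based
	 * handle here instead, without the driver noticing.
	 */
	*addr_out = addr;
	return 0;
}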
The author does not know FreeBSD and Windows well enough to say how to retool
VIRTIO drivers there to use some mechanism for obtaining hypervisor handles,
but surely there must be some reasonably easy API to plumb this through.
As can be seen from the description above, the current VIRTIO specification
could create implementation challenges in some virtual environments. However,
this issue could be solved fairly easily by migrating from guest physical
addresses, which are used as pointers to virtqueues, buffers and other
structures, to handles. This change should not be difficult to implement.
Additionally, it would make VIRTIO less tightly linked to a specific virtual
environment. This in turn helps fulfill the "Standard" assumption made in the
VIRTIO spec introduction (Virtio makes no assumptions about the environment in
which it operates, beyond supporting the bus attaching the device. Virtio
devices are implemented over PCI and other buses, and earlier drafts have been
implemented on other buses not included in this spec).
Acknowledgments for comments and suggestions: Wei Liu, Ian Pratt, Konrad Rzeszutek Wilk
Daniel
* Re: VIRTIO - compatibility with different virtualization solutions
2014-02-17 13:23 Daniel Kiper
@ 2014-02-19 0:26 ` Rusty Russell
[not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2014-02-19 0:26 UTC (permalink / raw)
To: Daniel Kiper, xen-devel, virtio-dev
Cc: wei.liu2, ian.campbell, stefano.stabellini, ian, anthony,
sasha.levin
Daniel Kiper <daniel.kiper@oracle.com> writes:
> Hi,
>
> Below you could find a summary of work in regards to VIRTIO compatibility with
> different virtualization solutions. It was done mainly from Xen point of view
> but results are quite generic and can be applied to wide spectrum
> of virtualization platforms.
Hi Daniel,
Sorry for the delayed response, I was pondering... CC changed
to virtio-dev.
From a standard POV: It's possible to abstract out the places where we use
'physical address' for an 'address handle'. It's also possible to define
this per-platform (ie. Xen-PV vs everyone else). This is sane, since
Xen-PV is a distinct platform from x86.
For platforms using EPT, I don't think you want anything but guest
addresses, do you?
From an implementation POV:
On IOMMU, start here for previous Linux discussion:
http://thread.gmane.org/gmane.linux.kernel.virtualization/14410/focus=14650
And this is the real problem. We don't want to use the PCI IOMMU for
PCI devices. So it's not just a matter of using existing Linux APIs.
Cheers,
Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
@ 2014-02-19 4:42 ` Anthony Liguori
2014-02-20 1:31 ` Rusty Russell
[not found] ` <87ha7ubme0.fsf@rustcorp.com.au>
2014-02-19 10:09 ` Ian Campbell
2014-02-19 10:11 ` Ian Campbell
2 siblings, 2 replies; 16+ messages in thread
From: Anthony Liguori @ 2014-02-19 4:42 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper,
Stefano Stabellini, ian, sasha.levin, xen-devel
On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
>> Hi,
>>
>> Below you could find a summary of work in regards to VIRTIO compatibility with
>> different virtualization solutions. It was done mainly from Xen point of view
>> but results are quite generic and can be applied to wide spectrum
>> of virtualization platforms.
>
> Hi Daniel,
>
> Sorry for the delayed response, I was pondering... CC changed
> to virtio-dev.
>
> From a standard POV: It's possible to abstract out the where we use
> 'physical address' for 'address handle'. It's also possible to define
> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
> Xen-PV is a distinct platform from x86.
I'll go even further and say that "address handle" doesn't make sense either.
Just using grant table references is not enough to make virtio work
well under Xen. You really need to use bounce buffers ala persistent
grants.
I think what you ultimately want is virtio using a DMA API (I know
benh has scoffed at this but I don't buy his argument at face value)
and a DMA layer that bounces requests to a pool of persistent grants.
> For platforms using EPT, I don't think you want anything but guest
> addresses, do you?
>
> From an implementation POV:
>
> On IOMMU, start here for previous Linux discussion:
> http://thread.gmane.org/gmane.linux.kernel.virtualization/14410/focus=14650
>
> And this is the real problem. We don't want to use the PCI IOMMU for
> PCI devices. So it's not just a matter of using existing Linux APIs.
Is there any data to back up that claim?
Just because power currently does hypercalls for anything that uses
the PCI IOMMU layer doesn't mean this cannot be changed. It's pretty
hacky that virtio-pci just happens to work well by accident on power
today. Not all architectures have this limitation.
Regards,
Anthony Liguori
> Cheers,
> Rusty.
>
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
2014-02-19 4:42 ` Anthony Liguori
@ 2014-02-19 10:09 ` Ian Campbell
2014-02-20 7:48 ` Rusty Russell
[not found] ` <8761oab4y7.fsf@rustcorp.com.au>
2014-02-19 10:11 ` Ian Campbell
2 siblings, 2 replies; 16+ messages in thread
From: Ian Campbell @ 2014-02-19 10:09 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, wei.liu2, Daniel Kiper, stefano.stabellini, ian,
anthony, sasha.levin, xen-devel
On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> For platforms using EPT, I don't think you want anything but guest
> addresses, do you?
No, the arguments for preventing unfettered access by backends to
frontend RAM apply to EPT as well.
Ian.
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
2014-02-19 4:42 ` Anthony Liguori
2014-02-19 10:09 ` Ian Campbell
@ 2014-02-19 10:11 ` Ian Campbell
2 siblings, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2014-02-19 10:11 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, wei.liu2, Daniel Kiper, stefano.stabellini, ian,
anthony, sasha.levin, xen-devel
On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> Sorry for the delayed response, I was pondering... CC changed
> to virtio-dev.
Which apparently is subscribers only + discard as opposed to moderate,
so my previous post won't show up there.
Ian.
* Re: VIRTIO - compatibility with different virtualization solutions
2014-02-19 4:42 ` Anthony Liguori
@ 2014-02-20 1:31 ` Rusty Russell
[not found] ` <87ha7ubme0.fsf@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2014-02-20 1:31 UTC (permalink / raw)
To: Anthony Liguori
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper,
Stefano Stabellini, ian, sasha.levin, xen-devel
Anthony Liguori <anthony@codemonkey.ws> writes:
> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>> Daniel Kiper <daniel.kiper@oracle.com> writes:
>>> Hi,
>>>
>>> Below you could find a summary of work in regards to VIRTIO compatibility with
>>> different virtualization solutions. It was done mainly from Xen point of view
>>> but results are quite generic and can be applied to wide spectrum
>>> of virtualization platforms.
>>
>> Hi Daniel,
>>
>> Sorry for the delayed response, I was pondering... CC changed
>> to virtio-dev.
>>
>> From a standard POV: It's possible to abstract out the where we use
>> 'physical address' for 'address handle'. It's also possible to define
>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
>> Xen-PV is a distinct platform from x86.
>
> I'll go even further and say that "address handle" doesn't make sense too.
I was trying to come up with a unique term, I wasn't trying to define
semantics :)
There are three debates here now: (1) what should the standard say,
(2) how would Linux implement it, and (3) should we use each platform's PCI
IOMMU.
> Just using grant table references is not enough to make virtio work
> well under Xen. You really need to use bounce buffers ala persistent
> grants.
Wait, if you're using bounce buffers, you didn't make it "work well"!
> I think what you ultimately want is virtio using a DMA API (I know
> benh has scoffed at this but I don't buy his argument at face value)
> and a DMA layer that bounces requests to a pool of persistent grants.
We can have a virtio DMA API, sure. It'd be a noop for non-Xen.
But emulating the programming of an IOMMU seems masochistic. PowerPC
have made it clear they don't want this. And no one else has come up
with a compelling reason to want this: virtio passthrough?
>> For platforms using EPT, I don't think you want anything but guest
>> addresses, do you?
>>
>> From an implementation POV:
>>
>> On IOMMU, start here for previous Linux discussion:
>> http://thread.gmane.org/gmane.linux.kernel.virtualization/14410/focus=14650
>>
>> And this is the real problem. We don't want to use the PCI IOMMU for
>> PCI devices. So it's not just a matter of using existing Linux APIs.
>
> Is there any data to back up that claim?
Yes, for powerpc. Implementer gets to measure, as always. I suspect
that if you emulate an IOMMU on Intel, your performance will suck too.
> Just because power currently does hypercalls for anything that uses
> the PCI IOMMU layer doesn't mean this cannot be changed.
Does someone have an implementation of an IOMMU which doesn't use
hypercalls, or is this theoretical?
> It's pretty
> hacky that virtio-pci just happens to work well by accident on power
> today. Not all architectures have this limitation.
It's a fundamental assumption of virtio that the host can access all of
guest memory. That's paravirt, not a hack.
But tomayto tomatoh aside, it's unclear to me how you'd build an
efficient IOMMU today. And it's unclear what benefit you'd gain. But
the cost for Power is clear.
So if someone wants to do this for PCI, they need to implement it and
benchmark. But this is a little orthogonal to the Xen discussion.
Cheers,
Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions
2014-02-19 10:09 ` Ian Campbell
@ 2014-02-20 7:48 ` Rusty Russell
[not found] ` <8761oab4y7.fsf@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2014-02-20 7:48 UTC (permalink / raw)
To: Ian Campbell
Cc: virtio-dev, wei.liu2, Daniel Kiper, stefano.stabellini, ian,
anthony, sasha.levin, xen-devel
Ian Campbell <Ian.Campbell@citrix.com> writes:
> On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
>> For platforms using EPT, I don't think you want anything but guest
>> addresses, do you?
>
> No, the arguments for preventing unfettered access by backends to
> frontend RAM applies to EPT as well.
I can see how you'd parse my sentence that way, I think, but the two
are orthogonal.
AFAICT your grant-table access restrictions are at page granularity, though
you don't use page-aligned data (eg. in xen-netfront). This level of
access control is possible using the virtio ring too, but no one has
implemented such a thing AFAIK.
Hope that clarifies,
Rusty.
PS. Random aside: I greatly enjoyed your blog post on 'Xen on ARM and
the Device Tree vs. ACPI debate'.
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87ha7ubme0.fsf@rustcorp.com.au>
@ 2014-02-20 12:28 ` Stefano Stabellini
2014-02-20 20:28 ` Daniel Kiper
2014-02-21 2:50 ` Anthony Liguori
2 siblings, 0 replies; 16+ messages in thread
From: Stefano Stabellini @ 2014-02-20 12:28 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper,
Stefano Stabellini, ian, Anthony Liguori, sasha.levin, xen-devel
On Thu, 20 Feb 2014, Rusty Russell wrote:
> It's a fundamental assumption of virtio that the host can access all of
> guest memory.
I take it that by "host" you mean the virtio backends in this context.
Do you think that this fundamental assumption should be sustained going
forward?
I am asking because Xen assumes that the backends are only allowed to
access the memory that the guest decides to share with them.
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87ha7ubme0.fsf@rustcorp.com.au>
2014-02-20 12:28 ` Stefano Stabellini
@ 2014-02-20 20:28 ` Daniel Kiper
2014-02-21 2:50 ` Anthony Liguori
2 siblings, 0 replies; 16+ messages in thread
From: Daniel Kiper @ 2014-02-20 20:28 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini, ian,
Anthony Liguori, sasha.levin, xen-devel
Hey,
On Thu, Feb 20, 2014 at 12:01:19PM +1030, Rusty Russell wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
> > On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> >> Daniel Kiper <daniel.kiper@oracle.com> writes:
> >>> Hi,
> >>>
> >>> Below you could find a summary of work in regards to VIRTIO compatibility with
> >>> different virtualization solutions. It was done mainly from Xen point of view
> >>> but results are quite generic and can be applied to wide spectrum
> >>> of virtualization platforms.
> >>
> >> Hi Daniel,
> >>
> >> Sorry for the delayed response, I was pondering... CC changed
> >> to virtio-dev.
Do not worry. It is not a problem. It is not an easy issue.
> >> From a standard POV: It's possible to abstract out the where we use
> >> 'physical address' for 'address handle'. It's also possible to define
> >> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
> >> Xen-PV is a distinct platform from x86.
> >
> > I'll go even further and say that "address handle" doesn't make sense too.
>
> I was trying to come up with a unique term, I wasn't trying to define
> semantics :)
>
> There are three debates here now: (1) what should the standard say, and
Yep.
> (2) how would Linux implement it,
It seems to me that we should think about other common OSes too.
> (3) should we use each platform's PCI IOMMU.
I do not want to emulate any hardware. It seems to me that we should think about
something which fits best in the VIRTIO environment. The DMA API with relevant
backends looks promising, but I also have some worries about performance.
Additionally, it is Linux-kernel-specific, so maybe we should invent something
more generic which will fit well in other guest OSes too.
[...]
> It's a fundamental assumption of virtio that the host can access all of
> guest memory. That's paravert, not a hack.
Why? What if guests would like to limit access to their memory? I think
that this will happen sooner or later. Additionally, I think that this
assumption is not hypervisor agnostic, which limits implementations of the
VIRTIO spec. At least for Xen it will create difficulties and probably
prevent a VIRTIO implementation.
Daniel
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <8761oab4y7.fsf@rustcorp.com.au>
@ 2014-02-20 20:37 ` Daniel Kiper
0 siblings, 0 replies; 16+ messages in thread
From: Daniel Kiper @ 2014-02-20 20:37 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, wei.liu2, Ian Campbell, stefano.stabellini, ian,
anthony, sasha.levin, xen-devel
Hey,
On Thu, Feb 20, 2014 at 06:18:00PM +1030, Rusty Russell wrote:
> Ian Campbell <Ian.Campbell@citrix.com> writes:
> > On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> >> For platforms using EPT, I don't think you want anything but guest
> >> addresses, do you?
> >
> > No, the arguments for preventing unfettered access by backends to
> > frontend RAM applies to EPT as well.
>
> I can see how you'd parse my sentence that way, I think, but the two
> are orthogonal.
>
> AFAICT your grant-table access restrictions are page granularity, though
> you don't use page-aligned data (eg. in xen-netfront). This level of
> access control is possible using the virtio ring too, but noone has
> implemented such a thing AFAIK.
Could you briefly say how it should be done? The DMA API is an option, but
if there is a simpler mechanism available in VIRTIO itself we will be
happy to use it in Xen.
Daniel
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87ha7ubme0.fsf@rustcorp.com.au>
2014-02-20 12:28 ` Stefano Stabellini
2014-02-20 20:28 ` Daniel Kiper
@ 2014-02-21 2:50 ` Anthony Liguori
2014-02-21 10:05 ` Wei Liu
2 siblings, 1 reply; 16+ messages in thread
From: Anthony Liguori @ 2014-02-21 2:50 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper,
Stefano Stabellini, ian, sasha.levin, xen-devel
On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
>> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>>> Daniel Kiper <daniel.kiper@oracle.com> writes:
>>>> Hi,
>>>>
>>>> Below you could find a summary of work in regards to VIRTIO compatibility with
>>>> different virtualization solutions. It was done mainly from Xen point of view
>>>> but results are quite generic and can be applied to wide spectrum
>>>> of virtualization platforms.
>>>
>>> Hi Daniel,
>>>
>>> Sorry for the delayed response, I was pondering... CC changed
>>> to virtio-dev.
>>>
>>> From a standard POV: It's possible to abstract out the where we use
>>> 'physical address' for 'address handle'. It's also possible to define
>>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
>>> Xen-PV is a distinct platform from x86.
>>
>> I'll go even further and say that "address handle" doesn't make sense too.
>
> I was trying to come up with a unique term, I wasn't trying to define
> semantics :)
Understood, that wasn't really directed at you.
> There are three debates here now: (1) what should the standard say, and
The standard should say, "physical address"
> (2) how would Linux implement it,
Linux should use the PCI DMA API.
> (3) should we use each platform's PCI
> IOMMU.
Just like any other PCI device :-)
>> Just using grant table references is not enough to make virtio work
>> well under Xen. You really need to use bounce buffers ala persistent
>> grants.
>
> Wait, if you're using bounce buffers, you didn't make it "work well"!
Preaching to the choir man... but bounce buffering is proven to be
faster than doing grant mappings on every request. xen-blk does
bounce buffering by default and I suspect netfront is heading that
direction soon.
It would be a lot easier to simply have a global pool of grant tables
that effectively becomes the DMA pool. Then the DMA API can bounce
into that pool and those addresses can be placed on the ring.
It's a little different for Xen because now the backends have to deal
with physical addresses but the concept is still the same.
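A very rough sketch of that idea (the pool layout, size and the naive
allocator are made up for illustration; this is not existing code):

#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/types.h>

/*
 * Sketch: a region granted to the backend once at setup time; requests
 * are bounced into it, so no per-request grant/map hypercalls are needed.
 * Bounds checking and slot reuse are omitted for brevity.
 */
struct grant_dma_pool {
	void		*vaddr;	/* frontend mapping of the pool */
	dma_addr_t	 base;	/* address/handle the backend already knows */
	size_t		 size;	/* e.g. 64MB granted up front */
	size_t		 used;	/* naive bump allocator, reset when idle */
};

/* Copy a request buffer into the pool; return the address for the ring. */
static dma_addr_t grant_pool_bounce(struct grant_dma_pool *pool,
				    const void *buf, size_t len)
{
	size_t off = pool->used;

	pool->used += ALIGN(len, 64);
	memcpy(pool->vaddr + off, buf, len);
	return pool->base + off;
}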
>> I think what you ultimately want is virtio using a DMA API (I know
>> benh has scoffed at this but I don't buy his argument at face value)
>> and a DMA layer that bounces requests to a pool of persistent grants.
>
> We can have a virtio DMA API, sure. It'd be a noop for non-Xen.
>
> But emulating the programming of an IOMMU seems masochistic. PowerPC
> have made it clear they don't want this.
I don't think the argument is all that clear. Wouldn't it be nice for
other PCI devices to be faster under Power KVM? Why not change the DMA API
under Power Linux to detect that it's running under KVM and simply not make
any hypercalls?
> And noone else has come up
> with a compelling reason to want this: virtio passthrough?
So I can run Xen under QEMU and use virtio-blk and virtio-net as the
device model. Xen PV uses the DMA API to do mfn -> pfn mapping and
since virtio doesn't use it, it's the only PCI device in the QEMU
device model that doesn't actually work when running Xen under QEMU.
Regards,
Anthony Liguori
>>> For platforms using EPT, I don't think you want anything but guest
>>> addresses, do you?
>>>
>>> From an implementation POV:
>>>
>>> On IOMMU, start here for previous Linux discussion:
>>> http://thread.gmane.org/gmane.linux.kernel.virtualization/14410/focus=14650
>>>
>>> And this is the real problem. We don't want to use the PCI IOMMU for
>>> PCI devices. So it's not just a matter of using existing Linux APIs.
>>
>> Is there any data to back up that claim?
>
> Yes, for powerpc. Implementer gets to measure, as always. I suspect
> that if you emulate an IOMMU on Intel, your performance will suck too.
>
>> Just because power currently does hypercalls for anything that uses
>> the PCI IOMMU layer doesn't mean this cannot be changed.
>
> Does someone have an implementation of an IOMMU which doesn't use
> hypercalls, or is this theoretical?
>
>> It's pretty
>> hacky that virtio-pci just happens to work well by accident on power
>> today. Not all architectures have this limitation.
>
> It's a fundamental assumption of virtio that the host can access all of
> guest memory. That's paravert, not a hack.
>
> But tomayto tomatoh aside, it's unclear to me how you'd build an
> efficient IOMMU today. And it's unclear what benefit you'd gain. But
> the cost for Power is clear.
>
> So if someone wants do to this for PCI, they need to implement it and
> benchmark. But this is a little orthogonal to the Xen discussion.
>
> Cheers,
> Rusty.
>
* Re: VIRTIO - compatibility with different virtualization solutions
2014-02-21 2:50 ` Anthony Liguori
@ 2014-02-21 10:05 ` Wei Liu
2014-02-21 15:01 ` Konrad Rzeszutek Wilk
0 siblings, 1 reply; 16+ messages in thread
From: Wei Liu @ 2014-02-21 10:05 UTC (permalink / raw)
To: Anthony Liguori
Cc: virtio-dev, Wei Liu, Ian Campbell, Rusty Russell, Daniel Kiper,
Stefano Stabellini, ian, sasha.levin, xen-devel
On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
> On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> > Anthony Liguori <anthony@codemonkey.ws> writes:
> >> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> >>> Daniel Kiper <daniel.kiper@oracle.com> writes:
> >>>> Hi,
> >>>>
> >>>> Below you could find a summary of work in regards to VIRTIO compatibility with
> >>>> different virtualization solutions. It was done mainly from Xen point of view
> >>>> but results are quite generic and can be applied to wide spectrum
> >>>> of virtualization platforms.
> >>>
> >>> Hi Daniel,
> >>>
> >>> Sorry for the delayed response, I was pondering... CC changed
> >>> to virtio-dev.
> >>>
> >>> From a standard POV: It's possible to abstract out the where we use
> >>> 'physical address' for 'address handle'. It's also possible to define
> >>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
> >>> Xen-PV is a distinct platform from x86.
> >>
> >> I'll go even further and say that "address handle" doesn't make sense too.
> >
> > I was trying to come up with a unique term, I wasn't trying to define
> > semantics :)
>
> Understood, that wasn't really directed at you.
>
> > There are three debates here now: (1) what should the standard say, and
>
> The standard should say, "physical address"
>
> > (2) how would Linux implement it,
>
> Linux should use the PCI DMA API.
>
> > (3) should we use each platform's PCI
> > IOMMU.
>
> Just like any other PCI device :-)
>
> >> Just using grant table references is not enough to make virtio work
> >> well under Xen. You really need to use bounce buffers ala persistent
> >> grants.
> >
> > Wait, if you're using bounce buffers, you didn't make it "work well"!
>
> Preaching to the choir man... but bounce buffering is proven to be
> faster than doing grant mappings on every request. xen-blk does
> bounce buffering by default and I suspect netfront is heading that
> direction soon.
>
FWIW Annie Li @ Oracle once implemented a persistent map prototype for
netfront and the result was not satisfying.
> It would be a lot easier to simply have a global pool of grant tables
> that effectively becomes the DMA pool. Then the DMA API can bounce
> into that pool and those addresses can be placed on the ring.
>
> It's a little different for Xen because now the backends have to deal
> with physical addresses but the concept is still the same.
>
How would you apply this to Xen's security model? How can the hypervisor
effectively enforce access control? "Handle" and "physical address" are
essentially not the same concept, otherwise you wouldn't have proposed
this change. I'm not saying I'm against this change, just that this
description is too vague for me to understand the bigger picture.
But a definite downside is that if we go with this change we then have
to maintain two different paths in the backend. However small the
difference is, it is still a burden.
Wei.
* Re: VIRTIO - compatibility with different virtualization solutions
2014-02-21 10:05 ` Wei Liu
@ 2014-02-21 15:01 ` Konrad Rzeszutek Wilk
2014-02-25 0:33 ` Rusty Russell
[not found] ` <87y51058vf.fsf@rustcorp.com.au>
0 siblings, 2 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-02-21 15:01 UTC (permalink / raw)
To: Wei Liu
Cc: virtio-dev, Ian Campbell, Stefano Stabellini, Rusty Russell,
Daniel Kiper, ian, Anthony Liguori, sasha.levin, xen-devel
On Fri, Feb 21, 2014 at 10:05:06AM +0000, Wei Liu wrote:
> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
> > On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> > > Anthony Liguori <anthony@codemonkey.ws> writes:
> > >> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> > >>> Daniel Kiper <daniel.kiper@oracle.com> writes:
> > >>>> Hi,
> > >>>>
> > >>>> Below you could find a summary of work in regards to VIRTIO compatibility with
> > >>>> different virtualization solutions. It was done mainly from Xen point of view
> > >>>> but results are quite generic and can be applied to wide spectrum
> > >>>> of virtualization platforms.
> > >>>
> > >>> Hi Daniel,
> > >>>
> > >>> Sorry for the delayed response, I was pondering... CC changed
> > >>> to virtio-dev.
> > >>>
> > >>> From a standard POV: It's possible to abstract out the where we use
> > >>> 'physical address' for 'address handle'. It's also possible to define
> > >>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
> > >>> Xen-PV is a distinct platform from x86.
> > >>
> > >> I'll go even further and say that "address handle" doesn't make sense too.
> > >
> > > I was trying to come up with a unique term, I wasn't trying to define
> > > semantics :)
> >
> > Understood, that wasn't really directed at you.
> >
> > > There are three debates here now: (1) what should the standard say, and
> >
> > The standard should say, "physical address"
This conversation is heading towards "the implementation needs it, hence let's
make the design have it". Which I am OK with - but if we are going that
route we might as well call this thing 'my-pony-number', because I think
each hypervisor will have a different view of it.
Some of them might use a physical address with some flag bits on it.
Some might use just a physical address.
And some might want a 32-bit value that has no correlation to physical
or virtual addresses.
> >
> > > (2) how would Linux implement it,
> >
> > Linux should use the PCI DMA API.
Aye.
> >
> > > (3) should we use each platform's PCI
> > > IOMMU.
> >
> > Just like any other PCI device :-)
Aye.
> >
> > >> Just using grant table references is not enough to make virtio work
> > >> well under Xen. You really need to use bounce buffers ala persistent
> > >> grants.
> > >
> > > Wait, if you're using bounce buffers, you didn't make it "work well"!
> >
> > Preaching to the choir man... but bounce buffering is proven to be
> > faster than doing grant mappings on every request. xen-blk does
> > bounce buffering by default and I suspect netfront is heading that
> > direction soon.
> >
>
> FWIW Annie Li @ Oracle once implemented a persistent map prototype for
> netfront and the result was not satisfying.
Which could be due to the traffic pattern. There is a lot of back-and-forth
traffic on a single ring in networking (TCP with ACK/SYN).
With block the issue was a bit different, and we see more streaming
workloads.
>
> > It would be a lot easier to simply have a global pool of grant tables
> > that effectively becomes the DMA pool. Then the DMA API can bounce
> > into that pool and those addresses can be placed on the ring.
> >
> > It's a little different for Xen because now the backends have to deal
> > with physical addresses but the concept is still the same.
Rusty, the part below is Xen specific - so you are welcome to gloss over it.
I presume you would also need some machinery for the hypervisor to give
access to this 64MB (or whatever size) pool to the backend (and we could make
grant pages have 2MB granularity - so we would need just 32 grants).
But the backend would have to know the grant entries to at least do the proper
mapping and unmapping (if it chooses to)? And for that it needs
the grant value to make the proper hypercall to map its own (backend) memory
to the frontend memory.
Or are you saying - instead of using grant entries, just use physical
addresses - and naturally the hypervisor would have to use those as well?
Since it is just a number, why not make it something meaningful, so we
won't need to keep 'grant -> physical address' lookup machinery?
> >
>
> How would you apply this to Xen's security model? How can hypervisor
> effectively enforce access control? "Handle" and "physical address" are
> essentially not the same concept, otherwise you wouldn't have proposed
> this change. Not saying I'm against this change, just this description
> is too vague for me to understand the bigger picture.
>
> But a downside for sure is that if we go with this change we then have
> to maintain two different paths in backend. However small the difference
> is it is still a burden.
Or just in the grant machinery. The backend just plucks this number
into its data structures, and that is all it cares about.
>
> Wei.
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] <mailman.9276.1392977438.24322.xen-devel@lists.xen.org>
@ 2014-02-21 16:41 ` Andres Lagar-Cavilla
0 siblings, 0 replies; 16+ messages in thread
From: Andres Lagar-Cavilla @ 2014-02-21 16:41 UTC (permalink / raw)
To: xen-devel
Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini,
Rusty Russell, Daniel Kiper, ian, Anthony Liguori, sasha.levin
> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
>> On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>>> Anthony Liguori <anthony@codemonkey.ws> writes:
>>>> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>>>>> Daniel Kiper <daniel.kiper@oracle.com> writes:
>>>>>> Hi,
>>>>>>
>>>>>> Below you could find a summary of work in regards to VIRTIO compatibility with
>>>>>> different virtualization solutions. It was done mainly from Xen point of view
>>>>>> but results are quite generic and can be applied to wide spectrum
>>>>>> of virtualization platforms.
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> Sorry for the delayed response, I was pondering... CC changed
>>>>> to virtio-dev.
>>>>>
>>>>> From a standard POV: It's possible to abstract out the where we use
>>>>> 'physical address' for 'address handle'. It's also possible to define
>>>>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
>>>>> Xen-PV is a distinct platform from x86.
>>>>
>>>> I'll go even further and say that "address handle" doesn't make sense too.
>>>
>>> I was trying to come up with a unique term, I wasn't trying to define
>>> semantics :)
>>
>> Understood, that wasn't really directed at you.
>>
>>> There are three debates here now: (1) what should the standard say, and
>>
>> The standard should say, "physical address"
>>
>>> (2) how would Linux implement it,
>>
>> Linux should use the PCI DMA API.
>>
>>> (3) should we use each platform's PCI
>>> IOMMU.
>>
>> Just like any other PCI device :-)
>>
>>>> Just using grant table references is not enough to make virtio work
>>>> well under Xen. You really need to use bounce buffers ala persistent
>>>> grants.
>>>
>>> Wait, if you're using bounce buffers, you didn't make it "work well"!
>>
>> Preaching to the choir man... but bounce buffering is proven to be
>> faster than doing grant mappings on every request. xen-blk does
>> bounce buffering by default and I suspect netfront is heading that
>> direction soon.
>>
>
> FWIW Annie Li @ Oracle once implemented a persistent map prototype for
> netfront and the result was not satisfying.
>
>> It would be a lot easier to simply have a global pool of grant tables
>> that effectively becomes the DMA pool. Then the DMA API can bounce
>> into that pool and those addresses can be placed on the ring.
>>
>> It's a little different for Xen because now the backends have to deal
>> with physical addresses but the concept is still the same.
>>
>
> How would you apply this to Xen's security model? How can hypervisor
> effectively enforce access control? "Handle" and "physical address" are
> essentially not the same concept, otherwise you wouldn't have proposed
> this change. Not saying I'm against this change, just this description
> is too vague for me to understand the bigger picture.
I might be missing something trivial, but the burden of enforcing that memory is visible only through handles falls on the hypervisor. Taking KVM for example, the whole RAM of a guest is a vma in the mm of the faulting qemu process. That's KVM's way of doing things. "Handles" could be pfns for all that model cares, and translation+mapping from handles to actual guest RAM addresses is trivially O(1). And there's no guest control over RAM visibility, and KVM is happy with that.
Xen, on the other hand, can encode a 64-bit grant handle in the "__u64 addr" field of a virtio descriptor. The negotiation happens up front; the flags field is set to signal that the guest is encoding handles in there. Once the Xen virtio backend gets that descriptor out of the ring, what is left is not all that different from what netback/blkback/gntdev do today with a ring request.
I'm obviously glossing over serious details (e.g. negotiation of what the u64 addr means), but what I'm getting at is that I fail to understand why whole-RAM visibility is a requirement for virtio. It seems to me to be a requirement for KVM and other hypervisors, while virtio is a transport and sync mechanism for high(er) level IO descriptors.
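To sketch the encoding described above (the feature bit number and the exact
packing are made up purely for illustration):

#include <linux/virtio_ring.h>

/*
 * Hypothetical transport feature bit: when negotiated, desc->addr carries
 * a Xen grant reference plus a page offset instead of a guest physical
 * address.
 */
#define VIRTIO_F_XEN_GRANT_ADDR	40	/* bit number made up */

static inline __u64 grant_to_desc_addr(__u32 gref, __u32 page_offset)
{
	/* grant reference in the upper bits, offset within the page below */
	return ((__u64)gref << 12) | (page_offset & 0xfff);
}

static inline void desc_set_grant(struct vring_desc *desc,
				  __u32 gref, __u32 page_offset, __u32 len)
{
	desc->addr = grant_to_desc_addr(gref, page_offset);
	desc->len  = len;
}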
Can someone please clarify why "under Xen, you really need to use bounce buffers ala persistent grants"? Is that a performance need, to avoid repeated backend-side mapping and TLB junking? Granted. But why would it be a correctness need? Guest-side grant table work requires no hypercalls in the data path.
If I am rewinding the conversation, feel free to ignore, but I'm not feeling a lot of clarity in the dialogue right now.
Thanks
Andres
>
> But a downside for sure is that if we go with this change we then have
> to maintain two different paths in backend. However small the difference
> is it is still a burden.
>
> Wei.
>
* Re: VIRTIO - compatibility with different virtualization solutions
2014-02-21 15:01 ` Konrad Rzeszutek Wilk
@ 2014-02-25 0:33 ` Rusty Russell
[not found] ` <87y51058vf.fsf@rustcorp.com.au>
1 sibling, 0 replies; 16+ messages in thread
From: Rusty Russell @ 2014-02-25 0:33 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk, Wei Liu
Cc: virtio-dev, Ian Campbell, Stefano Stabellini, Daniel Kiper, ian,
Anthony Liguori, sasha.levin, xen-devel
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> On Fri, Feb 21, 2014 at 10:05:06AM +0000, Wei Liu wrote:
>> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
>> > The standard should say, "physical address"
>
> This conversation is heading towards - implementation needs it - hence lets
> make the design have it. Which I am OK with - but if we are going that
> route we might as well call this thing 'my-pony-number' because I think
> each hypervisor will have a different view of it.
>
> Some of them might use a physical address with some flag bits on it.
> Some might use just physical address.
>
> And some might want an 32-bit value that has no correlation to to physical
> nor virtual addresses.
True, but if the standard doesn't define what it is, it's not a standard
worth anything. Xen is special because it's already requiring guest
changes; it's a platform in itself and so can be different from
everything else. But it still needs to be defined.
At the moment, anything but guest-phys would not be compliant. That's a
Good Thing if we simply don't know the best answer for Xen; we'll adjust
the standard when we do.
Cheers,
Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions
[not found] ` <87y51058vf.fsf@rustcorp.com.au>
@ 2014-02-25 21:09 ` Konrad Rzeszutek Wilk
0 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-02-25 21:09 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini,
Daniel Kiper, ian, Anthony Liguori, sasha.levin, xen-devel
On Tue, Feb 25, 2014 at 11:03:24AM +1030, Rusty Russell wrote:
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> > On Fri, Feb 21, 2014 at 10:05:06AM +0000, Wei Liu wrote:
> >> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
> >> > The standard should say, "physical address"
> >
> > This conversation is heading towards - implementation needs it - hence lets
> > make the design have it. Which I am OK with - but if we are going that
> > route we might as well call this thing 'my-pony-number' because I think
> > each hypervisor will have a different view of it.
> >
> > Some of them might use a physical address with some flag bits on it.
> > Some might use just physical address.
> >
> > And some might want an 32-bit value that has no correlation to to physical
> > nor virtual addresses.
>
> True, but if the standard doesn't define what it is, it's not a standard
> worth anything. Xen is special because it's already requiring guest
> changes; it's a platform in itself and so can be different from
> everything else. But it still needs to be defined.
>
> At the moment, anything but guest-phys would not be compliant. That's a
> Good Thing if we simply don't know the best answer for Xen; we'll adjust
> the standard when we do.
I think Daniel's suggestion of a 'handle' should cover it, no?
Or are you saying that the standard should actually define what the 'handle'
is for every platform on which VirtIO will run?
For Xen it would be whatever the DMA API gives back as 'dma_addr_t'.
Which would require the VirtIO drivers to use the DMA (or PCI) APIs.
>
> Cheers,
> Rusty.
>