* [PATCH 00/23] Add KVM support for PPC64 (970) hosts
@ 2009-07-07 14:17 Alexander Graf
  2009-07-07 15:44 ` Arnd Bergmann
                   ` (21 more replies)
  0 siblings, 22 replies; 23+ messages in thread
From: Alexander Graf @ 2009-07-07 14:17 UTC (permalink / raw)
  To: kvm-ppc

KVM for PowerPC only supports embedded cores at the moment.

While it makes sense to virtualize on small machines, it's even more fun
to do so on big boxes. So I figured we need KVM for PowerPC64 as well.

This patchset implements KVM support for PPC64 hosts and guest support
for PPC64 and G3/G4.

To really make use of this, you will also need a modified version of qemu
that can deal with KVM on desktop cores. I will send out patches for those
later, but want to get feedback on the kernel side first.

In the meantime, use the qemu version from
http://www.powerkvm.org/powerkvm.git which already includes all required
patches to run PPC32 and PPC64 guests.

Alexander Graf (23):
  Pass PVR in sregs
  Add PPC64 definitions
  Add PPC64 fields to vcpu structs
  Add asm/970_kvm.h
  Add common PPC64 KVM asm helpers
  Add 970 highmem asm code
  Add SLB switching code for entry/exit
  Add interrupt handling code
  Add 970.c
  Add 970 Host MMU handling
  Add 970 guest MMU
  Add 74xx guest MMU
  Add 970 specific opcode emulation
  Add mfdec emulation
  Add desktop PowerPC specific emulation
  Make head_64.S aware of KVM real mode code
  Patch SLB size
  Add PPC64 offsets to asm-offsets.c
  Export symbols for KVM module
  Export KVM symbols for module
  Include PPC64 target in buildsystem
  Hack in dirty logging for VGA
  Fix trace.h

 arch/powerpc/include/asm/exception.h   |    2 +
 arch/powerpc/include/asm/kvm.h         |    1 +
 arch/powerpc/include/asm/kvm_970.h     |  131 +++++
 arch/powerpc/include/asm/kvm_970_asm.h |  128 +++++
 arch/powerpc/include/asm/kvm_asm.h     |   39 ++
 arch/powerpc/include/asm/kvm_host.h    |   80 +++-
 arch/powerpc/include/asm/kvm_ppc.h     |    1 +
 arch/powerpc/kernel/asm-offsets.c      |   13 +
 arch/powerpc/kernel/exceptions-64s.S   |    8 +
 arch/powerpc/kernel/head_64.S          |    6 +
 arch/powerpc/kernel/ppc_ksyms.c        |    3 +-
 arch/powerpc/kernel/time.c             |    1 +
 arch/powerpc/kvm/74xx_mmu.c            |  357 ++++++++++++
 arch/powerpc/kvm/970.c                 |  947 ++++++++++++++++++++++++++++++++
 arch/powerpc/kvm/970_emulate.c         |  338 ++++++++++++
 arch/powerpc/kvm/970_exports.c         |   24 +
 arch/powerpc/kvm/970_interrupts.S      |  422 ++++++++++++++
 arch/powerpc/kvm/970_mmu.c             |  466 ++++++++++++++++
 arch/powerpc/kvm/970_mmu_host.c        |  439 +++++++++++++++
 arch/powerpc/kvm/970_rmhandlers.S      |  128 +++++
 arch/powerpc/kvm/970_slb.S             |  456 +++++++++++++++
 arch/powerpc/kvm/Kconfig               |   17 +
 arch/powerpc/kvm/Makefile              |   15 +-
 arch/powerpc/kvm/emulate.c             |   43 ++-
 arch/powerpc/kvm/powerpc.c             |   21 +-
 arch/powerpc/kvm/trace.h               |    6 +-
 arch/powerpc/mm/hash_utils_64.c        |    2 +
 arch/powerpc/mm/mmu_context_hash64.c   |    4 +
 arch/powerpc/mm/slb.c                  |   15 +
 kernel/fork.c                          |    1 +
 30 files changed, 4106 insertions(+), 8 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_970.h
 create mode 100644 arch/powerpc/include/asm/kvm_970_asm.h
 create mode 100644 arch/powerpc/kvm/74xx_mmu.c
 create mode 100644 arch/powerpc/kvm/970.c
 create mode 100644 arch/powerpc/kvm/970_emulate.c
 create mode 100644 arch/powerpc/kvm/970_exports.c
 create mode 100644 arch/powerpc/kvm/970_interrupts.S
 create mode 100644 arch/powerpc/kvm/970_mmu.c
 create mode 100644 arch/powerpc/kvm/970_mmu_host.c
 create mode 100644 arch/powerpc/kvm/970_rmhandlers.S
 create mode 100644 arch/powerpc/kvm/970_slb.S


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
@ 2009-07-07 15:44 ` Arnd Bergmann
  2009-07-07 15:49 ` Alexander Graf
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2009-07-07 15:44 UTC (permalink / raw)
  To: kvm-ppc

On Tuesday 07 July 2009, Alexander Graf wrote:
> KVM for PowerPC only supports embedded cores at the moment.
> 
> While it makes sense to virtualize on small machines, it's even more fun
> to do so on big boxes. So I figured we need KVM for PowerPC64 as well.

Very nice to see how far you have come with this!

> This patchset implements KVM support for PPC64 hosts and guest support
> for PPC64 and G3/G4.

Most of the code is named after 970 but does look generic enough to be usable
on other powerpc64 implementations. How specific to 970 is it really?
Maybe the files and identifiers could be renamed to ppc64?

You mentioned before that you could not get ppc32 guests to run on
Cell hosts, but how about 970 or cell guests? Are there any problems
that mean it cannot work on Power5/6/7 hosts, or was that just a
matter of your priorities and available test hardware?

	Arnd <><

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
  2009-07-07 15:44 ` Arnd Bergmann
@ 2009-07-07 15:49 ` Alexander Graf
  2009-07-07 15:54 ` Avi Kivity
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Alexander Graf @ 2009-07-07 15:49 UTC (permalink / raw)
  To: kvm-ppc


On 07.07.2009, at 17:44, Arnd Bergmann wrote:

> On Tuesday 07 July 2009, Alexander Graf wrote:
>> KVM for PowerPC only supports embedded cores at the moment.
>>
>> While it makes sense to virtualize on small machines, it's even  
>> more fun
>> to do so on big boxes. So I figured we need KVM for PowerPC64 as  
>> well.
>
> Very nice to see how far you have come with this!
>
>> This patchset implements KVM support for PPC64 hosts and guest  
>> support
>> for PPC64 and G3/G4.
>
> Most of the code is after 970 but does look generic enough to be  
> usable
> on other powerpc64 implementations. How specific to 970 is it really?
> Maybe the files and identifiers could be renamed to ppc64?

I'm completely open to naming suggestions. The code is 100% ppc64  
compatible. I'm not aware of any 970 specifics. I just thought all  
PPC64 cores are called something with 970.

> You mentioned before that you could not get ppc32 guests to run on
> Cell hosts, but how about 970 or cell guests? Are there any problems
> that mean it cannot work on Power5/6/7 hosts, or was that just a
> matter of your priorities and available test hardware?

I got ppc32 guests running on Cell hosts now. Doing Cell guests could  
get tricky because of the SPEs but should be fine otherwise as well.

Apart from that I simply wanted to get the CPUs running in the guest  
that work with qemu as well :-).

Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
  2009-07-07 15:44 ` Arnd Bergmann
  2009-07-07 15:49 ` Alexander Graf
@ 2009-07-07 15:54 ` Avi Kivity
  2009-07-07 15:56 ` Alexander Graf
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-07-07 15:54 UTC (permalink / raw)
  To: kvm-ppc

On 07/07/2009 05:17 PM, Alexander Graf wrote:
> KVM for PowerPC only supports embedded cores at the moment.
>
> While it makes sense to virtualize on small machines, it's even more fun
> to do so on big boxes. So I figured we need KVM for PowerPC64 as well.
>
> This patchset implements KVM support for PPC64 hosts and guest support
> for PPC64 and G3/G4.
>
> To really make use of this, you will also need a modified version of qemu
> that can deal with KVM on desktop cores. I will send out patches for those
> later, but want to get feedback on the kernel side first.
>
> In the meanwhile, use the qemu version from
> http://www.powerkvm.org/powerkvm.git which already includes all required
> patches to run PPC32 and PPC64 guests.
>    

From a quick review (not that I'm qualified to review ppc code) this 
looks very good.

Does this support smp guests?

What are your plans for mmu notifier integration?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (2 preceding siblings ...)
  2009-07-07 15:54 ` Avi Kivity
@ 2009-07-07 15:56 ` Alexander Graf
  2009-07-07 16:08 ` Avi Kivity
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Alexander Graf @ 2009-07-07 15:56 UTC (permalink / raw)
  To: kvm-ppc


On 07.07.2009, at 17:54, Avi Kivity wrote:

> On 07/07/2009 05:17 PM, Alexander Graf wrote:
>> KVM for PowerPC only supports embedded cores at the moment.
>>
>> While it makes sense to virtualize on small machines, it's even  
>> more fun
>> to do so on big boxes. So I figured we need KVM for PowerPC64 as  
>> well.
>>
>> This patchset implements KVM support for PPC64 hosts and guest  
>> support
>> for PPC64 and G3/G4.
>>
>> To really make use of this, you will also need a modified version  
>> of qemu
>> that can deal with KVM on desktop cores. I will send out patches  
>> for those
>> later, but want to get feedback on the kernel side first.
>>
>> In the meanwhile, use the qemu version from
>> http://www.powerkvm.org/powerkvm.git which already includes all  
>> required
>> patches to run PPC32 and PPC64 guests.
>>
>
> From a quick review (not that I'm qualified to review ppc code) this  
> looks very good.

Wow - thanks :-).

> Does this support smp guests?

The low level code should. I simply haven't tried yet.

> What are your plans for mmu notifier integration?

I want to get things really stable first. Also speed needs to be  
improved by an order of magnitude until I really start thinking about  
mmu_notifiers.

Also, the paging code could use some improvements. The current array  
approach is a first attempt to have something up and running. It's  
probably not the fastest possible.

Or is using mmu_notifiers mandatory? :-). I also found some  
mmu_notifiers parts in generic kvm code. Does that mean it gets  
handled by that already?

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (3 preceding siblings ...)
  2009-07-07 15:56 ` Alexander Graf
@ 2009-07-07 16:08 ` Avi Kivity
  2009-07-08  6:17 ` Benjamin Herrenschmidt
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-07-07 16:08 UTC (permalink / raw)
  To: kvm-ppc

On 07/07/2009 06:56 PM, Alexander Graf wrote:
>> From a quick review (not that I'm qualified to review ppc code) this 
>> looks very good.
>
>
> Wow - thanks :-).

Well, I only reviewed the code that I understood, which was the 
whitespace.  Apparently that part is a well-formed program (see 
http://compsoc.dur.ac.uk/whitespace/).

>
> Or is using mmu_notifiers mandatory? :-). 

It's highly recommended.

> I also found some mmu_notifiers parts in generic kvm code. Does that 
> mean it gets handled by that already?

You just need to implement a hook that drops a gfn from your shadow page 
tables and tlbs.  It should be fairly small.
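
As a rough illustration of the kind of hook described above, here is a
minimal userspace sketch. It assumes a flat array of shadow entries; the
names shadow_mmu, shadow_map, shadow_lookup and shadow_drop_gfn are
hypothetical and are neither the KVM API nor the code from this patchset.
The "flushes" counter merely stands in for whatever TLB or hash
invalidation a real host would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_ENTRIES 8  /* hypothetical size of the shadow array */

/* Hypothetical shadow entry: guest frame number -> host frame number. */
struct shadow_pte {
    uint64_t gfn;
    uint64_t hpfn;
    bool     valid;
};

struct shadow_mmu {
    struct shadow_pte pte[SHADOW_ENTRIES];
    unsigned int      flushes;  /* stand-in for TLB/hash invalidations */
};

/* Install a mapping in the first free slot; false if the array is full. */
static bool shadow_map(struct shadow_mmu *mmu, uint64_t gfn, uint64_t hpfn)
{
    for (size_t i = 0; i < SHADOW_ENTRIES; i++) {
        if (!mmu->pte[i].valid) {
            mmu->pte[i].gfn = gfn;
            mmu->pte[i].hpfn = hpfn;
            mmu->pte[i].valid = true;
            return true;
        }
    }
    return false;
}

/* Find the host frame for a gfn, if any shadow entry still maps it. */
static bool shadow_lookup(const struct shadow_mmu *mmu, uint64_t gfn,
                          uint64_t *hpfn)
{
    for (size_t i = 0; i < SHADOW_ENTRIES; i++) {
        if (mmu->pte[i].valid && mmu->pte[i].gfn == gfn) {
            *hpfn = mmu->pte[i].hpfn;
            return true;
        }
    }
    return false;
}

/* The hook: drop every shadow entry for gfn and count the invalidations. */
static unsigned int shadow_drop_gfn(struct shadow_mmu *mmu, uint64_t gfn)
{
    unsigned int dropped = 0;

    assert(mmu != NULL);
    for (size_t i = 0; i < SHADOW_ENTRIES; i++) {
        if (mmu->pte[i].valid && mmu->pte[i].gfn == gfn) {
            mmu->pte[i].valid = false;
            mmu->flushes++;
            dropped++;
        }
    }
    return dropped;
}
```

The linear scan keeps the sketch short; a real implementation would
presumably want a reverse map from gfn to shadow entries, plus locking
against concurrent faults.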

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (4 preceding siblings ...)
  2009-07-07 16:08 ` Avi Kivity
@ 2009-07-08  6:17 ` Benjamin Herrenschmidt
  2009-07-08  6:22 ` Benjamin Herrenschmidt
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08  6:17 UTC (permalink / raw)
  To: kvm-ppc

On Tue, 2009-07-07 at 17:44 +0200, Arnd Bergmann wrote:
> Most of the code is after 970 but does look generic enough to be
> usable
> on other powerpc64 implementations. How specific to 970 is it really?
> Maybe the files and identifiers could be renamed to ppc64?
> 
> You mentioned before that you could not get ppc32 guests to run on
> Cell hosts, but how about 970 or cell guests? Are there any problems
> that mean it cannot work on Power5/6/7 hosts, or was that just a
> matter of your priorities and available test hardware?

There is another problem I foresee which is the cache line size
difference.... That can not be emulated.

The cache line sizes vary between processors, that is, the problem is
not only classic 32 "32 bytes" vs. classic 64 "128 bytes", some CPUs
have 64 bytes cache line size for example too.

However, we can work around that to some extent. At least 64-bit linux uses
the cache line size from the device-tree, for example, and overrides the
cputable content with that.

So if we manage to pass a "manufactured" device-tree down, we can cope
with the differences to some extent.
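
The override described above could look roughly like the following
sketch. It is only an illustration: cpu_defaults, dt_cache_info and
effective_dcache_line are hypothetical names, not the kernel's cputable
or device-tree interfaces, and treating a zero or non-power-of-two value
as "property absent" is an assumption made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cputable entry with a built-in default. */
struct cpu_defaults {
    const char *name;
    uint32_t    dcache_line;  /* bytes */
};

/* Hypothetical view of the device-tree value; 0 means "property absent". */
struct dt_cache_info {
    uint32_t dcache_line;     /* bytes */
};

static int is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Prefer the device-tree value when it is present and sane, otherwise
 * fall back to the per-CPU default.
 */
static uint32_t effective_dcache_line(const struct cpu_defaults *cpu,
                                      const struct dt_cache_info *dt)
{
    assert(cpu != NULL);
    if (dt != NULL && is_pow2(dt->dcache_line))
        return dt->dcache_line;
    return cpu->dcache_line;
}
```

With a default of 128 bytes and a device-tree value of 32, the function
returns 32, which is how a guest that honors the device-tree would see
a line size different from what its PVR implies.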

Things are going to be harder if the goal is to run some other OS, of
course, to the extent that we may not be able to sort it out completely
but at least for the linux-in-linux case it should work. For example, I
think MacOS uses the PVR and ignores the device-tree values...

Unfortunately, 32-bit linux does the same, though there is some work in
progress to change that, so it won't be a problem in the long run.

The patching technique Alexander does on executable pages is fun :-)
Though it's also both a bit slow and potentially dangerous (you don't
know for sure that what you are patching is actually an instruction).

Maybe it should be a configuration option to only be enabled when
needed.

Cheers,
Ben.




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (5 preceding siblings ...)
  2009-07-08  6:17 ` Benjamin Herrenschmidt
@ 2009-07-08  6:22 ` Benjamin Herrenschmidt
  2009-07-08  6:42 ` Benjamin Herrenschmidt
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08  6:22 UTC (permalink / raw)
  To: kvm-ppc

On Tue, 2009-07-07 at 19:08 +0300, Avi Kivity wrote:
> >
> > Or is using mmu_notifiers mandatory? :-). 
> 
> It's highly recommended.

Considering how "different" the ppc MMU is, I'm not sure mmu notifiers
is necessarily the right approach but we definitely should have a close
look.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (6 preceding siblings ...)
  2009-07-08  6:22 ` Benjamin Herrenschmidt
@ 2009-07-08  6:42 ` Benjamin Herrenschmidt
  2009-07-08  7:35 ` Alexander Graf
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08  6:42 UTC (permalink / raw)
  To: kvm-ppc

On Tue, 2009-07-07 at 17:49 +0200, Alexander Graf wrote:
> I'm completely open to naming suggestions. The code is 100% ppc64  
> compatible. I'm not aware of any 970 specifics. I just thought all  
> PPC64 cores are called something with 970.

 :-)

So 970 is just the thing also known as G5 which is a "light" variant of
the POWER 4 processor.

Basically, the world we support in linux looks like that:

 - Power3 and rs64: We really don't want to deal with these for now.
They don't have SLBs for example, but instead some more ancient stuff
called stab etc... besides, they are old stuff.

 - Power4: First one with an SLB, no altivec (aka VMX). Only older IBM
server machines. The FW on those machines lets you operate in "full
system partition" mode (ie, no hypervisor underneath you, ie, you run
with MSR:HV=1) or partitioned, but the partitioning is relatively
simplistic (fixed, no shared processors, etc...)

 - 970 (aka G5, aka GPUL): This is the stuff that was made for Apple
(sort-of), it's derived from some later revision of the Power4,
simplified in some ways, and adds a VMX to it among other things. It was
also used in some IBM products such as the js20 and js21 blades, or in
the Fixstar PowerStation.

 - Power5: Back to IBM "server" processors, this is an
evolution/improvement of Power4. It doesn't have VMX, but brings
multithreading (2 threads per core). Also, the FW on those machines never
lets you run without the hypervisor underneath you. Later revisions
support 1G segments in the SLB, which we may have to emulate if we want
the guest to use them at some stage, and support 64K page sizes.

 - Power6: This adds back VMX, different pipeline structure than Power5
(oriented toward mhz, much more in order, etc...). Current offering.
Also available on js22 blades.

 - Cell: The ppc core in there is simplistic, but supports multithreads
and altivec. It's based on arch 2.02 or so, so it has SLB etc... it
supports also 64K and 1M page sizes under some circumstances (ie, you
can only enable 2 out of the 3 non-4K sizes via some HID bits or on PS3,
some HV calls) but doesn't support 1G segments. Of course, it adds SPUs
but that's out of the core.

And that's about all we can talk about at this stage :-) I think there
have been patches posted already related to Power7 but I don't know how
much we can or cannot tell about it, though from your POV it shouldn't
be very different from P5 or P6.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (7 preceding siblings ...)
  2009-07-08  6:42 ` Benjamin Herrenschmidt
@ 2009-07-08  7:35 ` Alexander Graf
  2009-07-08  8:01 ` Benjamin Herrenschmidt
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Alexander Graf @ 2009-07-08  7:35 UTC (permalink / raw)
  To: kvm-ppc


On 08.07.2009, at 08:17, Benjamin Herrenschmidt wrote:

> On Tue, 2009-07-07 at 17:44 +0200, Arnd Bergmann wrote:
>> Most of the code is after 970 but does look generic enough to be
>> usable
>> on other powerpc64 implementations. How specific to 970 is it really?
>> Maybe the files and identifiers could be renamed to ppc64?
>>
>> You mentioned before that you could not get ppc32 guests to run on
>> Cell hosts, but how about 970 or cell guests? Are there any problems
>> that mean it cannot work on Power5/6/7 hosts, or was that just a
>> matter of your priorities and available test hardware?
>
> There is another problem I foresee which is the cache line size
> difference.... That can not be emulated.
>
> The cache line sizes vary between processors, that is, the problem is
> not only classic 32 "32 bytes" vs. classic 64 "128 bytes", some CPUs
> have 64 bytes cache line size for example too.

Which non-booke CPUs have 64 byte cache line size? I thought all BookS  
PPC64 ones are on 128 bytes.

> However, we can somewhat work around. At least 64-bit linux uses the
> cache line size from the device-tree for example and overrides the
> cputable content with that.
>
> So if we manage to pass a "manufactures" device-tree down, we can cope
> with the differences to some extent.

Well, we only need to tell our guest firmware what the host cache line  
size is, which qemu knows already, as it's passed to userspace.

> Things are going to be harder if the goal is to run some other OS, of
> course, to the extent that we may not be able to sort it out  
> completely
> but at least for the linux-in-linux case it should work. For  
> example, I
> think MacOS uses the PVR and ignores the device-tree values...

MacOS always sets the HID5 dcbz32 bit and assumes 128 bytes for dcbzl.

> Unfortunately, 32-bit linux does the same, though there is some work  
> in
> progress to change that, so it won't be a problem in the long run.

It'd be really nice to have 32-bit Linux be a bit more clever about  
this. Right now I have a dcbz hack in that makes all dcbz basically be  
dcbz32 when running in book3s_32 mode or HID5.dcbz32 = 1 for book3s_64.

That hack is either achieved by setting HID5.dcbz32 = 1 on the host  
(if possible) or by runtime binary patching of the guest.
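
To make the binary patching idea concrete, here is a small userspace
sketch. dcbz is encoded with primary opcode 31 and extended opcode 1014
(0x7c0007ec with RA and RB cleared). Everything else is an assumption
for illustration: PATCH_XOR is a made-up marker that is simply assumed
to produce a trapping encoding, instruction words are treated as
host-order integers rather than big-endian guest memory, and dcbzl
(which sets a bit in the otherwise reserved field) is deliberately left
alone. This is not the code from the patchset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DCBZ_MASK   0xffe007ffu  /* everything except the RA and RB fields */
#define DCBZ_INSN   0x7c0007ecu  /* opcode 31, extended opcode 1014 */
#define PATCH_XOR   0x00000008u  /* hypothetical: assumed to yield a trap */
#define DCBZ32_LINE 32u          /* bytes zeroed by the emulated dcbz */

/* True for plain dcbz with any RA/RB; dcbzl does not match. */
static int is_dcbz(uint32_t insn)
{
    return (insn & DCBZ_MASK) == DCBZ_INSN;
}

/* Rewrite every dcbz in a buffer of instruction words; return the count. */
static size_t patch_dcbz(uint32_t *insns, size_t count)
{
    size_t patched = 0;

    assert(insns != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (is_dcbz(insns[i])) {
            insns[i] ^= PATCH_XOR;
            patched++;
        }
    }
    return patched;
}

/*
 * What the trap handler would then do: zero the 32-byte aligned block
 * containing ea. Returns -1 if the block lies outside the buffer.
 */
static int emulate_dcbz32(uint8_t *mem, size_t mem_size, uint64_t ea)
{
    uint64_t start = ea & ~(uint64_t)(DCBZ32_LINE - 1);

    if (start >= mem_size || mem_size - start < DCBZ32_LINE)
        return -1;
    memset(mem + start, 0, DCBZ32_LINE);
    return 0;
}
```

Note that the scan cannot tell an instruction from data that happens to
share the bit pattern, which is why patching only pages that are
actually executed reduces the risk without removing it.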

> The patching technique Alexander does on executable pages is fun :-)
> Though it's also both a bit slow and potentially dangerous (you don't
> know for sure that what you are patching is actually an instruction).

Yeah, we also lose nx capabilities in the guest atm. We're rather safe  
on not patching data by only patching on execution, but you're right -  
there could still be data in executed pages.

> Maybe it should be a configuration option to only be enabled when
> needed.

It _is_ needed for every possible configuration atm :-). PPC32 Linux  
doesn't boot without. PPC64 Linux doesn't set dcbz32.

Alex


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (8 preceding siblings ...)
  2009-07-08  7:35 ` Alexander Graf
@ 2009-07-08  8:01 ` Benjamin Herrenschmidt
  2009-07-08  8:06 ` Avi Kivity
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08  8:01 UTC (permalink / raw)
  To: kvm-ppc


> > The cache line sizes vary between processors, that is, the problem is
> > not only classic 32 "32 bytes" vs. classic 64 "128 bytes", some CPUs
> > have 64 bytes cache line size for example too.
> 
> Which non-booke CPUs have 64 byte cache line size? I thought all BookS  
> PPC64 ones are on 128 bytes.

PA-Semi, maybe rs64 (not sure, I have some vague memories here, though
the latter is only supported on some legacy iSeries which we may not care
about... the kernel cputable still says 128 though, but maybe the
device-tree says otherwise).

Also, for 32-bit, there are some e500s that have a 64-byte cache line size,
and I know of at least one thing not released yet that will be 32-bit
and will have 128 d-side and 32 i-side :-) (so it can be tricked into
making it look like the i-side is 128 too).

> Well, we only need to tell our guest firmware what the host cache line  
> size is, which qemu knows already, as it's passed to userspace.

Right. As long as the guest OS does the right thing, which afaik only
linux 64-bit does at the moment, though we plan to fix linux 32-bit.

> MacOS always sets the HID5 dcbz32 bit and assumes 128 bytes for dcbzl.

Which will be a problem on processors that don't support that bit... your
patching technique is interesting though. It should probably remain an
option in case it causes trouble.

For MacOS X, it's reasonably easy to fix things up with paravirt
extensions to a certain extent. For MOL, we have some optional add-ons
you can install inside OS X that patch its kernel in various ways to
make it work better/faster in MOL by avoiding some nasty constructs,
such as abuse of split MMU mode (IR clear DR set or the other way
around) (thanks to Apple for publishing their source code here :-)

> It'd be really nice to have 32-bit Linux be a bit more clever about  
> this. Right now I have a dcbz hack in that makes all dcbz basically be  
> dcbz32 when running in book3s_32 mode or HID5.dcbz32 = 1 for book3s_64.
>
> That hack is either achieved by setting HID5.dcbz32 = 1 on the host  
> (if possible) or by runtime binary patching of the guest.

Yup, I saw.

> > The patching technique Alexander does on executable pages is fun :-)
> > Though it's also both a bit slow and potentially dangerous (you don't
> > know for sure that what you are patching is actually an instruction).
> 
> Yeah, we also lose nx capabilities in the guest atm. We're rather safe  
> on not patching data by only patching on execution, but you're right -  
> there could still be data in executed pages.

Well, most 32-bit "S" processors don't support NX at the PTE level
anyway, only at the segment level.

> > Maybe it should be a configuration option to only be enabled when
> > needed.
> 
> It _is_ needed for every possible configuration atm :-). PPC32 Linux  
> doesn't boot without. PPC64 Linux doesn't set dcbz32.

Right. We do need to fix ppc32 linux :-) dcbz32 only exists on 970, not
on other processors so we can't really make that more generic.

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (9 preceding siblings ...)
  2009-07-08  8:01 ` Benjamin Herrenschmidt
@ 2009-07-08  8:06 ` Avi Kivity
  2009-07-08  8:19 ` [PATCH 00/23] Add KVM support for PPC64 (970) host Benjamin Herrenschmidt
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-07-08  8:06 UTC (permalink / raw)
  To: kvm-ppc

On 07/08/2009 09:22 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2009-07-07 at 19:08 +0300, Avi Kivity wrote:
>    
>>> Or is using mmu_notifiers mandatory? :-).
>>>        
>> It's highly recommended.
>>      
>
> Considering how "different" the ppc MMU is, I'm not sure mmu notifiers
> is necessarily the right approach but we definitely should have a close
> look.
>    

I don't see how the mmu structure matters.  All mmu notifiers do is 
tell you when Linux wants to drop a pte; it's completely up to you how 
to use this information.

There are three ways of dealing with the problem:

- ignore it (ia64, ppc)

Easy to implement, but you can't swap or do any fancy stuff like page 
migration.  If you have a large tlb/shadow, you pin a lot of pages.

- use mmu notifiers (x86)

Need reverse mapping to convert guest physical addresses to shadow/tlb 
addresses.  Not difficult to implement but requires careful locking.

- s390 (s390)

You can s390 your problems away if you have s390 hardware.

Can you explain your concerns?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) host
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (10 preceding siblings ...)
  2009-07-08  8:06 ` Avi Kivity
@ 2009-07-08  8:19 ` Benjamin Herrenschmidt
  2009-07-08  8:28 ` [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08  8:19 UTC (permalink / raw)
  To: kvm-ppc

On Wed, 2009-07-08 at 11:06 +0300, Avi Kivity wrote:
> 
> I don't see how the mmu structure matters.  All mmu notifiers do is 
> tell you when Linux wants to drop a pte; it's completely up to you how 
> to use this information.
> 
> There are three ways of dealing with the problem:
> 
> - ignore it (ia64, ppc)
> 
> Easy to implement, but you can't swap or do any fancy stuff like page 
> migration.  If you have a large tlb/shadow, you pin a lot of pages.

Hrm... with MOL we used to just patch into the low level hash
invalidate, which is called to invalidate the translation from the hash
table when Linux invalidates the PTE. I haven't quite figured out yet
what Alexander does. We could probably use an mmu_notifier for that
provided it's not too high level ie, we may want to stick to whacking
the hash code which does some fancy stuff... 

> - use mmu notifiers (x86)
> 
> Need reverse mapping to convert guest physical addresses to shadow/tlb 
> addresses.  Not difficult to implement but requires careful locking.

Well, as I said, we may want to implement that completely differently at
a lower level, purely at the hash level, and thus use a different
infrastructure. At least that's what MOL did and swap etc... worked just
fine. I need to get more familiar with Alex's code to see how he does
things vs. the MMU, but we definitely need to think it through.

> Can you explain your concerns?

Nothing really just yet. As I said, I need to get my head around Alex's
code and figure it out before I take a position here. It might just be
easier for us to hook into the low level hash table invalidation code
instead and totally ignore the linux PTEs.

Cheers,
Ben.

 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (11 preceding siblings ...)
  2009-07-08  8:19 ` [PATCH 00/23] Add KVM support for PPC64 (970) host Benjamin Herrenschmidt
@ 2009-07-08  8:28 ` Alexander Graf
  2009-07-08  8:43 ` [PATCH 00/23] Add KVM support for PPC64 (970) host Avi Kivity
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Alexander Graf @ 2009-07-08  8:28 UTC (permalink / raw)
  To: kvm-ppc


On 08.07.2009, at 10:01, Benjamin Herrenschmidt wrote:

>
>>> The cache line sizes vary between processors, that is, the problem  
>>> is
>>> not only classic 32 "32 bytes" vs. classic 64 "128 bytes", some CPUs
>>> have 64 bytes cache line size for example too.
>>
>> Which non-booke CPUs have 64 byte cache line size? I thought all  
>> BookS
>> PPC64 ones are on 128 bytes.
>
> PA-Semi, maybe rs64 (not sure, I have some vague memories here though
> the latter is only supported on some legacy iSeries which we may not  
> care
> about... the kernel cputable still says 128 though, but maybe the
> device-tree says otherwise).
>
> Also, for 32-bit, there's some e500's that have 64 byte cache line  
> size
> and I know of at least one thing not released yet that will be 32-bits
> and will have 128 d-side and 32 i-side :-) (so it can be tricked into
> making look like i side is 128 too).

Sounds like all of those are potential guest CPUs, but probably not  
host :-).
E500 has its own KVM implementation, as does 440. So I don't think  
we'll run into dcbz issues again in the near future.

>
>> Well, we only need to tell our guest firmware what the host cache  
>> line
>> size is, which qemu knows already, as it's passed to userspace.
>
> Right. As long as the guest OS does the right thing, which afaik only
> linux 64-bit does at the moment, though we plan to fix linux 32-bit.
>
>> MacOS always sets the HID5 dcbz32 bit and assumes 128 bytes for  
>> dcbzl.
>
> Which will be a problem on processor that don't support that bit..  
> your
> patching technique is interesting though. It should probably remain an
> option in case it causes trouble.
>
> For MacOS X, it's reasonably easy to fix things up with paravirt
> extensions to a certain extent. For MOL, we have some optional add-ons
> you can install inside OS X that patch its kernel in various ways to
> make it work better/faster in MOL by avoiding some nasty constructs,
> such as abuse of split MMU mode (IR clear DR set or the other way
> around) (thanks for Apple publishing their source code here :-)

What about userspace applications that use dcbz?

But to run Mac OS X we first have to fix quite a lot of qemu stuff  
anyways ;-). It might also make sense to only run osx in PPC32 mode,  
as the PPC64 one doesn't add that much value.

>>> The patching technique Alexander does on executable pages is fun :-)
>>> Though it's also both a bit slow and potentially dangerous (you don't
>>> know for sure that what you are patching is actually an instruction).
>>
>> Yeah, we also lose nx capabilities in the guest atm. We're rather safe
>> on not patching data by only patching on execution, but you're right -
>> there could still be data in executed pages.
>
> Well, most 32-bit "S" processors don't support NX at the PTE level
> anyway, only at the segment level.

Which is all the same from my code's perspective. Host segments are
just mapped unconditionally while the real handling happens on a
per-page basis. So if the guest sets NX on a segment, it ends up as NX in
every host page we map.
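
A minimal sketch of that folding, assuming made-up flag names and a made-up
struct (none of these identifiers come from the patchset): the host segment
carries no protection of its own, and each host page gets NX if either the
guest segment or the guest PTE asked for it.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define GUEST_SEG_NX  (1u << 0)   /* guest marked the whole segment no-execute */
#define GUEST_PTE_NX  (1u << 0)   /* guest marked this one page no-execute */
#define HOST_PTE_NX   (1u << 0)   /* no-execute bit in the host page entry */

/* Hypothetical result of a guest translation lookup. */
struct guest_xlate {
    uint32_t seg_flags;   /* flags of the guest segment covering the address */
    uint32_t pte_flags;   /* flags of the guest PTE for the page */
};

/* Fold segment-level and page-level NX into the per-page host flags. */
static uint32_t host_pte_flags(const struct guest_xlate *g)
{
    uint32_t flags = 0;

    if ((g->seg_flags & GUEST_SEG_NX) || (g->pte_flags & GUEST_PTE_NX))
        flags |= HOST_PTE_NX;

    return flags;
}
```

Doing this per page means the host side never needs to know whether the
guest CPU expresses NX at the segment or the PTE level.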

Alex


* Re: [PATCH 00/23] Add KVM support for PPC64 (970) host
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (12 preceding siblings ...)
  2009-07-08  8:28 ` [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
@ 2009-07-08  8:43 ` Avi Kivity
  2009-07-08  9:51 ` Benjamin Herrenschmidt
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-07-08  8:43 UTC (permalink / raw)
  To: kvm-ppc

On 07/08/2009 11:19 AM, Benjamin Herrenschmidt wrote:
>> Can you explain your concerns?
>>      
>
> Nothing really just yet. As I said, I need to get my head around Alex
> code and figure before I take a position here. It might just be easier
> for us to hook into the low level hash table invalidation code instead
> and totally ignore the linux PTEs.
>    

Are the hashes lazily or eagerly populated wrt the linux ptes?

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 00/23] Add KVM support for PPC64 (970) host
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (13 preceding siblings ...)
  2009-07-08  8:43 ` [PATCH 00/23] Add KVM support for PPC64 (970) host Avi Kivity
@ 2009-07-08  9:51 ` Benjamin Herrenschmidt
  2009-07-08 10:04 ` Avi Kivity
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08  9:51 UTC (permalink / raw)
  To: kvm-ppc

On Wed, 2009-07-08 at 11:43 +0300, Avi Kivity wrote:
> On 07/08/2009 11:19 AM, Benjamin Herrenschmidt wrote:
> >> Can you explain your concerns?
> >>      
> >
> > Nothing really just yet. As I said, I need to get my head around Alex
> > code and figure before I take a position here. It might just be easier
> > for us to hook into the low level hash table invalidation code instead
> > and totally ignore the linux PTEs.
> >    
> 
> Are the hashes lazily or eagerly populated wrt the linux ptes?

Lazily.

Cheers,
Ben.




* Re: [PATCH 00/23] Add KVM support for PPC64 (970) host
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (14 preceding siblings ...)
  2009-07-08  9:51 ` Benjamin Herrenschmidt
@ 2009-07-08 10:04 ` Avi Kivity
  2009-07-08 10:24 ` Benjamin Herrenschmidt
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Avi Kivity @ 2009-07-08 10:04 UTC (permalink / raw)
  To: kvm-ppc

On 07/08/2009 12:51 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2009-07-08 at 11:43 +0300, Avi Kivity wrote:
>    
>> On 07/08/2009 11:19 AM, Benjamin Herrenschmidt wrote:
>>      
>>>> Can you explain your concerns?
>>>>
>>>>          
>>> Nothing really just yet. As I said, I need to get my head around Alex
>>> code and figure before I take a position here. It might just be easier
>>> for us to hook into the low level hash table invalidation code instead
>>> and totally ignore the linux PTEs.
>>>
>>>        
>> Are the hashes lazily or eagerly populated wrt the linux ptes?
>>      
>
> Lazily.
>    

In this case how can hooking the hashes work?

kvm obtains pages by calling get_user_pages_fast() so you can have a 
page in the shadow hash which is not present in the host hash.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 00/23] Add KVM support for PPC64 (970) host
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (15 preceding siblings ...)
  2009-07-08 10:04 ` Avi Kivity
@ 2009-07-08 10:24 ` Benjamin Herrenschmidt
  2009-07-08 11:10 ` [PATCH 00/23] Add KVM support for PPC64 (970) hosts Benjamin Herrenschmidt
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08 10:24 UTC (permalink / raw)
  To: kvm-ppc

On Wed, 2009-07-08 at 13:04 +0300, Avi Kivity wrote:
> 
> In this case how can hooking the hashes work?
> 
> kvm obtains pages by calling get_user_pages_fast() so you can have a 
> page in the shadow hash which is not present in the host hash.
> 
I'm not sure why we would have a problem with that actually. But again,
it depends how Alex does his stuff...

We -must- keep track of the fact that a given Linux PTE was hashed. We
must not put something into the hash without that tracking, or bad things
will happen, such as potential duplicates.

We have fields in the PTE for that, though I would have to understand
how Alex handles translated entries there.

If a guest entry is in the real hash, we thus must have some tracking
in the linux PTE that this guest entry is there to properly invalidate
it when the PTE is invalidated.

Now we could probably use the mmu_notifiers for that, but we have an
existing mechanism to do that tracking and I think we might just be
able to re-use it. Of course, that needs to be looked at in more
detail.
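
A minimal sketch of the tracking described above, with hypothetical names
throughout (the struct, the flags and the toy hash array are assumptions
and do not reflect the kernel's real PTE layout): the PTE remembers that it
was hashed and where, so invalidating the PTE can find and drop the hash
entry, and re-inserting cannot leave a duplicate behind.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software PTE flags. */
#define PTE_PRESENT  (1u << 0)
#define PTE_HASHED   (1u << 1)   /* a hash table entry exists for this PTE */

#define HASH_SLOTS   64

/* Hypothetical PTE that records which hash slot was used for it. */
struct soft_pte {
    uint32_t flags;
    uint32_t hash_slot;
};

/* Toy stand-in for the hardware hash table: one valid bit per slot. */
static bool hash_valid[HASH_SLOTS];

/* Insert a hash entry and record that fact in the PTE. */
static void hash_insert(struct soft_pte *pte, uint32_t slot)
{
    slot %= HASH_SLOTS;

    /* Drop any previous entry first so no duplicate can survive. */
    if (pte->flags & PTE_HASHED)
        hash_valid[pte->hash_slot] = false;

    hash_valid[slot] = true;
    pte->hash_slot = slot;
    pte->flags |= PTE_HASHED;
}

/* Invalidate the PTE and, through the tracking bits, its hash entry. */
static void pte_invalidate(struct soft_pte *pte)
{
    if (pte->flags & PTE_HASHED) {
        hash_valid[pte->hash_slot] = false;
        pte->flags &= ~PTE_HASHED;
    }
    pte->flags &= ~PTE_PRESENT;
}
```

A guest entry placed in the real hash would need the same kind of record on
the Linux PTE, otherwise pte_invalidate() has no way to find it.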

Cheers,
Ben.




* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (16 preceding siblings ...)
  2009-07-08 10:24 ` Benjamin Herrenschmidt
@ 2009-07-08 11:10 ` Benjamin Herrenschmidt
  2009-07-08 12:29 ` Alexander Graf
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08 11:10 UTC (permalink / raw)
  To: kvm-ppc

On Wed, 2009-07-08 at 10:28 +0200, Alexander Graf wrote:

> What about userspace applications that use dcbz?

That's generally the killer... but then, MOL was only ever implemented
on ppc32 so we never had the problem.

> But to run Mac OS X we first have to fix quite a lot of qemu stuff  
> anyways ;-). It might also make sense to only run osx in PPC32 mode,  
> as the PPC64 one doesn't add that much value.

Yeah :-) We can pick stuff from MOL if we want to though, as it was all
GPL.

> Which is all the same from my code's perspective. Host segments are
> just mapped unconditionally while the real handling happens on a
> per-page basis. So if the guest sets NX on a segment, it ends up as NX in
> every host page we map.

Ok.

Cheers,
Ben.




* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (17 preceding siblings ...)
  2009-07-08 11:10 ` [PATCH 00/23] Add KVM support for PPC64 (970) hosts Benjamin Herrenschmidt
@ 2009-07-08 12:29 ` Alexander Graf
  2009-07-08 21:40 ` Benjamin Herrenschmidt
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: Alexander Graf @ 2009-07-08 12:29 UTC (permalink / raw)
  To: kvm-ppc





On 08.07.2009, at 13:10, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2009-07-08 at 10:28 +0200, Alexander Graf wrote:
>
>> What about userspace applications that use dcbz?
>
> That's generally the killer... but then, MOL was only ever implemented
> on ppc32 so we never had the problem.
>
>> But to run Mac OS X we first have to fix quite a lot of qemu stuff
>> anyways ;-). It might also make sense to only run osx in PPC32 mode,
>> as the PPC64 one doesn't add that much value.
>
> Yeah :-) We can pick stuff from MOL if we want to though, as it was all
> GPL.

Thinking about it again, why did MOL never get upstream? :)

Alex

>
>> Which is all the same from my code's perspective. Host segments are
>> just mapped unconditionally while the real handling happens on a
>> per-page basis. So if the guest sets NX on a segment, it ends up as NX in
>> every host page we map.
>
> Ok.
>
> Cheers,
> Ben.


* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (18 preceding siblings ...)
  2009-07-08 12:29 ` Alexander Graf
@ 2009-07-08 21:40 ` Benjamin Herrenschmidt
  2009-07-22 22:58 ` Olof Johansson
  2009-07-22 23:02 ` Olof Johansson
  21 siblings, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2009-07-08 21:40 UTC (permalink / raw)
  To: kvm-ppc

On Wed, 2009-07-08 at 14:29 +0200, Alexander Graf wrote:
> 
> 
> 
> On 08.07.2009, at 13:10, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > On Wed, 2009-07-08 at 10:28 +0200, Alexander Graf wrote:
> >
> >> What about userspace applications that use dcbz?
> >
> > That's generally the killer... but then, MOL was only ever implemented
> > on ppc32 so we never had the problem.
> >
> >> But to run Mac OS X we first have to fix quite a lot of qemu stuff
> >> anyways ;-). It might also make sense to only run osx in PPC32 mode,
> >> as the PPC64 one doesn't add that much value.
> >
> > Yeah :-) We can pick stuff from MOL if we want to though, as it was all
> > GPL.
> 
> Thinking about it again, why did MOL never get upstream? :)

No real need to. We had a few tweaks in the kernel to make it happy but
it was a moving target for a while, and so it was felt easier to keep
the MOL kernel module and userspace together in a separate package.

Cheers,
Ben.




* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (19 preceding siblings ...)
  2009-07-08 21:40 ` Benjamin Herrenschmidt
@ 2009-07-22 22:58 ` Olof Johansson
  2009-07-22 23:02 ` Olof Johansson
  21 siblings, 0 replies; 23+ messages in thread
From: Olof Johansson @ 2009-07-22 22:58 UTC (permalink / raw)
  To: kvm-ppc

On Wed, Jul 08, 2009 at 04:42:32PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2009-07-07 at 17:49 +0200, Alexander Graf wrote:
> > I'm completely open to naming suggestions. The code is 100% ppc64
> > compatible. I'm not aware of any 970 specifics. I just thought all
> > PPC64 cores are called something with 970.
> 
>  :-)
> 
> So 970 is just the thing also known as G5 which is a "light" variant of
> the POWER 4 processor.
> 
> Basically, the world we support in linux looks like that:

Tsk tsk tsk, you forgot PA6T. :-)


-Olof


* Re: [PATCH 00/23] Add KVM support for PPC64 (970) hosts
  2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
                   ` (20 preceding siblings ...)
  2009-07-22 22:58 ` Olof Johansson
@ 2009-07-22 23:02 ` Olof Johansson
  21 siblings, 0 replies; 23+ messages in thread
From: Olof Johansson @ 2009-07-22 23:02 UTC (permalink / raw)
  To: kvm-ppc

On Wed, Jul 08, 2009 at 06:01:26PM +1000, Benjamin Herrenschmidt wrote:
> 
> > > The cache line sizes vary between processors, that is, the problem is
> > > not only classic 32 "32 bytes" vs. classic 64 "128 bytes", some CPUs
> > > have 64 bytes cache line size for example too.
> > 
> > Which non-booke CPUs have 64 byte cache line size? I thought all BookS  
> > PPC64 ones are on 128 bytes.
> 
> PA-Semi, maybe rs64 (not sure, I have some vague memories here though
> the latter is only supported on some legacy iSeries which we may not care
> about... the kernel cputable still says 128 though, but maybe the
> device-tree says otherwise).

Right, I think PPC620 had it, and I doubt anyone runs linux on those.


-Olof


end of thread, other threads:[~2009-07-22 23:02 UTC | newest]

Thread overview: 23+ messages
2009-07-07 14:17 [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
2009-07-07 15:44 ` Arnd Bergmann
2009-07-07 15:49 ` Alexander Graf
2009-07-07 15:54 ` Avi Kivity
2009-07-07 15:56 ` Alexander Graf
2009-07-07 16:08 ` Avi Kivity
2009-07-08  6:17 ` Benjamin Herrenschmidt
2009-07-08  6:22 ` Benjamin Herrenschmidt
2009-07-08  6:42 ` Benjamin Herrenschmidt
2009-07-08  7:35 ` Alexander Graf
2009-07-08  8:01 ` Benjamin Herrenschmidt
2009-07-08  8:06 ` Avi Kivity
2009-07-08  8:19 ` [PATCH 00/23] Add KVM support for PPC64 (970) host Benjamin Herrenschmidt
2009-07-08  8:28 ` [PATCH 00/23] Add KVM support for PPC64 (970) hosts Alexander Graf
2009-07-08  8:43 ` [PATCH 00/23] Add KVM support for PPC64 (970) host Avi Kivity
2009-07-08  9:51 ` Benjamin Herrenschmidt
2009-07-08 10:04 ` Avi Kivity
2009-07-08 10:24 ` Benjamin Herrenschmidt
2009-07-08 11:10 ` [PATCH 00/23] Add KVM support for PPC64 (970) hosts Benjamin Herrenschmidt
2009-07-08 12:29 ` Alexander Graf
2009-07-08 21:40 ` Benjamin Herrenschmidt
2009-07-22 22:58 ` Olof Johansson
2009-07-22 23:02 ` Olof Johansson
