linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array
@ 2013-07-03 13:30 Mihai Caraman
  2013-07-03 13:30 ` [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction Mihai Caraman
  2013-07-08 18:39 ` [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array Alexander Graf
  0 siblings, 2 replies; 16+ messages in thread
From: Mihai Caraman @ 2013-07-03 13:30 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm

Some exit ids where left out from kvm_exit_names array.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
 arch/powerpc/kvm/timing.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index 07b6110..c392d26 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -135,7 +135,9 @@ static const char *kvm_exit_names[__NUMBER_OF_KVM_EXIT_TYPES] = {
 	[USR_PR_INST] =             "USR_PR_INST",
 	[FP_UNAVAIL] =              "FP_UNAVAIL",
 	[DEBUG_EXITS] =             "DEBUG",
-	[TIMEINGUEST] =             "TIMEINGUEST"
+	[TIMEINGUEST] =             "TIMEINGUEST",
+	[DBELL_EXITS] =             "DBELL",
+	[GDBELL_EXITS] =            "GDBELL"
 };
 
 static int kvmppc_exit_timing_show(struct seq_file *m, void *private)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-03 13:30 [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array Mihai Caraman
@ 2013-07-03 13:30 ` Mihai Caraman
  2013-07-08 18:45   ` Alexander Graf
  2013-07-08 18:39 ` [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array Alexander Graf
  1 sibling, 1 reply; 16+ messages in thread
From: Mihai Caraman @ 2013-07-03 13:30 UTC (permalink / raw)
  To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm

Some guests are making use of return from machine check instruction
to do crazy things even though the 64-bit kernel doesn't handle yet
this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction accordingly.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
---
 arch/powerpc/include/asm/kvm_host.h |    1 +
 arch/powerpc/kvm/booke_emulate.c    |   25 +++++++++++++++++++++++++
 arch/powerpc/kvm/timing.c           |    1 +
 3 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index af326cd..0466789 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -148,6 +148,7 @@ enum kvm_exit_types {
 	EMULATED_TLBWE_EXITS,
 	EMULATED_RFI_EXITS,
 	EMULATED_RFCI_EXITS,
+	EMULATED_RFMCI_EXITS,
 	DEC_EXITS,
 	EXT_INTR_EXITS,
 	HALT_WAKEUP,
diff --git a/arch/powerpc/kvm/booke_emulate.c b/arch/powerpc/kvm/booke_emulate.c
index 27a4b28..aaff1b7 100644
--- a/arch/powerpc/kvm/booke_emulate.c
+++ b/arch/powerpc/kvm/booke_emulate.c
@@ -23,6 +23,7 @@
 
 #include "booke.h"
 
+#define OP_19_XOP_RFMCI   38
 #define OP_19_XOP_RFI     50
 #define OP_19_XOP_RFCI    51
 
@@ -43,6 +44,12 @@ static void kvmppc_emul_rfci(struct kvm_vcpu *vcpu)
 	kvmppc_set_msr(vcpu, vcpu->arch.csrr1);
 }
 
+static void kvmppc_emul_rfmci(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.pc = vcpu->arch.mcsrr0;
+	kvmppc_set_msr(vcpu, vcpu->arch.mcsrr1);
+}
+
 int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
                             unsigned int inst, int *advance)
 {
@@ -65,6 +72,12 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
 			*advance = 0;
 			break;
 
+		case OP_19_XOP_RFMCI:
+			kvmppc_emul_rfmci(vcpu);
+			kvmppc_set_exit_type(vcpu, EMULATED_RFMCI_EXITS);
+			*advance = 0;
+			break;
+
 		default:
 			emulated = EMULATE_FAIL;
 			break;
@@ -138,6 +151,12 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
 	case SPRN_DBCR1:
 		vcpu->arch.dbg_reg.dbcr1 = spr_val;
 		break;
+	case SPRN_MCSRR0:
+		vcpu->arch.mcsrr0 = spr_val;
+		break;
+	case SPRN_MCSRR1:
+		vcpu->arch.mcsrr1 = spr_val;
+		break;
 	case SPRN_DBSR:
 		vcpu->arch.dbsr &= ~spr_val;
 		break;
@@ -284,6 +303,12 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
 	case SPRN_DBCR1:
 		*spr_val = vcpu->arch.dbg_reg.dbcr1;
 		break;
+	case SPRN_MCSRR0:
+		*spr_val = vcpu->arch.mcsrr0;
+		break;
+	case SPRN_MCSRR1:
+		*spr_val = vcpu->arch.mcsrr1;
+		break;
 	case SPRN_DBSR:
 		*spr_val = vcpu->arch.dbsr;
 		break;
diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index c392d26..670f63d 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -129,6 +129,7 @@ static const char *kvm_exit_names[__NUMBER_OF_KVM_EXIT_TYPES] = {
 	[EMULATED_TLBSX_EXITS] =    "EMUL_TLBSX",
 	[EMULATED_TLBWE_EXITS] =    "EMUL_TLBWE",
 	[EMULATED_RFI_EXITS] =      "EMUL_RFI",
+	[EMULATED_RFMCI_EXITS] =    "EMUL_RFMCI",
 	[DEC_EXITS] =               "DEC",
 	[EXT_INTR_EXITS] =          "EXTINT",
 	[HALT_WAKEUP] =             "HALT",
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array
  2013-07-03 13:30 [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array Mihai Caraman
  2013-07-03 13:30 ` [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction Mihai Caraman
@ 2013-07-08 18:39 ` Alexander Graf
  1 sibling, 0 replies; 16+ messages in thread
From: Alexander Graf @ 2013-07-08 18:39 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: linuxppc-dev, kvm, kvm-ppc


On 03.07.2013, at 15:30, Mihai Caraman wrote:

> Some exit ids where left out from kvm_exit_names array.
>=20
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> arch/powerpc/kvm/timing.c |    4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>=20
> diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
> index 07b6110..c392d26 100644
> --- a/arch/powerpc/kvm/timing.c
> +++ b/arch/powerpc/kvm/timing.c
> @@ -135,7 +135,9 @@ static const char =
*kvm_exit_names[__NUMBER_OF_KVM_EXIT_TYPES] =3D {
> 	[USR_PR_INST] =3D             "USR_PR_INST",
> 	[FP_UNAVAIL] =3D              "FP_UNAVAIL",
> 	[DEBUG_EXITS] =3D             "DEBUG",
> -	[TIMEINGUEST] =3D             "TIMEINGUEST"
> +	[TIMEINGUEST] =3D             "TIMEINGUEST",
> +	[DBELL_EXITS] =3D             "DBELL",
> +	[GDBELL_EXITS] =3D            "GDBELL"

Please add a comma at the end here, so that we don't have to uselessly =
touch the entry next time again.


Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-03 13:30 ` [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction Mihai Caraman
@ 2013-07-08 18:45   ` Alexander Graf
  2013-07-09 17:16     ` Scott Wood
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Graf @ 2013-07-08 18:45 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: linuxppc-dev, kvm, kvm-ppc


On 03.07.2013, at 15:30, Mihai Caraman wrote:

> Some guests are making use of return from machine check instruction
> to do crazy things even though the 64-bit kernel doesn't handle yet
> this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction =
accordingly.
>=20
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> ---
> arch/powerpc/include/asm/kvm_host.h |    1 +
> arch/powerpc/kvm/booke_emulate.c    |   25 +++++++++++++++++++++++++
> arch/powerpc/kvm/timing.c           |    1 +
> 3 files changed, 27 insertions(+), 0 deletions(-)
>=20
> diff --git a/arch/powerpc/include/asm/kvm_host.h =
b/arch/powerpc/include/asm/kvm_host.h
> index af326cd..0466789 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -148,6 +148,7 @@ enum kvm_exit_types {
> 	EMULATED_TLBWE_EXITS,
> 	EMULATED_RFI_EXITS,
> 	EMULATED_RFCI_EXITS,
> +	EMULATED_RFMCI_EXITS,

I would quite frankly prefer to see us abandon the whole exit timing =
framework in the kernel and instead use trace points. Then we don't have =
to maintain all of this randomly exercised code.

FWIW I think in this case however, treating RFMCI the same as RFI or =
random "instruction emulation" shouldn't hurt. This whole table is only =
about timing measurements. If you want to know for real what's going on, =
use trace points.

Otherwise looks good.


Alex

> 	DEC_EXITS,
> 	EXT_INTR_EXITS,
> 	HALT_WAKEUP,
> diff --git a/arch/powerpc/kvm/booke_emulate.c =
b/arch/powerpc/kvm/booke_emulate.c
> index 27a4b28..aaff1b7 100644
> --- a/arch/powerpc/kvm/booke_emulate.c
> +++ b/arch/powerpc/kvm/booke_emulate.c
> @@ -23,6 +23,7 @@
>=20
> #include "booke.h"
>=20
> +#define OP_19_XOP_RFMCI   38
> #define OP_19_XOP_RFI     50
> #define OP_19_XOP_RFCI    51
>=20
> @@ -43,6 +44,12 @@ static void kvmppc_emul_rfci(struct kvm_vcpu *vcpu)
> 	kvmppc_set_msr(vcpu, vcpu->arch.csrr1);
> }
>=20
> +static void kvmppc_emul_rfmci(struct kvm_vcpu *vcpu)
> +{
> +	vcpu->arch.pc =3D vcpu->arch.mcsrr0;
> +	kvmppc_set_msr(vcpu, vcpu->arch.mcsrr1);
> +}
> +
> int kvmppc_booke_emulate_op(struct kvm_run *run, struct kvm_vcpu =
*vcpu,
>                             unsigned int inst, int *advance)
> {
> @@ -65,6 +72,12 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, =
struct kvm_vcpu *vcpu,
> 			*advance =3D 0;
> 			break;
>=20
> +		case OP_19_XOP_RFMCI:
> +			kvmppc_emul_rfmci(vcpu);
> +			kvmppc_set_exit_type(vcpu, =
EMULATED_RFMCI_EXITS);
> +			*advance =3D 0;
> +			break;
> +
> 		default:
> 			emulated =3D EMULATE_FAIL;
> 			break;
> @@ -138,6 +151,12 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu =
*vcpu, int sprn, ulong spr_val)
> 	case SPRN_DBCR1:
> 		vcpu->arch.dbg_reg.dbcr1 =3D spr_val;
> 		break;
> +	case SPRN_MCSRR0:
> +		vcpu->arch.mcsrr0 =3D spr_val;
> +		break;
> +	case SPRN_MCSRR1:
> +		vcpu->arch.mcsrr1 =3D spr_val;
> +		break;
> 	case SPRN_DBSR:
> 		vcpu->arch.dbsr &=3D ~spr_val;
> 		break;
> @@ -284,6 +303,12 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu =
*vcpu, int sprn, ulong *spr_val)
> 	case SPRN_DBCR1:
> 		*spr_val =3D vcpu->arch.dbg_reg.dbcr1;
> 		break;
> +	case SPRN_MCSRR0:
> +		*spr_val =3D vcpu->arch.mcsrr0;
> +		break;
> +	case SPRN_MCSRR1:
> +		*spr_val =3D vcpu->arch.mcsrr1;
> +		break;
> 	case SPRN_DBSR:
> 		*spr_val =3D vcpu->arch.dbsr;
> 		break;
> diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
> index c392d26..670f63d 100644
> --- a/arch/powerpc/kvm/timing.c
> +++ b/arch/powerpc/kvm/timing.c
> @@ -129,6 +129,7 @@ static const char =
*kvm_exit_names[__NUMBER_OF_KVM_EXIT_TYPES] =3D {
> 	[EMULATED_TLBSX_EXITS] =3D    "EMUL_TLBSX",
> 	[EMULATED_TLBWE_EXITS] =3D    "EMUL_TLBWE",
> 	[EMULATED_RFI_EXITS] =3D      "EMUL_RFI",
> +	[EMULATED_RFMCI_EXITS] =3D    "EMUL_RFMCI",
> 	[DEC_EXITS] =3D               "DEC",
> 	[EXT_INTR_EXITS] =3D          "EXTINT",
> 	[HALT_WAKEUP] =3D             "HALT",
> --=20
> 1.7.3.4
>=20
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-08 18:45   ` Alexander Graf
@ 2013-07-09 17:16     ` Scott Wood
  2013-07-09 17:46       ` Alexander Graf
  0 siblings, 1 reply; 16+ messages in thread
From: Scott Wood @ 2013-07-09 17:16 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc

On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
>=20
> On 03.07.2013, at 15:30, Mihai Caraman wrote:
>=20
> > Some guests are making use of return from machine check instruction
> > to do crazy things even though the 64-bit kernel doesn't handle yet
> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction =20
> accordingly.
> >
> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> > ---
> > arch/powerpc/include/asm/kvm_host.h |    1 +
> > arch/powerpc/kvm/booke_emulate.c    |   25 +++++++++++++++++++++++++
> > arch/powerpc/kvm/timing.c           |    1 +
> > 3 files changed, 27 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/kvm_host.h =20
> b/arch/powerpc/include/asm/kvm_host.h
> > index af326cd..0466789 100644
> > --- a/arch/powerpc/include/asm/kvm_host.h
> > +++ b/arch/powerpc/include/asm/kvm_host.h
> > @@ -148,6 +148,7 @@ enum kvm_exit_types {
> > 	EMULATED_TLBWE_EXITS,
> > 	EMULATED_RFI_EXITS,
> > 	EMULATED_RFCI_EXITS,
> > +	EMULATED_RFMCI_EXITS,
>=20
> I would quite frankly prefer to see us abandon the whole exit timing =20
> framework in the kernel and instead use trace points. Then we don't =20
> have to maintain all of this randomly exercised code.

Would this map well to tracepoints?  We're not trying to track discrete =20
events, so much as accumulated time spent in different areas.

-Scott=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 17:16     ` Scott Wood
@ 2013-07-09 17:46       ` Alexander Graf
  2013-07-09 18:29         ` Scott Wood
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Graf @ 2013-07-09 17:46 UTC (permalink / raw)
  To: Scott Wood; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc

On 07/09/2013 07:16 PM, Scott Wood wrote:
> On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
>>
>> On 03.07.2013, at 15:30, Mihai Caraman wrote:
>>
>> > Some guests are making use of return from machine check instruction
>> > to do crazy things even though the 64-bit kernel doesn't handle yet
>> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction 
>> accordingly.
>> >
>> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>> > ---
>> > arch/powerpc/include/asm/kvm_host.h |    1 +
>> > arch/powerpc/kvm/booke_emulate.c    |   25 +++++++++++++++++++++++++
>> > arch/powerpc/kvm/timing.c           |    1 +
>> > 3 files changed, 27 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/arch/powerpc/include/asm/kvm_host.h 
>> b/arch/powerpc/include/asm/kvm_host.h
>> > index af326cd..0466789 100644
>> > --- a/arch/powerpc/include/asm/kvm_host.h
>> > +++ b/arch/powerpc/include/asm/kvm_host.h
>> > @@ -148,6 +148,7 @@ enum kvm_exit_types {
>> >     EMULATED_TLBWE_EXITS,
>> >     EMULATED_RFI_EXITS,
>> >     EMULATED_RFCI_EXITS,
>> > +    EMULATED_RFMCI_EXITS,
>>
>> I would quite frankly prefer to see us abandon the whole exit timing 
>> framework in the kernel and instead use trace points. Then we don't 
>> have to maintain all of this randomly exercised code.
>
> Would this map well to tracepoints?  We're not trying to track 
> discrete events, so much as accumulated time spent in different areas.

I think so. We'd just have to emit tracepoints as soon as we enter 
handle_exit and in prepare_to_enter. Then a user space program should 
have everything it needs to create statistics out of that. It would 
certainly simplify the entry/exit path.


Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 17:46       ` Alexander Graf
@ 2013-07-09 18:29         ` Scott Wood
  2013-07-09 21:49           ` Alexander Graf
  0 siblings, 1 reply; 16+ messages in thread
From: Scott Wood @ 2013-07-09 18:29 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc

On 07/09/2013 12:46:32 PM, Alexander Graf wrote:
> On 07/09/2013 07:16 PM, Scott Wood wrote:
>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
>>>=20
>>> On 03.07.2013, at 15:30, Mihai Caraman wrote:
>>>=20
>>> > Some guests are making use of return from machine check =20
>>> instruction
>>> > to do crazy things even though the 64-bit kernel doesn't handle =20
>>> yet
>>> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction =20
>>> accordingly.
>>> >
>>> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>>> > ---
>>> > arch/powerpc/include/asm/kvm_host.h |    1 +
>>> > arch/powerpc/kvm/booke_emulate.c    |   25 =20
>>> +++++++++++++++++++++++++
>>> > arch/powerpc/kvm/timing.c           |    1 +
>>> > 3 files changed, 27 insertions(+), 0 deletions(-)
>>> >
>>> > diff --git a/arch/powerpc/include/asm/kvm_host.h =20
>>> b/arch/powerpc/include/asm/kvm_host.h
>>> > index af326cd..0466789 100644
>>> > --- a/arch/powerpc/include/asm/kvm_host.h
>>> > +++ b/arch/powerpc/include/asm/kvm_host.h
>>> > @@ -148,6 +148,7 @@ enum kvm_exit_types {
>>> >     EMULATED_TLBWE_EXITS,
>>> >     EMULATED_RFI_EXITS,
>>> >     EMULATED_RFCI_EXITS,
>>> > +    EMULATED_RFMCI_EXITS,
>>>=20
>>> I would quite frankly prefer to see us abandon the whole exit =20
>>> timing framework in the kernel and instead use trace points. Then =20
>>> we don't have to maintain all of this randomly exercised code.
>>=20
>> Would this map well to tracepoints?  We're not trying to track =20
>> discrete events, so much as accumulated time spent in different =20
>> areas.
>=20
> I think so. We'd just have to emit tracepoints as soon as we enter =20
> handle_exit and in prepare_to_enter. Then a user space program should =20
> have everything it needs to create statistics out of that. It would =20
> certainly simplify the entry/exit path.

I was hoping that wasn't going to be your answer. :-)

Such a change would introduce a new dependency, more complexity, and =20
the possibility for bad totals to result from a ring buffer filling =20
faster than userspace can drain it.

I also don't see how it would simplify entry/exit, since we'd still =20
need to take timestamps in the same places, in order to record a final =20
event that says how long a particular event took.

-Scott=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 18:29         ` Scott Wood
@ 2013-07-09 21:49           ` Alexander Graf
  2013-07-09 21:54             ` Scott Wood
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Graf @ 2013-07-09 21:49 UTC (permalink / raw)
  To: Scott Wood; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc


On 09.07.2013, at 20:29, Scott Wood wrote:

> On 07/09/2013 12:46:32 PM, Alexander Graf wrote:
>> On 07/09/2013 07:16 PM, Scott Wood wrote:
>>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
>>>> On 03.07.2013, at 15:30, Mihai Caraman wrote:
>>>> > Some guests are making use of return from machine check =
instruction
>>>> > to do crazy things even though the 64-bit kernel doesn't handle =
yet
>>>> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction =
accordingly.
>>>> >
>>>> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>>>> > ---
>>>> > arch/powerpc/include/asm/kvm_host.h |    1 +
>>>> > arch/powerpc/kvm/booke_emulate.c    |   25 =
+++++++++++++++++++++++++
>>>> > arch/powerpc/kvm/timing.c           |    1 +
>>>> > 3 files changed, 27 insertions(+), 0 deletions(-)
>>>> >
>>>> > diff --git a/arch/powerpc/include/asm/kvm_host.h =
b/arch/powerpc/include/asm/kvm_host.h
>>>> > index af326cd..0466789 100644
>>>> > --- a/arch/powerpc/include/asm/kvm_host.h
>>>> > +++ b/arch/powerpc/include/asm/kvm_host.h
>>>> > @@ -148,6 +148,7 @@ enum kvm_exit_types {
>>>> >     EMULATED_TLBWE_EXITS,
>>>> >     EMULATED_RFI_EXITS,
>>>> >     EMULATED_RFCI_EXITS,
>>>> > +    EMULATED_RFMCI_EXITS,
>>>> I would quite frankly prefer to see us abandon the whole exit =
timing framework in the kernel and instead use trace points. Then we =
don't have to maintain all of this randomly exercised code.
>>> Would this map well to tracepoints?  We're not trying to track =
discrete events, so much as accumulated time spent in different areas.
>> I think so. We'd just have to emit tracepoints as soon as we enter =
handle_exit and in prepare_to_enter. Then a user space program should =
have everything it needs to create statistics out of that. It would =
certainly simplify the entry/exit path.
>=20
> I was hoping that wasn't going to be your answer. :-)
>=20
> Such a change would introduce a new dependency, more complexity, and =
the possibility for bad totals to result from a ring buffer filling =
faster than userspace can drain it.

Well, at least it would allow for optional tracing :). Today you have to =
change a compile flag to enable / disable timing stats.

>=20
> I also don't see how it would simplify entry/exit, since we'd still =
need to take timestamps in the same places, in order to record a final =
event that says how long a particular event took.

Not sure I understand. What the timing stats do is that they measure the =
time between [exit ... entry], right? We'd do the same thing, just all =
in C code. That means we would become slightly less accurate, but gain =
dynamic enabling of the traces and get rid of all the timing stat asm =
code.


Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 21:49           ` Alexander Graf
@ 2013-07-09 21:54             ` Scott Wood
  2013-07-09 22:00               ` Alexander Graf
  0 siblings, 1 reply; 16+ messages in thread
From: Scott Wood @ 2013-07-09 21:54 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc

On 07/09/2013 04:49:32 PM, Alexander Graf wrote:
>=20
> On 09.07.2013, at 20:29, Scott Wood wrote:
>=20
> > On 07/09/2013 12:46:32 PM, Alexander Graf wrote:
> >> On 07/09/2013 07:16 PM, Scott Wood wrote:
> >>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
> >>>> On 03.07.2013, at 15:30, Mihai Caraman wrote:
> >>>> > Some guests are making use of return from machine check =20
> instruction
> >>>> > to do crazy things even though the 64-bit kernel doesn't =20
> handle yet
> >>>> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction =20
> accordingly.
> >>>> >
> >>>> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
> >>>> > ---
> >>>> > arch/powerpc/include/asm/kvm_host.h |    1 +
> >>>> > arch/powerpc/kvm/booke_emulate.c    |   25 =20
> +++++++++++++++++++++++++
> >>>> > arch/powerpc/kvm/timing.c           |    1 +
> >>>> > 3 files changed, 27 insertions(+), 0 deletions(-)
> >>>> >
> >>>> > diff --git a/arch/powerpc/include/asm/kvm_host.h =20
> b/arch/powerpc/include/asm/kvm_host.h
> >>>> > index af326cd..0466789 100644
> >>>> > --- a/arch/powerpc/include/asm/kvm_host.h
> >>>> > +++ b/arch/powerpc/include/asm/kvm_host.h
> >>>> > @@ -148,6 +148,7 @@ enum kvm_exit_types {
> >>>> >     EMULATED_TLBWE_EXITS,
> >>>> >     EMULATED_RFI_EXITS,
> >>>> >     EMULATED_RFCI_EXITS,
> >>>> > +    EMULATED_RFMCI_EXITS,
> >>>> I would quite frankly prefer to see us abandon the whole exit =20
> timing framework in the kernel and instead use trace points. Then we =20
> don't have to maintain all of this randomly exercised code.
> >>> Would this map well to tracepoints?  We're not trying to track =20
> discrete events, so much as accumulated time spent in different areas.
> >> I think so. We'd just have to emit tracepoints as soon as we enter =20
> handle_exit and in prepare_to_enter. Then a user space program should =20
> have everything it needs to create statistics out of that. It would =20
> certainly simplify the entry/exit path.
> >
> > I was hoping that wasn't going to be your answer. :-)
> >
> > Such a change would introduce a new dependency, more complexity, =20
> and the possibility for bad totals to result from a ring buffer =20
> filling faster than userspace can drain it.
>=20
> Well, at least it would allow for optional tracing :). Today you have =20
> to change a compile flag to enable / disable timing stats.
>=20
> >
> > I also don't see how it would simplify entry/exit, since we'd still =20
> need to take timestamps in the same places, in order to record a =20
> final event that says how long a particular event took.
>=20
> Not sure I understand. What the timing stats do is that they measure =20
> the time between [exit ... entry], right? We'd do the same thing, =20
> just all in C code. That means we would become slightly less =20
> accurate, but gain dynamic enabling of the traces and get rid of all =20
> the timing stat asm code.

Compile-time enabling bothers me less than a loss of accuracy (not just =20
a small loss by moving into C code, but a potential for a large loss if =20
we overflow the buffer) and a dependency on a userspace tool (both in =20
terms of the tool needing to be written, and in the hassle of ensuring =20
that it's present in the root filesystem of whatever system I'm =20
testing).  And the whole mechanism will be more complicated.

Lots of debug options are enabled at build time; why must this be =20
different?

-Scott=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 21:54             ` Scott Wood
@ 2013-07-09 22:00               ` Alexander Graf
  2013-07-09 22:26                 ` Scott Wood
  2013-07-09 23:50                 ` Steven Rostedt
  0 siblings, 2 replies; 16+ messages in thread
From: Alexander Graf @ 2013-07-09 22:00 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Mihai Caraman, linuxppc-dev,
	<kvm@vger.kernel.org> list, <kvm-ppc@vger.kernel.org>


On 09.07.2013, at 23:54, Scott Wood wrote:

> On 07/09/2013 04:49:32 PM, Alexander Graf wrote:
>> On 09.07.2013, at 20:29, Scott Wood wrote:
>> > On 07/09/2013 12:46:32 PM, Alexander Graf wrote:
>> >> On 07/09/2013 07:16 PM, Scott Wood wrote:
>> >>> On 07/08/2013 01:45:58 PM, Alexander Graf wrote:
>> >>>> On 03.07.2013, at 15:30, Mihai Caraman wrote:
>> >>>> > Some guests are making use of return from machine check =
instruction
>> >>>> > to do crazy things even though the 64-bit kernel doesn't =
handle yet
>> >>>> > this interrupt. Emulate MCSRR0/1 SPR and rfmci instruction =
accordingly.
>> >>>> >
>> >>>> > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
>> >>>> > ---
>> >>>> > arch/powerpc/include/asm/kvm_host.h |    1 +
>> >>>> > arch/powerpc/kvm/booke_emulate.c    |   25 =
+++++++++++++++++++++++++
>> >>>> > arch/powerpc/kvm/timing.c           |    1 +
>> >>>> > 3 files changed, 27 insertions(+), 0 deletions(-)
>> >>>> >
>> >>>> > diff --git a/arch/powerpc/include/asm/kvm_host.h =
b/arch/powerpc/include/asm/kvm_host.h
>> >>>> > index af326cd..0466789 100644
>> >>>> > --- a/arch/powerpc/include/asm/kvm_host.h
>> >>>> > +++ b/arch/powerpc/include/asm/kvm_host.h
>> >>>> > @@ -148,6 +148,7 @@ enum kvm_exit_types {
>> >>>> >     EMULATED_TLBWE_EXITS,
>> >>>> >     EMULATED_RFI_EXITS,
>> >>>> >     EMULATED_RFCI_EXITS,
>> >>>> > +    EMULATED_RFMCI_EXITS,
>> >>>> I would quite frankly prefer to see us abandon the whole exit =
timing framework in the kernel and instead use trace points. Then we =
don't have to maintain all of this randomly exercised code.
>> >>> Would this map well to tracepoints?  We're not trying to track =
discrete events, so much as accumulated time spent in different areas.
>> >> I think so. We'd just have to emit tracepoints as soon as we enter =
handle_exit and in prepare_to_enter. Then a user space program should =
have everything it needs to create statistics out of that. It would =
certainly simplify the entry/exit path.
>> >
>> > I was hoping that wasn't going to be your answer. :-)
>> >
>> > Such a change would introduce a new dependency, more complexity, =
and the possibility for bad totals to result from a ring buffer filling =
faster than userspace can drain it.
>> Well, at least it would allow for optional tracing :). Today you have =
to change a compile flag to enable / disable timing stats.
>> >
>> > I also don't see how it would simplify entry/exit, since we'd still =
need to take timestamps in the same places, in order to record a final =
event that says how long a particular event took.
>> Not sure I understand. What the timing stats do is that they measure =
the time between [exit ... entry], right? We'd do the same thing, just =
all in C code. That means we would become slightly less accurate, but =
gain dynamic enabling of the traces and get rid of all the timing stat =
asm code.
>=20
> Compile-time enabling bothers me less than a loss of accuracy (not =
just a small loss by moving into C code, but a potential for a large =
loss if we overflow the buffer)

Then don't overflow the buffer. Make it large enough. IIRC ftrace =
improved recently to dynamically increase the buffer size too.

Steven, do I remember correctly here?

> and a dependency on a userspace tool

We already have that for kvm_stat. It's a simple python script - and you =
surely have python on your rootfs, no?

> (both in terms of the tool needing to be written, and in the hassle of =
ensuring that it's present in the root filesystem of whatever system I'm =
testing).  And the whole mechanism will be more complicated.

It'll also be more flexible at the same time. You could take the logs =
and actually check what's going on to debug issues that you're =
encountering for example.

We could even go as far as sharing the same tool with other =
architectures, so that we only have to learn how to debug things once.

> Lots of debug options are enabled at build time; why must this be =
different?

Because I think it's valuable as debug tool for cases where compile time =
switches are not the best way of debugging things. It's not a high =
profile thing to tackle for me tbh, but I don't really think working =
heavily on the timing stat thing is the correct path to walk along.


Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 22:00               ` Alexander Graf
@ 2013-07-09 22:26                 ` Scott Wood
  2013-07-10  0:00                   ` Steven Rostedt
  2013-07-10 10:23                   ` Alexander Graf
  2013-07-09 23:50                 ` Steven Rostedt
  1 sibling, 2 replies; 16+ messages in thread
From: Scott Wood @ 2013-07-09 22:26 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Steven Rostedt, Mihai Caraman, linuxppc-dev,
	<kvm@vger.kernel.org> list, <kvm-ppc@vger.kernel.org>

On 07/09/2013 05:00:26 PM, Alexander Graf wrote:
>=20
> On 09.07.2013, at 23:54, Scott Wood wrote:
>=20
> > On 07/09/2013 04:49:32 PM, Alexander Graf wrote:
> >> Not sure I understand. What the timing stats do is that they =20
> measure the time between [exit ... entry], right? We'd do the same =20
> thing, just all in C code. That means we would become slightly less =20
> accurate, but gain dynamic enabling of the traces and get rid of all =20
> the timing stat asm code.
> >
> > Compile-time enabling bothers me less than a loss of accuracy (not =20
> just a small loss by moving into C code, but a potential for a large =20
> loss if we overflow the buffer)
>=20
> Then don't overflow the buffer. Make it large enough.

How large is that?  Does the tool recognize and report when overflow =20
happens?

How much will the overhead of running some python script on the host, =20
consuming a large volume of data, affect the results?

> IIRC ftrace improved recently to dynamically increase the buffer size =20
> too.
>=20
> Steven, do I remember correctly here?

Yay more complexity.

So now we get to worry about possible memory allocations happening when =20
we try to log something?  Or if there is a way to do an "atomic" log, =20
we're back to the "buffer might be full" situation.

> > and a dependency on a userspace tool
>=20
> We already have that for kvm_stat. It's a simple python script - and =20
> you surely have python on your rootfs, no?
>=20
> > (both in terms of the tool needing to be written, and in the hassle =20
> of ensuring that it's present in the root filesystem of whatever =20
> system I'm testing).  And the whole mechanism will be more =20
> complicated.
>=20
> It'll also be more flexible at the same time. You could take the logs =20
> and actually check what's going on to debug issues that you're =20
> encountering for example.
>=20
> We could even go as far as sharing the same tool with other =20
> architectures, so that we only have to learn how to debug things once.

Have you encountered an actual need for this flexibility, or is it =20
theoretical?

Is there common infrastructure for dealing with measuring intervals and =20
tracking statistics thereof, rather than just tracking points and =20
letting userspace connect the dots (though it could still do that as an =20
option)?  Even if it must be done in userspace, it doesn't seem like =20
something that should be KVM-specific.

> > Lots of debug options are enabled at build time; why must this be =20
> different?
>=20
> Because I think it's valuable as debug tool for cases where compile =20
> time switches are not the best way of debugging things. It's not a =20
> high profile thing to tackle for me tbh, but I don't really think =20
> working heavily on the timing stat thing is the correct path to walk =20
> along.

Adding new exit types isn't "working heavily" on it.

-Scott=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 22:00               ` Alexander Graf
  2013-07-09 22:26                 ` Scott Wood
@ 2013-07-09 23:50                 ` Steven Rostedt
  1 sibling, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2013-07-09 23:50 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Scott Wood, Mihai Caraman, linuxppc-dev,
	<kvm@vger.kernel.org> list, <kvm-ppc@vger.kernel.org>

On Wed, 2013-07-10 at 00:00 +0200, Alexander Graf wrote:

> Then don't overflow the buffer. Make it large enough. IIRC ftrace improved recently to dynamically increase the buffer size too.
> 
> Steven, do I remember correctly here?

Not really. Ftrace only dynamically increases the buffer when the trace
is first used. Other than that, the size is static. I also wouldn't
suggest allocating the buffer when needed as that has the overhead of
allocating memory.

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 22:26                 ` Scott Wood
@ 2013-07-10  0:00                   ` Steven Rostedt
  2013-07-10 10:23                   ` Alexander Graf
  1 sibling, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2013-07-10  0:00 UTC (permalink / raw)
  To: Scott Wood
  Cc: Mauro Carvalho Chehab, <kvm@vger.kernel.org> list,
	Alexander Graf, <kvm-ppc@vger.kernel.org>, Mihai Caraman,
	linuxppc-dev

On Tue, 2013-07-09 at 17:26 -0500, Scott Wood wrote:
> On 07/09/2013 05:00:26 PM, Alexander Graf wrote:
> > 
> > On 09.07.2013, at 23:54, Scott Wood wrote:
> > 
> > > On 07/09/2013 04:49:32 PM, Alexander Graf wrote:
> > >> Not sure I understand. What the timing stats do is that they  
> > measure the time between [exit ... entry], right? We'd do the same  
> > thing, just all in C code. That means we would become slightly less  
> > accurate, but gain dynamic enabling of the traces and get rid of all  
> > the timing stat asm code.
> > >
> > > Compile-time enabling bothers me less than a loss of accuracy (not  
> > just a small loss by moving into C code, but a potential for a large  
> > loss if we overflow the buffer)
> > 
> > Then don't overflow the buffer. Make it large enough.
> 
> How large is that?  Does the tool recognize and report when overflow  
> happens?

Note, the ftrace buffers allow you to see when overflow does happen.

> 
> How much will the overhead of running some python script on the host,  
> consuming a large volume of data, affect the results?

This doesn't need to be python, and you can read the buffers in binary
as well. Mauro wrote a tool that uses ftrace for MCE errors. You can
probably do something similar. I need to get the code that reads ftrace
binary buffers out as a library.


> 
> > IIRC ftrace improved recently to dynamically increase the buffer size  
> > too.

What did change was that you can create buffers for your own use.

> > 
> > Steven, do I remember correctly here?
> 
> Yay more complexity.

What? Is ftrace complex? ;-)

> 
> So now we get to worry about possible memory allocations happening when  
> we try to log something?  Or if there is a way to do an "atomic" log,  
> we're back to the "buffer might be full" situation.

Nope, ftrace doesn't do dynamic allocation here.

-- Steve

> 
> > > and a dependency on a userspace tool
> > 
> > We already have that for kvm_stat. It's a simple python script - and  
> > you surely have python on your rootfs, no?
> > 
> > > (both in terms of the tool needing to be written, and in the hassle  
> > of ensuring that it's present in the root filesystem of whatever  
> > system I'm testing).  And the whole mechanism will be more  
> > complicated.
> > 
> > It'll also be more flexible at the same time. You could take the logs  
> > and actually check what's going on to debug issues that you're  
> > encountering for example.
> > 
> > We could even go as far as sharing the same tool with other  
> > architectures, so that we only have to learn how to debug things once.
> 
> Have you encountered an actual need for this flexibility, or is it  
> theoretical?
> 
> Is there common infrastructure for dealing with measuring intervals and  
> tracking statistics thereof, rather than just tracking points and  
> letting userspace connect the dots (though it could still do that as an  
> option)?  Even if it must be done in userspace, it doesn't seem like  
> something that should be KVM-specific.
> 
> > > Lots of debug options are enabled at build time; why must this be  
> > different?
> > 
> > Because I think it's valuable as debug tool for cases where compile  
> > time switches are not the best way of debugging things. It's not a  
> > high profile thing to tackle for me tbh, but I don't really think  
> > working heavily on the timing stat thing is the correct path to walk  
> > along.
> 
> Adding new exit types isn't "working heavily" on it.
> 
> -Scott

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-09 22:26                 ` Scott Wood
  2013-07-10  0:00                   ` Steven Rostedt
@ 2013-07-10 10:23                   ` Alexander Graf
  2013-07-10 18:24                     ` Scott Wood
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Graf @ 2013-07-10 10:23 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Mihai Caraman, linuxppc-dev,
	<kvm@vger.kernel.org> list, <kvm-ppc@vger.kernel.org>


On 10.07.2013, at 00:26, Scott Wood wrote:

> On 07/09/2013 05:00:26 PM, Alexander Graf wrote:
>> On 09.07.2013, at 23:54, Scott Wood wrote:
>> > On 07/09/2013 04:49:32 PM, Alexander Graf wrote:
>> >> Not sure I understand. What the timing stats do is that they =
measure the time between [exit ... entry], right? We'd do the same =
thing, just all in C code. That means we would become slightly less =
accurate, but gain dynamic enabling of the traces and get rid of all the =
timing stat asm code.
>> >
>> > Compile-time enabling bothers me less than a loss of accuracy (not =
just a small loss by moving into C code, but a potential for a large =
loss if we overflow the buffer)
>> Then don't overflow the buffer. Make it large enough.
>=20
> How large is that?  Does the tool recognize and report when overflow =
happens?
>=20
> How much will the overhead of running some python script on the host, =
consuming a large volume of data, affect the results?
>=20
>> IIRC ftrace improved recently to dynamically increase the buffer size =
too.
>> Steven, do I remember correctly here?
>=20
> Yay more complexity.
>=20
> So now we get to worry about possible memory allocations happening =
when we try to log something?  Or if there is a way to do an "atomic" =
log, we're back to the "buffer might be full" situation.
>=20
>> > and a dependency on a userspace tool
>> We already have that for kvm_stat. It's a simple python script - and =
you surely have python on your rootfs, no?
>> > (both in terms of the tool needing to be written, and in the hassle =
of ensuring that it's present in the root filesystem of whatever system =
I'm testing).  And the whole mechanism will be more complicated.
>> It'll also be more flexible at the same time. You could take the logs =
and actually check what's going on to debug issues that you're =
encountering for example.
>> We could even go as far as sharing the same tool with other =
architectures, so that we only have to learn how to debug things once.
>=20
> Have you encountered an actual need for this flexibility, or is it =
theoretical?

Yeah, first thing I did back then to actually debug kvm failures was to =
add trace points.

> Is there common infrastructure for dealing with measuring intervals =
and tracking statistics thereof, rather than just tracking points and =
letting userspace connect the dots (though it could still do that as an =
option)?  Even if it must be done in userspace, it doesn't seem like =
something that should be KVM-specific.

Would you like to have different ways of measuring mm subsystem =
overhead? I don't :). The same goes for KVM really. If we could converge =
towards a single user space interface to get exit timings, it'd make =
debugging a lot easier.

We already have this for the debugfs counters btw. And the timing =
framework does break kvm_stat today already, as it emits textual stats =
rather than numbers which all of the other debugfs stats do. But at =
least I can take the x86 kvm_stat tool and run it on ppc just fine to =
see exit stats.

>=20
>> > Lots of debug options are enabled at build time; why must this be =
different?
>> Because I think it's valuable as debug tool for cases where compile =
time switches are not the best way of debugging things. It's not a high =
profile thing to tackle for me tbh, but I don't really think working =
heavily on the timing stat thing is the correct path to walk along.
>=20
> Adding new exit types isn't "working heavily" on it.

No, but the fact that the first patch is a fix to add exit stats for =
exits that we missed out before doesn't give me a lot of confidence that =
lots of people use timing stats. And I am always very weary of #ifdef'ed =
code, as it blows up the test matrix heavily.


Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-10 10:23                   ` Alexander Graf
@ 2013-07-10 18:24                     ` Scott Wood
  2013-07-10 22:47                       ` Alexander Graf
  0 siblings, 1 reply; 16+ messages in thread
From: Scott Wood @ 2013-07-10 18:24 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Steven Rostedt, Mihai Caraman, linuxppc-dev,
	<kvm@vger.kernel.org> list, <kvm-ppc@vger.kernel.org>

On 07/10/2013 05:23:36 AM, Alexander Graf wrote:
>=20
> On 10.07.2013, at 00:26, Scott Wood wrote:
>=20
> > On 07/09/2013 05:00:26 PM, Alexander Graf wrote:
> >> It'll also be more flexible at the same time. You could take the =20
> logs and actually check what's going on to debug issues that you're =20
> encountering for example.
> >> We could even go as far as sharing the same tool with other =20
> architectures, so that we only have to learn how to debug things once.
> >
> > Have you encountered an actual need for this flexibility, or is it =20
> theoretical?
>=20
> Yeah, first thing I did back then to actually debug kvm failures was =20
> to add trace points.

I meant specifically for handling exit timings this way.

> > Is there common infrastructure for dealing with measuring intervals =20
> and tracking statistics thereof, rather than just tracking points and =20
> letting userspace connect the dots (though it could still do that as =20
> an option)?  Even if it must be done in userspace, it doesn't seem =20
> like something that should be KVM-specific.
>=20
> Would you like to have different ways of measuring mm subsystem =20
> overhead? I don't :). The same goes for KVM really. If we could =20
> converge towards a single user space interface to get exit timings, =20
> it'd make debugging a lot easier.

I agree -- that's why I said it doesn't seem like something that should =20
be KVM-specific.  But that's orthogonal to whether it's done in kernel =20
space or user space.  The ability to get begin/end events from =20
userspace would be nice when it is specifically requested, but it would =20
also be nice if the kernel could track some basic statistics so we =20
wouldn't have to ship so much data around to arrive at the same result.

At the very least, I'd like such a tool/infrastructure to exist before =20
we start complaining about doing minor maintenance of the current =20
mechanism.

> We already have this for the debugfs counters btw. And the timing =20
> framework does break kvm_stat today already, as it emits textual =20
> stats rather than numbers which all of the other debugfs stats do. =20
> But at least I can take the x86 kvm_stat tool and run it on ppc just =20
> fine to see exit stats.

We already have what?  The last two sentences seem contradictory -- can =20
you or can't you use kvm_stat as is?  I'm not familiar with kvm_stat.

What does x86 KVM expose in debugfs?

> >> > Lots of debug options are enabled at build time; why must this =20
> be different?
> >> Because I think it's valuable as debug tool for cases where =20
> compile time switches are not the best way of debugging things. It's =20
> not a high profile thing to tackle for me tbh, but I don't really =20
> think working heavily on the timing stat thing is the correct path to =20
> walk along.
> >
> > Adding new exit types isn't "working heavily" on it.
>=20
> No, but the fact that the first patch is a fix to add exit stats for =20
> exits that we missed out before doesn't give me a lot of confidence =20
> that lots of people use timing stats. And I am always very weary of =20
> #ifdef'ed code, as it blows up the test matrix heavily.

I used it quite a lot when I was doing KVM performance work.  It's just =20
been a while since I last did that.

-Scott=

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction
  2013-07-10 18:24                     ` Scott Wood
@ 2013-07-10 22:47                       ` Alexander Graf
  0 siblings, 0 replies; 16+ messages in thread
From: Alexander Graf @ 2013-07-10 22:47 UTC (permalink / raw)
  To: Scott Wood
  Cc: Steven Rostedt, Mihai Caraman, linuxppc-dev,
	<kvm@vger.kernel.org> list, <kvm-ppc@vger.kernel.org>


On 10.07.2013, at 20:24, Scott Wood wrote:

> On 07/10/2013 05:23:36 AM, Alexander Graf wrote:
>> On 10.07.2013, at 00:26, Scott Wood wrote:
>> > On 07/09/2013 05:00:26 PM, Alexander Graf wrote:
>> >> It'll also be more flexible at the same time. You could take the =
logs and actually check what's going on to debug issues that you're =
encountering for example.
>> >> We could even go as far as sharing the same tool with other =
architectures, so that we only have to learn how to debug things once.
>> >
>> > Have you encountered an actual need for this flexibility, or is it =
theoretical?
>> Yeah, first thing I did back then to actually debug kvm failures was =
to add trace points.
>=20
> I meant specifically for handling exit timings this way.

No, but I did encounter the need for debugging exits. And the only thing =
we would need to get exit timing stats once we already have trace points =
for exit types would be to have trace points for guest entry and maybe =
type specific events to indicate what the exit is about as well.

>=20
>> > Is there common infrastructure for dealing with measuring intervals =
and tracking statistics thereof, rather than just tracking points and =
letting userspace connect the dots (though it could still do that as an =
option)?  Even if it must be done in userspace, it doesn't seem like =
something that should be KVM-specific.
>> Would you like to have different ways of measuring mm subsystem =
overhead? I don't :). The same goes for KVM really. If we could converge =
towards a single user space interface to get exit timings, it'd make =
debugging a lot easier.
>=20
> I agree -- that's why I said it doesn't seem like something that =
should be KVM-specific.  But that's orthogonal to whether it's done in =
kernel space or user space.  The ability to get begin/end events from =
userspace would be nice when it is specifically requested, but it would =
also be nice if the kernel could track some basic statistics so we =
wouldn't have to ship so much data around to arrive at the same result.
>=20
> At the very least, I'd like such a tool/infrastructure to exist before =
we start complaining about doing minor maintenance of the current =
mechanism.

I admit that I don't fully understand qemu/scripts/kvm/kvm_stat, but it =
seems to me as if it already does pretty much what we want. It sets up a =
filter to only get events and their time stamps through.

It does use normal exit trace points on x86 to replace the old debugfs =
based stat counters. And it seems to work reasonably well for that.

>=20
>> We already have this for the debugfs counters btw. And the timing =
framework does break kvm_stat today already, as it emits textual stats =
rather than numbers which all of the other debugfs stats do. But at =
least I can take the x86 kvm_stat tool and run it on ppc just fine to =
see exit stats.
>=20
> We already have what?  The last two sentences seem contradictory -- =
can you or can't you use kvm_stat as is?  I'm not familiar with =
kvm_stat.

Kvm_stat back in the day used debugfs to give you an idea on what exit =
event happens most often. That mechanism got replaced by trace points =
later which the current kvm_stat uses.

I still have a copy of the old kvm_stat that I always use to get a first =
feeling for what goes wrong if something goes wrong. The original code =
couldn't deal with the fact that we have a debugfs file that contains =
text though. I patched it locally. It also works just fine if you simply =
disable timing stats, since then you won't have the text file.

> What does x86 KVM expose in debugfs?

The same thing it always exposed - exit stats. I am fairly sure Avi =
wanted to completely deprecate that interface in favor of the trace =
point based approach, but I don't think he ever got around to it.


Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-07-10 22:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-07-03 13:30 [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array Mihai Caraman
2013-07-03 13:30 ` [PATCH 2/2] KVM: PPC: Book3E: Emulate MCSRR0/1 SPR and rfmci instruction Mihai Caraman
2013-07-08 18:45   ` Alexander Graf
2013-07-09 17:16     ` Scott Wood
2013-07-09 17:46       ` Alexander Graf
2013-07-09 18:29         ` Scott Wood
2013-07-09 21:49           ` Alexander Graf
2013-07-09 21:54             ` Scott Wood
2013-07-09 22:00               ` Alexander Graf
2013-07-09 22:26                 ` Scott Wood
2013-07-10  0:00                   ` Steven Rostedt
2013-07-10 10:23                   ` Alexander Graf
2013-07-10 18:24                     ` Scott Wood
2013-07-10 22:47                       ` Alexander Graf
2013-07-09 23:50                 ` Steven Rostedt
2013-07-08 18:39 ` [PATCH 1/2] KVM: PPC: Fix kvm_exit_names array Alexander Graf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).