LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] dtc: Implement -d option to write out a dependency file
From: Stephen Warren @ 2012-01-09 18:38 UTC (permalink / raw)
  To: Michal Marek, Jon Loeliger, David Gibson
  Cc: Jonas Bonn, Michal Simek, Russell King, linux-c6x-dev,
	Arnd Bergmann, Aurelien Jacquiot, Devicetree Discuss,
	microblaze-uclinux, linux-kbuild, linux-kernel, Rob Herring,
	Paul Mackerras, Mark Salter, Stephen Warren, linuxppc-dev, linux,
	linux-arm-kernel

This will allow callers to rebuild .dtb files when any of the /include/d
.dtsi files are modified, not just the top-level .dts file.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
---
This patch is against the Linux kernel's copy of dtc, but it applies to
upstream dtc with a couple of trivial conflicts. I can post a version for
upstream dtc as well if desired.

I have only tested this series for ARM.

 scripts/dtc/dtc.c    |   22 +++++++++++++++++++++-
 scripts/dtc/srcpos.c |    6 ++++++
 scripts/dtc/srcpos.h |    1 +
 3 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/scripts/dtc/dtc.c b/scripts/dtc/dtc.c
index cbc0193..27ecf06 100644
--- a/scripts/dtc/dtc.c
+++ b/scripts/dtc/dtc.c
@@ -71,6 +71,7 @@ static void  __attribute__ ((noreturn)) usage(void)
 	fprintf(stderr, "\t\t\tasm - assembler source\n");
 	fprintf(stderr, "\t-V <output version>\n");
 	fprintf(stderr, "\t\tBlob version to produce, defaults to %d (relevant for dtb\n\t\tand asm output only)\n", DEFAULT_FDT_VERSION);
+	fprintf(stderr, "\t-d <output dependency file>\n");
 	fprintf(stderr, "\t-R <number>\n");
 	fprintf(stderr, "\t\tMake space for <number> reserve map entries (relevant for \n\t\tdtb and asm output only)\n");
 	fprintf(stderr, "\t-S <bytes>\n");
@@ -99,6 +100,7 @@ int main(int argc, char *argv[])
 	const char *inform = "dts";
 	const char *outform = "dts";
 	const char *outname = "-";
+	const char *depname = NULL;
 	int force = 0, check = 0, sort = 0;
 	const char *arg;
 	int opt;
@@ -111,7 +113,8 @@ int main(int argc, char *argv[])
 	minsize    = 0;
 	padsize    = 0;
 
-	while ((opt = getopt(argc, argv, "hI:O:o:V:R:S:p:fcqb:vH:s")) != EOF) {
+	while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fcqb:vH:s"))
+			!= EOF) {
 		switch (opt) {
 		case 'I':
 			inform = optarg;
@@ -125,6 +128,9 @@ int main(int argc, char *argv[])
 		case 'V':
 			outversion = strtol(optarg, NULL, 0);
 			break;
+		case 'd':
+			depname = optarg;
+			break;
 		case 'R':
 			reservenum = strtol(optarg, NULL, 0);
 			break;
@@ -188,6 +194,15 @@ int main(int argc, char *argv[])
 	fprintf(stderr, "DTC: %s->%s  on file \"%s\"\n",
 		inform, outform, arg);
 
+	if (depname) {
+		depfile = fopen(depname, "w");
+		if (!depfile)
+			die("Couldn't open dependency file %s: %s\n", depname,
+			    strerror(errno));
+		fputs(outname, depfile);
+		fputc(':', depfile);
+	}
+
 	if (streq(inform, "dts"))
 		bi = dt_from_source(arg);
 	else if (streq(inform, "fs"))
@@ -197,6 +212,11 @@ int main(int argc, char *argv[])
 	else
 		die("Unknown input format \"%s\"\n", inform);
 
+	if (depfile) {
+		fputc('\n', depfile);
+		fclose(depfile);
+	}
+
 	if (cmdline_boot_cpuid != -1)
 		bi->boot_cpuid_phys = cmdline_boot_cpuid;
 
diff --git a/scripts/dtc/srcpos.c b/scripts/dtc/srcpos.c
index 2dbc874..93b3533 100644
--- a/scripts/dtc/srcpos.c
+++ b/scripts/dtc/srcpos.c
@@ -40,6 +40,7 @@ static char *dirname(const char *path)
 	return NULL;
 }
 
+FILE *depfile; /* = NULL */
 struct srcfile_state *current_srcfile; /* = NULL */
 
 /* Detect infinite include recursion. */
@@ -67,6 +68,11 @@ FILE *srcfile_relative_open(const char *fname, char **fullnamep)
 			    strerror(errno));
 	}
 
+	if (depfile) {
+		fputc(' ', depfile);
+		fputs(fullname, depfile);
+	}
+
 	if (fullnamep)
 		*fullnamep = fullname;
 	else
diff --git a/scripts/dtc/srcpos.h b/scripts/dtc/srcpos.h
index bd7966e..ce980ca 100644
--- a/scripts/dtc/srcpos.h
+++ b/scripts/dtc/srcpos.h
@@ -30,6 +30,7 @@ struct srcfile_state {
 	struct srcfile_state *prev;
 };
 
+extern FILE *depfile; /* = NULL */
 extern struct srcfile_state *current_srcfile; /* = NULL */
 
 FILE *srcfile_relative_open(const char *fname, char **fullnamep);
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH 2/2] Kbuild: Use dtc's -d (dependency) option
From: Stephen Warren @ 2012-01-09 18:38 UTC (permalink / raw)
  To: Michal Marek, Jon Loeliger, David Gibson
  Cc: Jonas Bonn, Michal Simek, Russell King, linux-c6x-dev,
	Arnd Bergmann, Aurelien Jacquiot, Devicetree Discuss,
	microblaze-uclinux, linux-kbuild, linux-kernel, Rob Herring,
	Paul Mackerras, Mark Salter, Stephen Warren, linuxppc-dev, linux,
	linux-arm-kernel
In-Reply-To: <1326134295-15547-1-git-send-email-swarren@nvidia.com>

This hooks dtc into Kbuild's dependency system.

Thus, for example, "make dtbs" will rebuild tegra-harmony.dtb if only
tegra20.dtsi has changed yet tegra-harmony.dts has not. The previous
lack of this feature recently caused me to have very confusing "git
bisect" results.

For ARM, it's obvious what to add to $(targets). I'm not familiar enough
with other architectures to know what to add there. Powerpc appears to
already add various .dtb files into $(targets), but the other archs may
need something added to $(targets) to work.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
---
I have only tested this series for ARM.

 arch/arm/boot/Makefile        |    6 ++++--
 arch/c6x/boot/Makefile        |    2 +-
 arch/microblaze/boot/Makefile |    2 +-
 arch/openrisc/boot/Makefile   |    4 ++--
 arch/powerpc/boot/Makefile    |    4 ++--
 scripts/Makefile.lib          |    2 +-
 6 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/arm/boot/Makefile b/arch/arm/boot/Makefile
index 1338cf0..c72730d 100644
--- a/arch/arm/boot/Makefile
+++ b/arch/arm/boot/Makefile
@@ -62,9 +62,11 @@ $(obj)/zImage-dtb.%:	$(obj)/%.dtb $(obj)/zImage
 
 endif
 
+targets += $(dtb-y)
+
 # Rule to build device tree blobs
-$(obj)/%.dtb: $(src)/dts/%.dts
-	$(call cmd,dtc)
+$(obj)/%.dtb: $(src)/dts/%.dts FORCE
+	$(call if_changed_dep,dtc)
 
 $(obj)/dtbs: $(addprefix $(obj)/, $(dtb-y))
 
diff --git a/arch/c6x/boot/Makefile b/arch/c6x/boot/Makefile
index ecca820..6891257 100644
--- a/arch/c6x/boot/Makefile
+++ b/arch/c6x/boot/Makefile
@@ -13,7 +13,7 @@ obj-y += linked_dtb.o
 endif
 
 $(obj)/%.dtb: $(src)/dts/%.dts FORCE
-	$(call cmd,dtc)
+	$(call if_changed_dep,dtc)
 
 quiet_cmd_cp = CP      $< $@$2
 	cmd_cp = cat $< >$@$2 || (rm -f $@ && echo false)
diff --git a/arch/microblaze/boot/Makefile b/arch/microblaze/boot/Makefile
index 4c4e58e..0c796cf 100644
--- a/arch/microblaze/boot/Makefile
+++ b/arch/microblaze/boot/Makefile
@@ -53,6 +53,6 @@ $(obj)/simpleImage.%: vmlinux FORCE
 DTC_FLAGS := -p 1024
 
 $(obj)/%.dtb: $(src)/dts/%.dts FORCE
-	$(call cmd,dtc)
+	$(call if_changed_dep,dtc)
 
 clean-files += *.dtb simpleImage.*.unstrip linux.bin.ub
diff --git a/arch/openrisc/boot/Makefile b/arch/openrisc/boot/Makefile
index 98ca185..0995835 100644
--- a/arch/openrisc/boot/Makefile
+++ b/arch/openrisc/boot/Makefile
@@ -11,5 +11,5 @@ clean-files := *.dtb.S
 
 #DTC_FLAGS ?= -p 1024
 
-$(obj)/%.dtb: $(src)/dts/%.dts
-	$(call cmd,dtc)
+$(obj)/%.dtb: $(src)/dts/%.dts FORCE
+	$(call if_changed_dep,dtc)
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 15986e7..8844a17 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -345,8 +345,8 @@ $(obj)/treeImage.%: vmlinux $(obj)/%.dtb $(wrapperbits)
 	$(call if_changed,wrap,treeboot-$*,,$(obj)/$*.dtb)
 
 # Rule to build device tree blobs
-$(obj)/%.dtb: $(src)/dts/%.dts
-	$(call cmd,dtc)
+$(obj)/%.dtb: $(src)/dts/%.dts FORCE
+	$(call if_changed_dep,dtc)
 
 # If there isn't a platform selected then just strip the vmlinux.
 ifeq (,$(image-y))
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 5d986d9..7bae316 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -264,7 +264,7 @@ $(obj)/%.dtb.S: $(obj)/%.dtb
 	$(call cmd,dt_S_dtb)
 
 quiet_cmd_dtc = DTC     $@
-cmd_dtc = $(objtree)/scripts/dtc/dtc -O dtb -o $@ -b 0 $(DTC_FLAGS) $<
+cmd_dtc = $(objtree)/scripts/dtc/dtc -O dtb -o $@ -b 0 $(DTC_FLAGS) -d $(depfile) $<
 
 # Bzip2
 # ---------------------------------------------------------------------------
-- 
1.7.0.4

^ permalink raw reply related

* Re: [RFC PATCH 01/16] powerpc/booke: Set CPU_FTR_DEBUG_LVL_EXC on 32-bit
From: Scott Wood @ 2012-01-09 19:14 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <C57ECE6B-1040-40BC-B7C7-5107533A2DC2@suse.de>

On 01/09/2012 09:21 AM, Alexander Graf wrote:
> 
> On 21.12.2011, at 02:34, Scott Wood wrote:
> 
>> Currently 32-bit only cares about this for choice of exception
>> vector, which is done in core-specific code.  However, KVM will
>> want to distinguish as well.
>>
>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>> ---
>> arch/powerpc/include/asm/cputable.h |    5 +++--
>> 1 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
>> index e30442c..033ad30 100644
>> --- a/arch/powerpc/include/asm/cputable.h
>> +++ b/arch/powerpc/include/asm/cputable.h
>> @@ -375,7 +375,8 @@ extern const char *powerpc_base_platform;
>> #define CPU_FTRS_47X	(CPU_FTRS_440x6)
>> #define CPU_FTRS_E200	(CPU_FTR_USE_TB | CPU_FTR_SPE_COMP | \
>> 	    CPU_FTR_NODSISRALIGN | CPU_FTR_COHERENT_ICACHE | \
>> -	    CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_NOEXECUTE)
>> +	    CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_NOEXECUTE | \
>> +	    CPU_FTR_DEBUG_LVL_EXC)
> 
> KVM on E200?

This isn't a KVM patch, it's a patch to make CPU_FTR_DEBUG_LVL_EXC be
set properly on 32-bit chips.  e200 has this CPU feature.

-Scott

^ permalink raw reply

* Re: [RFC PATCH 16/16] KVM: PPC: e500mc support
From: Scott Wood @ 2012-01-09 19:29 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Liu Yu, kvm, agraf, kvm-ppc, Varun Sethi, linuxppc-dev
In-Reply-To: <4F0B16D7.2000709@redhat.com>

On 01/09/2012 10:33 AM, Avi Kivity wrote:
> On 12/21/2011 03:34 AM, Scott Wood wrote:
>> Add processor support for e500mc, using hardware virtualization support
>> (GS-mode).
>>
>> Current issues include:
>>  - No support for external proxy (coreint) interrupt mode in the guest.
>>
>> Includes work by Ashish Kalra <Ashish.Kalra@freescale.com>,
>> Varun Sethi <Varun.Sethi@freescale.com>, and
>> Liu Yu <yu.liu@freescale.com>.
>>
> 
> Best to include their signoffs, if possible.

These patches are based in part on a bunch of different patches from
these people (for which I did receive signoffs).  I was reluctant to put
their signoff directly on the new patches, since I didn't want to make
it look like they had submitted the patch in anything resembling its
current form.  I wanted to give them credit for what they did, but not
blame for what I did with their code.

I've CCed Varun and Liu so they can sign off these versions of the
patches if they wish.  Ashish no longer works at Freescale, so I don't
have a currently valid e-mail address for him.

-Scott

^ permalink raw reply

* Re: [PATCH 1/2] dtc: Implement -d option to write out a dependency file
From: Jon Loeliger @ 2012-01-09 18:58 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Michal Marek, Jonas Bonn, Michal Simek, Russell King,
	linux-c6x-dev, Arnd Bergmann, Aurelien Jacquiot,
	Devicetree Discuss, linux-kbuild, linux-kernel, Rob Herring,
	Paul Mackerras, Mark Salter, microblaze-uclinux, linuxppc-dev,
	linux, linux-arm-kernel, David Gibson
In-Reply-To: <1326134295-15547-1-git-send-email-swarren@nvidia.com>

> This will allow callers to rebuild .dtb files when any of the /include/d
> .dtsi files are modified, not just the top-level .dts file.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>
> ---
> This patch is against the Linux kernel's copy of dtc, but it applies to
> upstream dtc with a couple of trivial conflicts. I can post a version for
> upstream dtc as well if desired.

If it does go upstream into Linux proper, we should
definitely have it in the main DTC repository.
So if you would, please send me a path that applies there.

Thanks,
jdl

^ permalink raw reply

* Re: [SDK v1.2][PATCH 1/2 v3] powerpc/85xx: Add dts for P1021RDB-PC board
From: Scott Wood @ 2012-01-09 20:54 UTC (permalink / raw)
  To: Xu Jiucheng; +Cc: Matthew McClintock, linuxppc-dev
In-Reply-To: <1326092022-10085-1-git-send-email-B37781@freescale.com>

On 01/09/2012 12:53 AM, Xu Jiucheng wrote:
> +	nand@1,0 {
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		compatible = "fsl,p1020-fcm-nand",
> +			     "fsl,elbc-fcm-nand";

s/p1020/p1021/

-Scott

^ permalink raw reply

* Re: [PATCH] powerpc/85xx: Add P1024rdb dts support
From: Scott Wood @ 2012-01-09 21:03 UTC (permalink / raw)
  To: b29983; +Cc: Tang Yuantian, linuxppc-dev
In-Reply-To: <1326098258-32097-1-git-send-email-b29983@freescale.com>

On 01/09/2012 02:37 AM, b29983@freescale.com wrote:
> +/include/ "p1024rdb.dtsi"
> +/include/ "fsl/p1020si-post.dtsi"

Is p1024 100% software-compatible with p1020?

They have different manuals...

-Scott

^ permalink raw reply

* Re: [PATCH v2 3/3] KVM: PPC: epapr: install ev_idle hcall for e500 guest
From: Scott Wood @ 2012-01-09 21:15 UTC (permalink / raw)
  To: Liu Yu; +Cc: kvm, kvm-ppc, agraf, linuxppc-dev, timur
In-Reply-To: <1325754412-29963-2-git-send-email-yu.liu@freescale.com>

On 01/05/2012 03:06 AM, Liu Yu wrote:
> diff --git a/arch/powerpc/kernel/idle_e500.S b/arch/powerpc/kernel/idle_e500.S
> index 3e2b95c..6ea95f0 100644
> --- a/arch/powerpc/kernel/idle_e500.S
> +++ b/arch/powerpc/kernel/idle_e500.S
> @@ -85,6 +85,23 @@ END_FTR_SECTION_IFSET(CPU_FTR_L2CSR|CPU_FTR_CAN_NAP)
>  2:	b	2b
>  #endif /* !E500MC */
>  
> +#ifdef CONFIG_KVM_GUEST
> +/*
> + * r3 contains the pointer to in[8]
> + * r4 contains the pointer to out[8]
> + * r5 contains the hcall vendor and nr
> + * r6 contains the handler which send hcall
> + */
> +_GLOBAL(e500_ev_idle)
> +	rlwinm	r7,r1,0,0,31-THREAD_SHIFT	/* current thread_info */
> +	lwz	r8,TI_LOCAL_FLAGS(r7)	/* set napping bit */
> +	ori	r8,r8,_TLF_NAPPING	/* so when we take an exception */
> +	stw	r8,TI_LOCAL_FLAGS(r7)	/* it will return to our caller */
> +	wrteei	1
> +	mtctr	r6
> +	bctr
> +#endif /* KVM_GUEST */

You'll need to branch back to the hcall invocation in an infinite loop
-- the only way we should leave is via an interrupt.

> +static void kvm_hcall_idle(void)
> +{
> +#ifdef CONFIG_KVM_E500
> +	ulong in[8];
> +	ulong out[8];
> +
> +	e500_ev_idle(in, out, HC_VENDOR_EPAPR | HC_EV_IDLE, kvm_hypercall);
> +#endif
> +}

kvm_hypercall is C code.  As stated before, you cannot use C code while
_TLF_NAPPING is set.

> +static bool kvm_para_has_idle(void)
> +{
> +#ifdef CONFIG_BOOKE
> +	return epapr_hcall_has_idle;
> +#else
> +	return false;
> +#endif
> +}
> +
>  static int __init kvm_guest_init(void)
>  {
>  	if (!kvm_para_available())
> @@ -594,6 +614,10 @@ static int __init kvm_guest_init(void)
>  	powersave_nap = 1;
>  #endif
>  
> +	/* Install hcall based power_save for guest kernel */
> +	if (kvm_para_has_idle())
> +		ppc_md.power_save = kvm_hcall_idle;

Why did you only move it halfway out of KVM code?  ePAPR features such
as idle hcall should work on any ePAPR hypervisor, even with all KVM
code disabled.

Plus everything Alex said. :-)

-Scott

^ permalink raw reply

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood @ 2012-01-09 21:48 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <F0432B7A-E742-48B0-8016-3F2D0A800553@suse.de>

On 01/09/2012 11:48 AM, Alexander Graf wrote:
> 
> On 21.12.2011, at 02:34, Scott Wood wrote:
>> +#ifdef CONFIG_PPC_FPU
>> +	/* Save userspace FPU state in stack */
>> +	enable_kernel_fp();
>> +	memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr));
>> +	fpscr = current->thread.fpscr.val;
>> +	fpexc_mode = current->thread.fpexc_mode;
>> +
>> +	/* Restore guest FPU state to thread */
>> +	memcpy(current->thread.fpr, vcpu->arch.fpr, sizeof(vcpu->arch.fpr));
>> +	current->thread.fpscr.val = vcpu->arch.fpscr;
>> +
>> +	/*
>> +	 * Since we can't trap on MSR_FP in GS-mode, we consider the guest
>> +	 * as always using the FPU.  Kernel usage of FP (via
>> +	 * enable_kernel_fp()) in this thread must not occur while
>> +	 * vcpu->fpu_active is set.
>> +	 */
>> +	vcpu->fpu_active = 1;
>> +
>> +	kvmppc_load_guest_fp(vcpu);
>> +#endif
> 
> Do you think it's possible to combine this with the book3s_pr code, so we don't duplicate too much here?

book3s_pr is a bit different in that it can trap when the guest sets
MSR[FP].

Maybe a few lines could be factored out (the first memcpy, fpscr,
fpexc_mode).  I'm not sure that it makes sense given the lack of
isolation between what it's doing and what the rest of the code is doing.

>> +/*
>> + * Load up guest vcpu FP state if it's needed.
>> + * It also set the MSR_FP in thread so that host know
>> + * we're holding FPU, and then host can help to save
>> + * guest vcpu FP state if other threads require to use FPU.
>> + * This simulates an FP unavailable fault.
>> + *
>> + * It requires to be called with preemption disabled.
>> + */
>> +static inline void kvmppc_load_guest_fp(struct kvm_vcpu *vcpu)
>> +{
>> +#ifdef CONFIG_PPC_FPU
>> +	if (vcpu->fpu_active && !(current->thread.regs->msr & MSR_FP)) {
>> +		load_up_fpu();
>> +		current->thread.regs->msr |= MSR_FP;
> 
> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).

shadow_msr is the real MSR.

shared->msr is the guest's view of MSR.

current->thread.regs->msr is nominally userspace's MSR.  In this case we
use it to tell host Linux that FP is in use and must be saved on context
switch.  The actual userspace MSR_FP is known to be clear at this point
because we called enable_kernel_fp().  It will be clear again when we
return to userspace because we'll call giveup_fpu().

-Scott

^ permalink raw reply

* [git pull] Please pull powerpc.git merge branch
From: Kumar Gala @ 2012-01-09 21:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev

The following changes since commit a0e86bd4252519321b0d102dc4ed90557aa7bee9:

  audit: always follow va_copy() with va_end() (2012-01-08 14:15:21 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git merge

Michael Neuling (1):
      powerpc: fix compile error with 85xx/p1022_ds.c

 arch/powerpc/platforms/85xx/p1022_ds.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

^ permalink raw reply

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Alexander Graf @ 2012-01-09 22:17 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B60A5.1040208@freescale.com>


On 09.01.2012, at 22:48, Scott Wood wrote:

> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>=20
>> On 21.12.2011, at 02:34, Scott Wood wrote:
>>> +#ifdef CONFIG_PPC_FPU
>>> +	/* Save userspace FPU state in stack */
>>> +	enable_kernel_fp();
>>> +	memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr));
>>> +	fpscr =3D current->thread.fpscr.val;
>>> +	fpexc_mode =3D current->thread.fpexc_mode;
>>> +
>>> +	/* Restore guest FPU state to thread */
>>> +	memcpy(current->thread.fpr, vcpu->arch.fpr, =
sizeof(vcpu->arch.fpr));
>>> +	current->thread.fpscr.val =3D vcpu->arch.fpscr;
>>> +
>>> +	/*
>>> +	 * Since we can't trap on MSR_FP in GS-mode, we consider the =
guest
>>> +	 * as always using the FPU.  Kernel usage of FP (via
>>> +	 * enable_kernel_fp()) in this thread must not occur while
>>> +	 * vcpu->fpu_active is set.
>>> +	 */
>>> +	vcpu->fpu_active =3D 1;
>>> +
>>> +	kvmppc_load_guest_fp(vcpu);
>>> +#endif
>>=20
>> Do you think it's possible to combine this with the book3s_pr code, =
so we don't duplicate too much here?
>=20
> book3s_pr is a bit different in that it can trap when the guest sets
> MSR[FP].

Ah, there's no doorbell? So you always have to swap fpu registers? You =
still have to enable it manually when preempting in, right? IIRC ppc32 =
does lazy fpu activation.

> Maybe a few lines could be factored out (the first memcpy, fpscr,
> fpexc_mode).  I'm not sure that it makes sense given the lack of
> isolation between what it's doing and what the rest of the code is =
doing.

Yeah, looking at the code it does look pretty different. Too bad - I =
would've hoped to throw the vmx code in as well so we could get =
vmx/vsx/whatever for free later.

>=20
>>> +/*
>>> + * Load up guest vcpu FP state if it's needed.
>>> + * It also set the MSR_FP in thread so that host know
>>> + * we're holding FPU, and then host can help to save
>>> + * guest vcpu FP state if other threads require to use FPU.
>>> + * This simulates an FP unavailable fault.
>>> + *
>>> + * It requires to be called with preemption disabled.
>>> + */
>>> +static inline void kvmppc_load_guest_fp(struct kvm_vcpu *vcpu)
>>> +{
>>> +#ifdef CONFIG_PPC_FPU
>>> +	if (vcpu->fpu_active && !(current->thread.regs->msr & MSR_FP)) {
>>> +		load_up_fpu();
>>> +		current->thread.regs->msr |=3D MSR_FP;
>>=20
>> I'm having a hard time to grasp when shared->msr, shadow_msr and =
regs->msr is used in your code :).
>=20
> shadow_msr is the real MSR.
>=20
> shared->msr is the guest's view of MSR.
>=20
> current->thread.regs->msr is nominally userspace's MSR.  In this case =
we
> use it to tell host Linux that FP is in use and must be saved on =
context
> switch.  The actual userspace MSR_FP is known to be clear at this =
point
> because we called enable_kernel_fp().  It will be clear again when we
> return to userspace because we'll call giveup_fpu().

Ah, this is thread.regs, not vcpu.regs. Sorry, I misread that part. This =
way it obviously makes a lot more sense.


Alex

^ permalink raw reply

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood @ 2012-01-09 22:39 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <08DCCAC0-82FA-46FC-974C-E7CCA53474A3@suse.de>

On 01/09/2012 04:17 PM, Alexander Graf wrote:
> 
> On 09.01.2012, at 22:48, Scott Wood wrote:
> 
>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>
>>> Do you think it's possible to combine this with the book3s_pr code, so we don't duplicate too much here?
>>
>> book3s_pr is a bit different in that it can trap when the guest sets
>> MSR[FP].
> 
> Ah, there's no doorbell? So you always have to swap fpu registers? You still have to enable it manually when preempting in, right? IIRC ppc32 does lazy fpu activation.

Right.

Preempting in is handled by calling kvmppc_load_guest_fp() (which should
be renamed to be booke-specific, since the semantics are tied to
booke.c) from kvmppc_core_vcpu_load() in e500mc.c.

>>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>>
>> shadow_msr is the real MSR.
>>
>> shared->msr is the guest's view of MSR.

Correction -- this applies to PR-mode (e500v2).

In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
silently prevents it from modifying certain bits), which gets saved on
exit into shared->msr.

-Scott

^ permalink raw reply

* Re: Mac address in the DT
From: Wolfgang Denk @ 2012-01-09 22:40 UTC (permalink / raw)
  To: smitha.vanga; +Cc: scottwood, linuxppc-dev
In-Reply-To: <40631E9A2581F14BA60888C87A76A1FE01D360@HYD-MKD-MBX4.wipro.com>

Dear smitha.vanga@wipro.com,

In message <40631E9A2581F14BA60888C87A76A1FE01D360@HYD-MKD-MBX4.wipro.com> you wrote:
>  
> >Setenv set_mac 'cp 0xffec0000 0x100000 1024;fdt addr 0xc00000 8192;fdt set=
>  /soc8272@f0000000/ethernet@24000 mac-address "[00 44 00 55 00 66]";erase 0x=
> ffec0000 0xffec4000;cp 0xc00000 0xffec0000 1024;bootm 0xfe060000 - 0xffec000=
> 0'
> 
> > run set_mac
> 
> For the above command I want to replace the mac address with the ethaddr> How
> do I do that. I tried $ethaddr but I get extra : characters.

Why would you do that at all?  U-Boot will run fdt_fixup_ethernet()
for all (at least AFAICT) supported architectures, which will replace 
"mac-address" and "local-mac-address" for all "ethernet?" interfaces
in the device tree for which a corresponding "ethadd" / "eth?addr"
environment variable is set.

No additional actions are needed.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
"One day," said a dull voice from down below, "I'm going to  be  back
in  form again and you're going to be very sorry you said that. For a
very long time. I might even go so far as to make even more Time just
for you to be sorry in."              - Terry Pratchett, _Small Gods_

^ permalink raw reply

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Alexander Graf @ 2012-01-09 22:47 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B6C91.1090102@freescale.com>


On 09.01.2012, at 23:39, Scott Wood wrote:

> On 01/09/2012 04:17 PM, Alexander Graf wrote:
>>=20
>> On 09.01.2012, at 22:48, Scott Wood wrote:
>>=20
>>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>>=20
>>>> Do you think it's possible to combine this with the book3s_pr code, =
so we don't duplicate too much here?
>>>=20
>>> book3s_pr is a bit different in that it can trap when the guest sets
>>> MSR[FP].
>>=20
>> Ah, there's no doorbell? So you always have to swap fpu registers? =
You still have to enable it manually when preempting in, right? IIRC =
ppc32 does lazy fpu activation.
>=20
> Right.
>=20
> Preempting in is handled by calling kvmppc_load_guest_fp() (which =
should
> be renamed to be booke-specific, since the semantics are tied to
> booke.c) from kvmppc_core_vcpu_load() in e500mc.c.

Ah, and that one's called on sched_in. All is well then :).

>=20
>>>> I'm having a hard time to grasp when shared->msr, shadow_msr and =
regs->msr is used in your code :).
>>>=20
>>> shadow_msr is the real MSR.
>>>=20
>>> shared->msr is the guest's view of MSR.
>=20
> Correction -- this applies to PR-mode (e500v2).
>=20
> In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
> silently prevents it from modifying certain bits), which gets saved on
> exit into shared->msr.

Hrm. Can we maybe #ifdef out shadow_msr on HV then? I'm really getting =
confused with having 3 potential msr variables in the vcpu struct.


Alex

^ permalink raw reply

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood @ 2012-01-09 22:54 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <96605829-1BD3-4668-AEB8-7D27775C2868@suse.de>

On 01/09/2012 04:47 PM, Alexander Graf wrote:
> 
> On 09.01.2012, at 23:39, Scott Wood wrote:
> 
>> On 01/09/2012 04:17 PM, Alexander Graf wrote:
>>>
>>> On 09.01.2012, at 22:48, Scott Wood wrote:
>>>
>>>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>>> I'm having a hard time to grasp when shared->msr, shadow_msr and regs->msr is used in your code :).
>>>>
>>>> shadow_msr is the real MSR.
>>>>
>>>> shared->msr is the guest's view of MSR.
>>
>> Correction -- this applies to PR-mode (e500v2).
>>
>> In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
>> silently prevents it from modifying certain bits), which gets saved on
>> exit into shared->msr.
> 
> Hrm. Can we maybe #ifdef out shadow_msr on HV then? I'm really getting confused with having 3 potential msr variables in the vcpu struct.

An ifdef would take us further down the road of not being able to
support both in the same kernel image (not sure whether that's a
long-term goal -- probably won't happen any time soon with e500v2+e500mc
even disregarding KVM, but maybe it'll be relevant on some other chips),
and in general increase the mess in the struct definition.  How about a
comment?

-Scott

^ permalink raw reply

* Re: [RFC PATCH 15/16] KVM: PPC: booke: standard PPC floating point support
From: Alexander Graf @ 2012-01-09 22:56 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B700B.7020006@freescale.com>


On 09.01.2012, at 23:54, Scott Wood wrote:

> On 01/09/2012 04:47 PM, Alexander Graf wrote:
>>=20
>> On 09.01.2012, at 23:39, Scott Wood wrote:
>>=20
>>> On 01/09/2012 04:17 PM, Alexander Graf wrote:
>>>>=20
>>>> On 09.01.2012, at 22:48, Scott Wood wrote:
>>>>=20
>>>>> On 01/09/2012 11:48 AM, Alexander Graf wrote:
>>>>>> I'm having a hard time to grasp when shared->msr, shadow_msr and =
regs->msr is used in your code :).
>>>>>=20
>>>>> shadow_msr is the real MSR.
>>>>>=20
>>>>> shared->msr is the guest's view of MSR.
>>>=20
>>> Correction -- this applies to PR-mode (e500v2).
>>>=20
>>> In GS-mode, shadow_msr is not used.  The guest sees the real MSR (hw
>>> silently prevents it from modifying certain bits), which gets saved =
on
>>> exit into shared->msr.
>>=20
>> Hrm. Can we maybe #ifdef out shadow_msr on HV then? I'm really =
getting confused with having 3 potential msr variables in the vcpu =
struct.
>=20
> An ifdef would take us further down the road of not being able to
> support both in the same kernel image (not sure whether that's a
> long-term goal -- probably won't happen any time soon with =
e500v2+e500mc
> even disregarding KVM, but maybe it'll be relevant on some other =
chips),
> and in general increase the mess in the struct definition.  How about =
a
> comment?

Well, I'd like to make sure we don't accidentally access the wrong =
field. But yes, a comment should be ok.

Alex

^ permalink raw reply

* [PATCH] powerpc: Fix RCU idle and hcall tracing
From: Anton Blanchard @ 2012-01-10  0:29 UTC (permalink / raw)
  To: benh, paulus, paulmck; +Cc: linuxppc-dev


Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
region. Since pSeries calls H_CEDE in the idle loop, we were violating
this rule.

commit a7b152d5342c (powerpc: Tell RCU about idle after hcall tracing)
tried to work around it by delaying the rcu_idle_enter until after we
called the hcall tracepoint, but there are a number of issues with it.

The hcall tracepoint trampoline code is called conditionally when the
tracepoint is enabled. If the tracepoint is not enabled we never call
rcu_idle_enter. The idle_uses_rcu check was also done at compile time
which breaks multiplatform builds.

The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
and the hypervisor dispatch trace log to work out if we called H_CEDE.

This fixes a hang during boot on pSeries.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: linux-build/arch/powerpc/kernel/idle.c
===================================================================
--- linux-build.orig/arch/powerpc/kernel/idle.c	2012-01-10 11:07:22.091615183 +1100
+++ linux-build/arch/powerpc/kernel/idle.c	2012-01-10 11:07:57.172264229 +1100
@@ -50,12 +50,6 @@ static int __init powersave_off(char *ar
 }
 __setup("powersave=off", powersave_off);
 
-#if defined(CONFIG_PPC_PSERIES) && defined(CONFIG_TRACEPOINTS)
-static const bool idle_uses_rcu = 1;
-#else
-static const bool idle_uses_rcu;
-#endif
-
 /*
  * The body of the idle task.
  */
@@ -67,8 +61,7 @@ void cpu_idle(void)
 	set_thread_flag(TIF_POLLING_NRFLAG);
 	while (1) {
 		tick_nohz_idle_enter();
-		if (!idle_uses_rcu)
-			rcu_idle_enter();
+		rcu_idle_enter();
 
 		while (!need_resched() && !cpu_should_die()) {
 			ppc64_runlatch_off();
@@ -106,8 +99,7 @@ void cpu_idle(void)
 
 		HMT_medium();
 		ppc64_runlatch_on();
-		if (!idle_uses_rcu)
-			rcu_idle_exit();
+		rcu_idle_exit();
 		tick_nohz_idle_exit();
 		preempt_enable_no_resched();
 		if (cpu_should_die())
Index: linux-build/arch/powerpc/platforms/pseries/lpar.c
===================================================================
--- linux-build.orig/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:07:22.079614961 +1100
+++ linux-build/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:16:55.710226236 +1100
@@ -546,6 +546,13 @@ void __trace_hcall_entry(unsigned long o
 	unsigned long flags;
 	unsigned int *depth;
 
+	/*
+	 * We cannot call tracepoints inside RCU idle regions which
+	 * means we must not trace H_CEDE.
+	 */
+	if (opcode == H_CEDE)
+		return;
+
 	local_irq_save(flags);
 
 	depth = &__get_cpu_var(hcall_trace_depth);
@@ -556,8 +563,6 @@ void __trace_hcall_entry(unsigned long o
 	(*depth)++;
 	preempt_disable();
 	trace_hcall_entry(opcode, args);
-	if (opcode == H_CEDE)
-		rcu_idle_enter();
 	(*depth)--;
 
 out:
@@ -570,6 +575,9 @@ void __trace_hcall_exit(long opcode, uns
 	unsigned long flags;
 	unsigned int *depth;
 
+	if (opcode == H_CEDE)
+		return;
+
 	local_irq_save(flags);
 
 	depth = &__get_cpu_var(hcall_trace_depth);
@@ -578,8 +586,6 @@ void __trace_hcall_exit(long opcode, uns
 		goto out;
 
 	(*depth)++;
-	if (opcode == H_CEDE)
-		rcu_idle_exit();
 	trace_hcall_exit(opcode, retval, retbuf);
 	preempt_enable();
 	(*depth)--;

^ permalink raw reply

* Re: [PATCH] powerpc: Fix RCU idle and hcall tracing
From: Paul E. McKenney @ 2012-01-10  0:43 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: paulus, linuxppc-dev
In-Reply-To: <20120110112915.387e8b1c@kryten>

On Tue, Jan 10, 2012 at 11:29:15AM +1100, Anton Blanchard wrote:
> 
> Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
> region. Since pSeries calls H_CEDE in the idle loop, we were violating
> this rule.
> 
> commit a7b152d5342c (powerpc: Tell RCU about idle after hcall tracing)
> tried to work around it by delaying the rcu_idle_enter until after we
> called the hcall tracepoint, but there are a number of issues with it.
> 
> The hcall tracepoint trampoline code is called conditionally when the
> tracepoint is enabled. If the tracepoint is not enabled we never call
> rcu_idle_enter. The idle_uses_rcu check was also done at compile time
> which breaks multiplatform builds.
> 
> The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
> and the hypervisor dispatch trace log to work out if we called H_CEDE.
> 
> This fixes a hang during boot on pSeries.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
> 
> Index: linux-build/arch/powerpc/kernel/idle.c
> ===================================================================
> --- linux-build.orig/arch/powerpc/kernel/idle.c	2012-01-10 11:07:22.091615183 +1100
> +++ linux-build/arch/powerpc/kernel/idle.c	2012-01-10 11:07:57.172264229 +1100
> @@ -50,12 +50,6 @@ static int __init powersave_off(char *ar
>  }
>  __setup("powersave=off", powersave_off);
> 
> -#if defined(CONFIG_PPC_PSERIES) && defined(CONFIG_TRACEPOINTS)
> -static const bool idle_uses_rcu = 1;
> -#else
> -static const bool idle_uses_rcu;
> -#endif
> -
>  /*
>   * The body of the idle task.
>   */
> @@ -67,8 +61,7 @@ void cpu_idle(void)
>  	set_thread_flag(TIF_POLLING_NRFLAG);
>  	while (1) {
>  		tick_nohz_idle_enter();
> -		if (!idle_uses_rcu)
> -			rcu_idle_enter();
> +		rcu_idle_enter();
> 
>  		while (!need_resched() && !cpu_should_die()) {
>  			ppc64_runlatch_off();
> @@ -106,8 +99,7 @@ void cpu_idle(void)
> 
>  		HMT_medium();
>  		ppc64_runlatch_on();
> -		if (!idle_uses_rcu)
> -			rcu_idle_exit();
> +		rcu_idle_exit();
>  		tick_nohz_idle_exit();
>  		preempt_enable_no_resched();
>  		if (cpu_should_die())
> Index: linux-build/arch/powerpc/platforms/pseries/lpar.c
> ===================================================================
> --- linux-build.orig/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:07:22.079614961 +1100
> +++ linux-build/arch/powerpc/platforms/pseries/lpar.c	2012-01-10 11:16:55.710226236 +1100
> @@ -546,6 +546,13 @@ void __trace_hcall_entry(unsigned long o
>  	unsigned long flags;
>  	unsigned int *depth;
> 
> +	/*
> +	 * We cannot call tracepoints inside RCU idle regions which
> +	 * means we must not trace H_CEDE.
> +	 */
> +	if (opcode == H_CEDE)
> +		return;
> +
>  	local_irq_save(flags);
> 
>  	depth = &__get_cpu_var(hcall_trace_depth);
> @@ -556,8 +563,6 @@ void __trace_hcall_entry(unsigned long o
>  	(*depth)++;
>  	preempt_disable();
>  	trace_hcall_entry(opcode, args);
> -	if (opcode == H_CEDE)
> -		rcu_idle_enter();
>  	(*depth)--;
> 
>  out:
> @@ -570,6 +575,9 @@ void __trace_hcall_exit(long opcode, uns
>  	unsigned long flags;
>  	unsigned int *depth;
> 
> +	if (opcode == H_CEDE)
> +		return;
> +
>  	local_irq_save(flags);
> 
>  	depth = &__get_cpu_var(hcall_trace_depth);
> @@ -578,8 +586,6 @@ void __trace_hcall_exit(long opcode, uns
>  		goto out;
> 
>  	(*depth)++;
> -	if (opcode == H_CEDE)
> -		rcu_idle_exit();
>  	trace_hcall_exit(opcode, retval, retbuf);
>  	preempt_enable();
>  	(*depth)--;
> 

^ permalink raw reply

* Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support
From: Scott Wood @ 2012-01-10  0:51 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1A656B06-E290-4B75-8F98-F8FAB8B817F1@suse.de>

On 01/09/2012 11:46 AM, Alexander Graf wrote:
> 
> On 21.12.2011, at 02:34, Scott Wood wrote:
> 
>> Chips such as e500mc that implement category E.HV in Power ISA 2.06
>> provide hardware virtualization features, including a new MSR mode for
>> guest state.  The guest OS can perform many operations without trapping
>> into the hypervisor, including transitions to and from guest userspace.
>>
>> Since we can use SRR1[GS] to reliably tell whether an exception came from
>> guest state, instead of messing around with IVPR, we use DO_KVM similarly
>> to book3s.
> 
> Is there any benefit of using DO_KVM? I would assume that messing with IVPR is faster.

Using the GS bit to decide which handler to run means we won't get
confused if a machine check or critical interrupt happens between
entering/exiting the guest and updating IVPR (we could use the IS bit
similarly in PR-mode).

This could be supplemented with IVPR (though that will add a few cycles
to guest entry/exit) or some sort of runtime patching (would be more
coarse-grained, active when any KVM guest exists) to avoid adding
overhead to traps when KVM is not used, but I'd like to quantify that
overhead first.  It should be much lower than what happens on book3s.

>> Current issues include:
>> - Machine checks from guest state are not routed to the host handler.
>> - The guest can cause a host oops by executing an emulated instruction
>>   in a page that lacks read permission.  Existing e500/4xx support has
>>   the same problem.
> 
> We solve that in book3s pr by doing
> 
>   LAST_INST = <known bad value>;
>   PACA->kvm_mode = <recover at next inst>;
>   lwz(guest pc);
>   do_more_stuff();
> 
> That way when an exception occurs at lwz() the DO_KVM handler checks that we're in kvm mode "recover" which does basically srr0+=4; rfi;.

I was thinking we'd check ESR[EPID] or SRR1[IS] as appropriate, and
treat it as a kernel fault (search exception table) -- but this works
too and is a bit cleaner (could be other uses of external pid), at the
expense of a couple extra instructions in the emulation path (but
probably a slightly faster host TLB handler).

The check wouldn't go in DO_KVM, though, since on bookehv that only
deals with diverting flow when xSRR1[GS] is set, which wouldn't be the
case here.

>> @@ -243,16 +324,20 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
>> 	case BOOKE_IRQPRIO_AP_UNAVAIL:
>> 	case BOOKE_IRQPRIO_ALIGNMENT:
>> 		allowed = 1;
>> -		msr_mask = MSR_CE|MSR_ME|MSR_DE;
>> +		msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE;
> 
> No need to do this. You already force MSR_GS in set_msr();

OK.  This was here since before set_msr() started doing that. :-)

>> +	if (!current->thread.kvm_vcpu) {
>> +		WARN(1, "no vcpu\n");
>> +		return -EPERM;
>> +	}
> 
> Huh?

Oops, leftover debugging.

>> +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
>> +{
>> +	enum emulation_result er;
>> +
>> +	er = kvmppc_emulate_instruction(run, vcpu);
>> +	switch (er) {
>> +	case EMULATE_DONE:
>> +		/* don't overwrite subtypes, just account kvm_stats */
>> +		kvmppc_account_exit_stat(vcpu, EMULATED_INST_EXITS);
>> +		/* Future optimization: only reload non-volatiles if
>> +		 * they were actually modified by emulation. */
>> +		return RESUME_GUEST_NV;
>> +
>> +	case EMULATE_DO_DCR:
>> +		run->exit_reason = KVM_EXIT_DCR;
>> +		return RESUME_HOST;
>> +
>> +	case EMULATE_FAIL:
>> +		/* XXX Deliver Program interrupt to guest. */
>> +		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
>> +		       __func__, vcpu->arch.regs.nip, vcpu->arch.last_inst);
> 
> This should be throttled, otherwise the guest can spam our logs.

Yes it should, but I'm just moving the code here.

>> +		/* For debugging, encode the failing instruction and
>> +		 * report it to userspace. */
>> +		run->hw.hardware_exit_reason = ~0ULL << 32;
>> +		run->hw.hardware_exit_reason |= vcpu->arch.last_inst;
> 
> 
> I'm fairly sure you want to fix this :)

Likewise, that's what booke.c already does.  What should it do instead?

> /**
>>  * kvmppc_handle_exit
>>  *
>> @@ -374,12 +530,39 @@ out:
>> int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>                        unsigned int exit_nr)
>> {
>> -	enum emulation_result er;
>> 	int r = RESUME_HOST;
>>
>> 	/* update before a new last_exit_type is rewritten */
>> 	kvmppc_update_timing_stats(vcpu);
>>
>> +	/*
>> +	 * If we actually care, we could copy MSR, DEAR, and ESR to regs,
>> +	 * insert an appropriate trap number, etc.
>> +	 *
>> +	 * Seems like a waste of cycles for something that should only matter
>> +	 * to someone using sysrq-t/p or similar host kernel debug facility.
>> +	 * We have other debug facilities to get that information from a
>> +	 * guest through userspace.
>> +	 */
>> +	switch (exit_nr) {
>> +	case BOOKE_INTERRUPT_EXTERNAL:
>> +		do_IRQ(&vcpu->arch.regs);
> 
> Ah, so that's what you want to use regs for. So is having a pt_regs
> struct that only contains useful register values in half its fields
> any useful here? Or could we keep control of the registers ourselves,
> enabling us to maybe one day optimize things more.

I think it contains enough to be useful for debugging code such as sysrq
and tracers, and as noted in the comment we could copy the rest if we
care enough.  MSR might be worth copying.

It will eventually be used for machine checks as well, which I'd like to
hand reasonable register state to, at least for GPRs, LR, and PC.

If there's a good enough performance reason, we could just copy
everything over for machine checks and pass NULL to do_IRQ (I think it
can take this -- a dummy regs struct if not), but it seems premature at
the moment unless the switch already causes measured performance loss
(cache utilization?).

>> @@ -387,30 +570,56 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>
>> 	switch (exit_nr) {
>> 	case BOOKE_INTERRUPT_MACHINE_CHECK:
>> -		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
>> -		kvmppc_dump_vcpu(vcpu);
>> -		r = RESUME_HOST;
>> +		kvm_resched(vcpu);
>> +		r = RESUME_GUEST;
> 
> huh?

Patch shuffling accident -- this belongs with a later patch that invokes
the host machine check handler similar to what is done with do_IRQ().
The host machine check handler needs some work first, though.

>> 		break;
>>
>> 	case BOOKE_INTERRUPT_EXTERNAL:
>> 		kvmppc_account_exit(vcpu, EXT_INTR_EXITS);
>> -		if (need_resched())
>> -			cond_resched();
>> +		kvm_resched(vcpu);
> 
> Why are we explicit about the resched? On book3s I just call kvm_resched(vcpu) before the switch().

There are a few exit types where we don't currently do the resched -- if
they're all bugs or don't-cares, we could move it out of the switch.

We probably should defer the check until after we've disabled
interrupts, similar to signals -- even if we didn't exit for an
interrupt, we could have received one after enabling them.

>> +		if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
>> +			/* The guest TLB had a mapping, but the shadow TLB
>> +			 * didn't. This could be because:
>> +			 * a) the entry is mapping the host kernel, or
>> +			 * b) the guest used a large mapping which we're faking
>> +			 * Either way, we need to satisfy the fault without
>> +			 * invoking the guest. */
>> +			kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
>> +		} else {
>> +			/* Guest mapped and leaped at non-RAM! */
>> +			kvmppc_booke_queue_irqprio(vcpu,
>> +						   BOOKE_IRQPRIO_MACHINE_CHECK);
> 
> Are you sure? Couldn't this also be MMIO? That doesn't really improve the situation as executing from MMIO is tricky with the KVM model, but it's not necessarily bad. Oh well, I guess we'll have to do something and throwing an #MC isn't all that ugly.

I think I asked you about executing from MMIO once, and you said it
wasn't supported even in straight QEMU.  Have things changed?

>> diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
>> index 05d1d99..d53bcf2 100644
>> --- a/arch/powerpc/kvm/booke.h
>> +++ b/arch/powerpc/kvm/booke.h
>> @@ -48,7 +48,20 @@
>> #define BOOKE_IRQPRIO_PERFORMANCE_MONITOR 19
>> /* Internal pseudo-irqprio for level triggered externals */
>> #define BOOKE_IRQPRIO_EXTERNAL_LEVEL 20
>> -#define BOOKE_IRQPRIO_MAX 20
>> +#define BOOKE_IRQPRIO_DBELL 21
>> +#define BOOKE_IRQPRIO_DBELL_CRIT 22
>> +#define BOOKE_IRQPRIO_MAX 23
> 
> So was MAX wrong before or is it too big now?

MAX is just a marker for how many IRQPRIOs we have, not any sort of
external limit.  This patch adds new IRQPRIOs, so MAX goes up.

The actual limit is the number of bits in a long.

>> +	.if	\flags & NEED_EMU
>> +	lwz	r9, VCPU_KVM(r4)
> 
> writing r9
> 
>> +	.endif
>> +
>> +#ifdef CONFIG_KVM_EXIT_TIMING
>> +	/* save exit time */
>> +1:	mfspr	r7, SPRN_TBRU
>> +	mfspr	r8, SPRN_TBRL
>> +	mfspr	r9, SPRN_TBRU
> 
> overwriting r9 again?

Oops.  It's RFC for a reason. :-)

>> +#ifndef CONFIG_64BIT
> 
> Double negation is always hard to read. Please reverse the ifdef :)

OK.

>> +lightweight_exit:
>> +	PPC_STL	r2, HOST_R2(r1)
>> +
>> +	mfspr	r3, SPRN_PID
>> +	stw	r3, VCPU_HOST_PID(r4)
>> +	lwz	r3, VCPU_GUEST_PID(r4)
>> +	mtspr	SPRN_PID, r3
>> +
>> +	/* Save vcpu pointer for the exception handlers
>> +	 * must be done before loading guest r2.
>> +	 */
>> +//	SET_VCPU(r4)
> 
> hm?

Can just be removed, it's handled in booke's vcpu load/put.

>> +	lwz	r6, (VCPU_SHARED_MAS2 + 4)(r11)
>> +#else
>> +	ld	r6, (VCPU_SHARED_MAS2)(r11)
>> +#endif
>> +	lwz	r7, VCPU_SHARED_MAS7_3+4(r11)
>> +	lwz	r8, VCPU_SHARED_MAS4(r11)
>> +	mtspr	SPRN_MAS0, r3
>> +	mtspr	SPRN_MAS1, r5
>> +	mtspr	SPRN_MAS2, r6
>> +	mtspr	SPRN_MAS3, r7
>> +	mtspr	SPRN_MAS4, r8
>> +	lwz	r3, VCPU_SHARED_MAS6(r11)
>> +	lwz	r5, VCPU_SHARED_MAS7_3+0(r11)
>> +	mtspr	SPRN_MAS6, r3
>> +	mtspr	SPRN_MAS7, r5
>> +	/* Disable MAS register updates via exception */
>> +	mfspr	r3, SPRN_EPCR
>> +	oris	r3, r3, SPRN_EPCR_DMIUH@h
>> +	mtspr	SPRN_EPCR, r3
> 
> Shouldn't this happen before you set the MAS registers? :)

Yes (though we really shouldn't be getting a TLB miss here, at least on
e500mc).

>> +	/* Load some guest volatiles. */
>> +	PPC_LL	r3, VCPU_LR(r4)
>> +	PPC_LL	r5, VCPU_XER(r4)
>> +	PPC_LL	r6, VCPU_CTR(r4)
>> +	PPC_LL	r7, VCPU_CR(r4)
>> +	PPC_LL	r8, VCPU_PC(r4)
>> +#ifndef CONFIG_64BIT
>> +	lwz	r9, (VCPU_SHARED_MSR + 4)(r11)
>> +#else
>> +	ld	r9, (VCPU_SHARED_MSR)(r11)
>> +#endif
>> +	PPC_LL	r0, VCPU_GPR(r0)(r4)
>> +	PPC_LL	r1, VCPU_GPR(r1)(r4)
>> +	PPC_LL	r2, VCPU_GPR(r2)(r4)
>> +	PPC_LL	r10, VCPU_GPR(r10)(r4)
>> +	PPC_LL	r11, VCPU_GPR(r11)(r4)
>> +	PPC_LL	r12, VCPU_GPR(r12)(r4)
>> +	PPC_LL	r13, VCPU_GPR(r13)(r4)
>> +	mtlr	r3
>> +	mtxer	r5
>> +	mtctr	r6
>> +	mtcr	r7
>> +	mtsrr0	r8
>> +	mtsrr1	r9
> 
> Are you sure this should be shared->msr, not shadow_msr?

Yes, we don't use shadow_msr on bookehv.  I'll add a comment in the
struct definition as discussed in the other thread, as well as other
areas where there are subtle differences between PR-mode and GS-mode.

-Scott

^ permalink raw reply

* Re: [PATCH 1/2][v2] mtd/nand:Fix wrong address read in is_blank()
From: Scott Wood @ 2012-01-10  1:07 UTC (permalink / raw)
  To: Prabhakar Kushwaha; +Cc: linux-mtd, linuxppc-dev, Poonam Aggrwal
In-Reply-To: <1326111847-13085-1-git-send-email-prabhakar@freescale.com>

On 01/09/2012 06:24 AM, Prabhakar Kushwaha wrote:
> @@ -215,12 +215,15 @@ static int is_blank(struct mtd_info *mtd, unsigned int bufnum)
>  static int check_read_ecc(struct mtd_info *mtd, struct fsl_ifc_ctrl *ctrl,
>  			  u32 *eccstat, unsigned int bufnum)
>  {
> +	struct nand_chip *chip = mtd->priv;
> +	int bufperpage = mtd->writesize / chip->ecc.size;
> +	int eccbuf_num = bufnum + (bufnum / bufperpage) * bufperpage;

This is unnecessarily complicated (and introduces two more, dependent,
runtime divisions).  I don't think there are any changes required in
this function.  You're awkwardly compensating for the fact that the
caller hasn't been updated for the new definition of bufnum.

bufperpage used in fsl_ifc_run_command should be doubled to account for
the OOB buffers.  We should probably rename it from "buf" to something
else (chunk? subpage?) to avoid confusion with bufnum_mask, which refers
to page-sized buffers.

>  	u32 reg = eccstat[bufnum / 4];
>  	int errors = (reg >> ((3 - bufnum % 4) * 8)) & 15;
>  
>  	if (errors == 15) { /* uncorrectable */
>  		/* Blank pages fail hw ECC checks */
> -		if (is_blank(mtd, bufnum))
> +		if (is_blank(mtd, eccbuf_num))
>  			return 1;
>  
>  		/*
> @@ -273,7 +276,7 @@ static void fsl_ifc_run_command(struct mtd_info *mtd)
>  		dev_err(priv->dev, "NAND Flash Write Protect Error\n");
>  
>  	if (nctrl->eccread) {
> -		int bufperpage = mtd->writesize / 512;
> +		int bufperpage = mtd->writesize / chip->ecc.size;
>  		int bufnum = (nctrl->page & priv->bufnum_mask) * bufperpage;
>  		int bufnum_end = bufnum + bufperpage - 1;
>  

Again, please calculate bufperpage (chunksperpage? subsperpage? perbuf?)
at driver init, as is done with bufnum_mask.

-Scott

^ permalink raw reply

* Re: [PATCH 2/2][v2] mtd/nand: Fix IFC driver to support 2K NAND page
From: Scott Wood @ 2012-01-10  1:10 UTC (permalink / raw)
  To: Prabhakar Kushwaha; +Cc: linux-mtd, linuxppc-dev, Poonam Aggrwal
In-Reply-To: <1326111866-13120-1-git-send-email-prabhakar@freescale.com>

On 01/09/2012 06:24 AM, Prabhakar Kushwaha wrote:
> 1) OOB area should be updated irrespective of NAND page size. Earlier it was
> updated only for 512byte NAND page.
> 
> 2) During OOB update fbcr should be equal to OOB size.
> 
> Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com>
> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
> ---
>  git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc.git (branch next)
> 
>  This patch is created on top of IFC driver patch (already floated in mailing
>  list). Please find their link:
>  http://patchwork.ozlabs.org/patch/133315/
>  http://patchwork.ozlabs.org/patch/133316/

Looks good.

-Scott

^ permalink raw reply

* Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support
From: Alexander Graf @ 2012-01-10  3:11 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <4F0B8B7D.4@freescale.com>


On 10.01.2012, at 01:51, Scott Wood wrote:

> On 01/09/2012 11:46 AM, Alexander Graf wrote:
>>=20
>> On 21.12.2011, at 02:34, Scott Wood wrote:
>>=20
>>> Chips such as e500mc that implement category E.HV in Power ISA 2.06
>>> provide hardware virtualization features, including a new MSR mode =
for
>>> guest state.  The guest OS can perform many operations without =
trapping
>>> into the hypervisor, including transitions to and from guest =
userspace.
>>>=20
>>> Since we can use SRR1[GS] to reliably tell whether an exception came =
from
>>> guest state, instead of messing around with IVPR, we use DO_KVM =
similarly
>>> to book3s.
>>=20
>> Is there any benefit of using DO_KVM? I would assume that messing =
with IVPR is faster.
>=20
> Using the GS bit to decide which handler to run means we won't get
> confused if a machine check or critical interrupt happens between
> entering/exiting the guest and updating IVPR (we could use the IS bit
> similarly in PR-mode).
>=20
> This could be supplemented with IVPR (though that will add a few =
cycles
> to guest entry/exit) or some sort of runtime patching (would be more
> coarse-grained, active when any KVM guest exists) to avoid adding
> overhead to traps when KVM is not used, but I'd like to quantify that
> overhead first.  It should be much lower than what happens on book3s.

Hrm. Yeah, given that your DO_KVM handler is so much simpler, it might =
make sense to stick with that method. Benchmarks would be useful in the =
long run though.

>=20
>>> Current issues include:
>>> - Machine checks from guest state are not routed to the host =
handler.
>>> - The guest can cause a host oops by executing an emulated =
instruction
>>>  in a page that lacks read permission.  Existing e500/4xx support =
has
>>>  the same problem.
>>=20
>> We solve that in book3s pr by doing
>>=20
>>  LAST_INST =3D <known bad value>;
>>  PACA->kvm_mode =3D <recover at next inst>;
>>  lwz(guest pc);
>>  do_more_stuff();
>>=20
>> That way when an exception occurs at lwz() the DO_KVM handler checks =
that we're in kvm mode "recover" which does basically srr0+=3D4; rfi;.
>=20
> I was thinking we'd check ESR[EPID] or SRR1[IS] as appropriate, and
> treat it as a kernel fault (search exception table) -- but this works
> too and is a bit cleaner (could be other uses of external pid), at the
> expense of a couple extra instructions in the emulation path (but
> probably a slightly faster host TLB handler).
>=20
> The check wouldn't go in DO_KVM, though, since on bookehv that only
> deals with diverting flow when xSRR1[GS] is set, which wouldn't be the
> case here.

Yup, not sure where you'd put the check, as it'd slow down normal =
operation too. Hrm.

>=20
>>> @@ -243,16 +324,20 @@ static int kvmppc_booke_irqprio_deliver(struct =
kvm_vcpu *vcpu,
>>> 	case BOOKE_IRQPRIO_AP_UNAVAIL:
>>> 	case BOOKE_IRQPRIO_ALIGNMENT:
>>> 		allowed =3D 1;
>>> -		msr_mask =3D MSR_CE|MSR_ME|MSR_DE;
>>> +		msr_mask =3D MSR_GS | MSR_CE | MSR_ME | MSR_DE;
>>=20
>> No need to do this. You already force MSR_GS in set_msr();
>=20
> OK.  This was here since before set_msr() started doing that. :-)
>=20
>>> +	if (!current->thread.kvm_vcpu) {
>>> +		WARN(1, "no vcpu\n");
>>> +		return -EPERM;
>>> +	}
>>=20
>> Huh?
>=20
> Oops, leftover debugging.
>=20
>>> +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu =
*vcpu)
>>> +{
>>> +	enum emulation_result er;
>>> +
>>> +	er =3D kvmppc_emulate_instruction(run, vcpu);
>>> +	switch (er) {
>>> +	case EMULATE_DONE:
>>> +		/* don't overwrite subtypes, just account kvm_stats */
>>> +		kvmppc_account_exit_stat(vcpu, EMULATED_INST_EXITS);
>>> +		/* Future optimization: only reload non-volatiles if
>>> +		 * they were actually modified by emulation. */
>>> +		return RESUME_GUEST_NV;
>>> +
>>> +	case EMULATE_DO_DCR:
>>> +		run->exit_reason =3D KVM_EXIT_DCR;
>>> +		return RESUME_HOST;
>>> +
>>> +	case EMULATE_FAIL:
>>> +		/* XXX Deliver Program interrupt to guest. */
>>> +		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
>>> +		       __func__, vcpu->arch.regs.nip, =
vcpu->arch.last_inst);
>>=20
>> This should be throttled, otherwise the guest can spam our logs.
>=20
> Yes it should, but I'm just moving the code here.

Yeah, only realized this later. Maybe next time (not for this patch set, =
next time you're sending something) just extract these mechanical parts, =
so it's easier to review the pieces where code actually changes :).

>=20
>>> +		/* For debugging, encode the failing instruction and
>>> +		 * report it to userspace. */
>>> +		run->hw.hardware_exit_reason =3D ~0ULL << 32;
>>> +		run->hw.hardware_exit_reason |=3D vcpu->arch.last_inst;
>>=20
>>=20
>> I'm fairly sure you want to fix this :)
>=20
> Likewise, that's what booke.c already does.  What should it do =
instead?

This is what book3s does:

                case EMULATE_FAIL:
                        printk(KERN_CRIT "%s: emulation at %lx failed =
(%08x)\n",
                               __func__, kvmppc_get_pc(vcpu), =
kvmppc_get_last_inst(vcpu));
                        kvmppc_core_queue_program(vcpu, flags);
                        r =3D RESUME_GUEST;

which also doesn't throttle the printk, but I think injecting a program =
fault into the guest is the most sensible thing to do if we don't know =
what the instruction is supposed to do. Best case we get an oops inside =
the guest telling us what broke :).

>=20
>> /**
>>> * kvmppc_handle_exit
>>> *
>>> @@ -374,12 +530,39 @@ out:
>>> int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>                       unsigned int exit_nr)
>>> {
>>> -	enum emulation_result er;
>>> 	int r =3D RESUME_HOST;
>>>=20
>>> 	/* update before a new last_exit_type is rewritten */
>>> 	kvmppc_update_timing_stats(vcpu);
>>>=20
>>> +	/*
>>> +	 * If we actually care, we could copy MSR, DEAR, and ESR to =
regs,
>>> +	 * insert an appropriate trap number, etc.
>>> +	 *
>>> +	 * Seems like a waste of cycles for something that should only =
matter
>>> +	 * to someone using sysrq-t/p or similar host kernel debug =
facility.
>>> +	 * We have other debug facilities to get that information from a
>>> +	 * guest through userspace.
>>> +	 */
>>> +	switch (exit_nr) {
>>> +	case BOOKE_INTERRUPT_EXTERNAL:
>>> +		do_IRQ(&vcpu->arch.regs);
>>=20
>> Ah, so that's what you want to use regs for. So is having a pt_regs
>> struct that only contains useful register values in half its fields
>> any useful here? Or could we keep control of the registers ourselves,
>> enabling us to maybe one day optimize things more.
>=20
> I think it contains enough to be useful for debugging code such as =
sysrq
> and tracers, and as noted in the comment we could copy the rest if we
> care enough.  MSR might be worth copying.
>=20
> It will eventually be used for machine checks as well, which I'd like =
to
> hand reasonable register state to, at least for GPRs, LR, and PC.
>=20
> If there's a good enough performance reason, we could just copy
> everything over for machine checks and pass NULL to do_IRQ (I think it
> can take this -- a dummy regs struct if not), but it seems premature =
at
> the moment unless the switch already causes measured performance loss
> (cache utilization?).

I'm definitely not concerned about performance, but complexity and =
uniqueness. With the pt_regs struct, we have a bunch of fields in the =
vcpu that are there, but unused. I find that situation pretty confusing.

So yes, I would definitely prefer to copy registers during MC and keep =
the registers where they are today - unless there are SPRs for them of =
course.

Imagine we'd one day want to share GPRs with user space through the =
kvm_run structure (see the s390 patches on the ML for this). I really =
wouldn't want to make pt_regs part of our userspace ABI.

>=20
>>> @@ -387,30 +570,56 @@ int kvmppc_handle_exit(struct kvm_run *run, =
struct kvm_vcpu *vcpu,
>>>=20
>>> 	switch (exit_nr) {
>>> 	case BOOKE_INTERRUPT_MACHINE_CHECK:
>>> -		printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
>>> -		kvmppc_dump_vcpu(vcpu);
>>> -		r =3D RESUME_HOST;
>>> +		kvm_resched(vcpu);
>>> +		r =3D RESUME_GUEST;
>>=20
>> huh?
>=20
> Patch shuffling accident -- this belongs with a later patch that =
invokes
> the host machine check handler similar to what is done with do_IRQ().
> The host machine check handler needs some work first, though.
>=20
>>> 		break;
>>>=20
>>> 	case BOOKE_INTERRUPT_EXTERNAL:
>>> 		kvmppc_account_exit(vcpu, EXT_INTR_EXITS);
>>> -		if (need_resched())
>>> -			cond_resched();
>>> +		kvm_resched(vcpu);
>>=20
>> Why are we explicit about the resched? On book3s I just call =
kvm_resched(vcpu) before the switch().
>=20
> There are a few exit types where we don't currently do the resched -- =
if
> they're all bugs or don't-cares, we could move it out of the switch.
>=20
> We probably should defer the check until after we've disabled
> interrupts, similar to signals -- even if we didn't exit for an
> interrupt, we could have received one after enabling them.

Yup. I just don't think you can call resched() with interrupts disabled, =
so a bit cleverness is probably required here.

>=20
>>> +		if (kvm_is_visible_gfn(vcpu->kvm, gfn)) {
>>> +			/* The guest TLB had a mapping, but the shadow =
TLB
>>> +			 * didn't. This could be because:
>>> +			 * a) the entry is mapping the host kernel, or
>>> +			 * b) the guest used a large mapping which we're =
faking
>>> +			 * Either way, we need to satisfy the fault =
without
>>> +			 * invoking the guest. */
>>> +			kvmppc_mmu_map(vcpu, eaddr, gpaddr, gtlb_index);
>>> +		} else {
>>> +			/* Guest mapped and leaped at non-RAM! */
>>> +			kvmppc_booke_queue_irqprio(vcpu,
>>> +						   =
BOOKE_IRQPRIO_MACHINE_CHECK);
>>=20
>> Are you sure? Couldn't this also be MMIO? That doesn't really improve =
the situation as executing from MMIO is tricky with the KVM model, but =
it's not necessarily bad. Oh well, I guess we'll have to do something =
and throwing an #MC isn't all that ugly.
>=20
> I think I asked you about executing from MMIO once, and you said it
> wasn't supported even in straight QEMU.  Have things changed?

Yeah, I talked to Anthony about that part and apparently the QEMU design =
does support execution from MMIO. But don't worry about it for now. I =
don't think we'll really have guest OSs doing this. And if they do, we =
can worry about it then.

>=20
>>> diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
>>> index 05d1d99..d53bcf2 100644
>>> --- a/arch/powerpc/kvm/booke.h
>>> +++ b/arch/powerpc/kvm/booke.h
>>> @@ -48,7 +48,20 @@
>>> #define BOOKE_IRQPRIO_PERFORMANCE_MONITOR 19
>>> /* Internal pseudo-irqprio for level triggered externals */
>>> #define BOOKE_IRQPRIO_EXTERNAL_LEVEL 20
>>> -#define BOOKE_IRQPRIO_MAX 20
>>> +#define BOOKE_IRQPRIO_DBELL 21
>>> +#define BOOKE_IRQPRIO_DBELL_CRIT 22
>>> +#define BOOKE_IRQPRIO_MAX 23
>>=20
>> So was MAX wrong before or is it too big now?
>=20
> MAX is just a marker for how many IRQPRIOs we have, not any sort of
> external limit.  This patch adds new IRQPRIOs, so MAX goes up.
>=20
> The actual limit is the number of bits in a long.

Yes, and before the highest value was 20 with MAX being 20, now the =
highest value is 22 with MAX being 23. Either MAX =3D=3D highest number =
or MAX =3D=3D highest number + 1, but you're changing the semantics of =
MAX here. Maybe it was wrong before, I don't know, hence I'm asking :).

>=20
>>> +	.if	\flags & NEED_EMU
>>> +	lwz	r9, VCPU_KVM(r4)
>>=20
>> writing r9
>>=20
>>> +	.endif
>>> +
>>> +#ifdef CONFIG_KVM_EXIT_TIMING
>>> +	/* save exit time */
>>> +1:	mfspr	r7, SPRN_TBRU
>>> +	mfspr	r8, SPRN_TBRL
>>> +	mfspr	r9, SPRN_TBRU
>>=20
>> overwriting r9 again?
>=20
> Oops.  It's RFC for a reason. :-)
>=20
>>> +#ifndef CONFIG_64BIT
>>=20
>> Double negation is always hard to read. Please reverse the ifdef :)
>=20
> OK.
>=20
>>> +lightweight_exit:
>>> +	PPC_STL	r2, HOST_R2(r1)
>>> +
>>> +	mfspr	r3, SPRN_PID
>>> +	stw	r3, VCPU_HOST_PID(r4)
>>> +	lwz	r3, VCPU_GUEST_PID(r4)
>>> +	mtspr	SPRN_PID, r3
>>> +
>>> +	/* Save vcpu pointer for the exception handlers
>>> +	 * must be done before loading guest r2.
>>> +	 */
>>> +//	SET_VCPU(r4)
>>=20
>> hm?
>=20
> Can just be removed, it's handled in booke's vcpu load/put.
>=20
>>> +	lwz	r6, (VCPU_SHARED_MAS2 + 4)(r11)
>>> +#else
>>> +	ld	r6, (VCPU_SHARED_MAS2)(r11)
>>> +#endif
>>> +	lwz	r7, VCPU_SHARED_MAS7_3+4(r11)
>>> +	lwz	r8, VCPU_SHARED_MAS4(r11)
>>> +	mtspr	SPRN_MAS0, r3
>>> +	mtspr	SPRN_MAS1, r5
>>> +	mtspr	SPRN_MAS2, r6
>>> +	mtspr	SPRN_MAS3, r7
>>> +	mtspr	SPRN_MAS4, r8
>>> +	lwz	r3, VCPU_SHARED_MAS6(r11)
>>> +	lwz	r5, VCPU_SHARED_MAS7_3+0(r11)
>>> +	mtspr	SPRN_MAS6, r3
>>> +	mtspr	SPRN_MAS7, r5
>>> +	/* Disable MAS register updates via exception */
>>> +	mfspr	r3, SPRN_EPCR
>>> +	oris	r3, r3, SPRN_EPCR_DMIUH@h
>>> +	mtspr	SPRN_EPCR, r3
>>=20
>> Shouldn't this happen before you set the MAS registers? :)
>=20
> Yes (though we really shouldn't be getting a TLB miss here, at least =
on
> e500mc).

Yeah, but the way it's now it gives you a false feeling of security :)

>=20
>>> +	/* Load some guest volatiles. */
>>> +	PPC_LL	r3, VCPU_LR(r4)
>>> +	PPC_LL	r5, VCPU_XER(r4)
>>> +	PPC_LL	r6, VCPU_CTR(r4)
>>> +	PPC_LL	r7, VCPU_CR(r4)
>>> +	PPC_LL	r8, VCPU_PC(r4)
>>> +#ifndef CONFIG_64BIT
>>> +	lwz	r9, (VCPU_SHARED_MSR + 4)(r11)
>>> +#else
>>> +	ld	r9, (VCPU_SHARED_MSR)(r11)
>>> +#endif
>>> +	PPC_LL	r0, VCPU_GPR(r0)(r4)
>>> +	PPC_LL	r1, VCPU_GPR(r1)(r4)
>>> +	PPC_LL	r2, VCPU_GPR(r2)(r4)
>>> +	PPC_LL	r10, VCPU_GPR(r10)(r4)
>>> +	PPC_LL	r11, VCPU_GPR(r11)(r4)
>>> +	PPC_LL	r12, VCPU_GPR(r12)(r4)
>>> +	PPC_LL	r13, VCPU_GPR(r13)(r4)
>>> +	mtlr	r3
>>> +	mtxer	r5
>>> +	mtctr	r6
>>> +	mtcr	r7
>>> +	mtsrr0	r8
>>> +	mtsrr1	r9
>>=20
>> Are you sure this should be shared->msr, not shadow_msr?
>=20
> Yes, we don't use shadow_msr on bookehv.  I'll add a comment in the
> struct definition as discussed in the other thread, as well as other
> areas where there are subtle differences between PR-mode and GS-mode.

Thanks!


Alex

^ permalink raw reply

* Re: [PATCH 2/2] Kbuild: Use dtc's -d (dependency) option
From: Shawn Guo @ 2012-01-10  3:12 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Michal Marek, Jon Loeliger, Russell King, linux-c6x-dev,
	Aurelien Jacquiot, microblaze-uclinux, Devicetree Discuss,
	linux-kbuild, linux-kernel, Rob Herring, linux, linuxppc-dev,
	linux-arm-kernel, David Gibson
In-Reply-To: <1326134295-15547-2-git-send-email-swarren@nvidia.com>

On Mon, Jan 09, 2012 at 11:38:15AM -0700, Stephen Warren wrote:
> This hooks dtc into Kbuild's dependency system.
> 
> Thus, for example, "make dtbs" will rebuild tegra-harmony.dtb if only
> tegra20.dtsi has changed yet tegra-harmony.dts has not. The previous
> lack of this feature recently caused me to have very confusing "git
> bisect" results.
> 
> For ARM, it's obvious what to add to $(targets). I'm not familiar enough
> with other architectures to know what to add there. Powerpc appears to
> already add various .dtb files into $(targets), but the other archs may
> need something added to $(targets) to work.
> 
> Signed-off-by: Stephen Warren <swarren@nvidia.com>

Though I did not look into the patches deeply, I know the problem very
well and it annoys me a lot.  It's great that we have patches to fix
it, so

Acked-by: Shawn Guo <shawn.guo@linaro.org>

-- 
Regards,
Shawn

^ permalink raw reply

* RE: [PATCH] powerpc/85xx: Add P1024rdb dts support
From: Tang Yuantian-B29983 @ 2012-01-10  3:45 UTC (permalink / raw)
  To: Wood Scott-B07421; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <4F0B561A.7060903@freescale.com>


> On 01/09/2012 02:37 AM, b29983@freescale.com wrote:
> > +/include/ "p1024rdb.dtsi"
> > +/include/ "fsl/p1020si-post.dtsi"
>=20
> Is p1024 100% software-compatible with p1020?
>=20
> They have different manuals...
>=20
> -Scott

P1020rdb has vitesse-7385 switch.
fsl/p1020si-post.dtsi can be used for both boards.

Regards,
Yuantian

^ permalink raw reply

* Re: [SDK v1.2][PATCH 1/2 v3] powerpc/85xx: Add dts for P1021RDB-PC board
From: Xu Jiucheng @ 2012-01-10  7:17 UTC (permalink / raw)
  To: Scott Wood; +Cc: Matthew McClintock, linuxppc-dev
In-Reply-To: <4F0B541C.1050001@freescale.com>

=E5=9C=A8 2012-01-09Mon=E7=9A=84 14:54 -0600=EF=BC=8CScott Wood=E5=86=99=E9=
=81=93=EF=BC=9A
> On 01/09/2012 12:53 AM, Xu Jiucheng wrote:
> > +	nand@1,0 {
> > +		#address-cells =3D <1>;
> > +		#size-cells =3D <1>;
> > +		compatible =3D "fsl,p1020-fcm-nand",
> > +			     "fsl,elbc-fcm-nand";
>=20
> s/p1020/p1021/
>=20
> -Scott

Ok.

Thanks & Best Regards
Jiucheng

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox