LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v2 05/37] KVM: PPC: Book3S HV: Ensure MSR[ME] is always set in guest MSR
From: Daniel Axtens @ 2021-02-26  6:06 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin, Fabiano Rosas
In-Reply-To: <20210225134652.2127648-6-npiggin@gmail.com>

Hi Nick,

>  void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
>  {
> +	/*
> +	 * Guest must always run with machine check interrupt
> +	 * enabled.
> +	 */
> +	if (!(msr & MSR_ME))
> +		msr |= MSR_ME;

This 'if' is technically redundant but you mention a future patch warning
on !(msr & MSR_ME) so I'm holding off on any judgement about the 'if' until
I get to that patch :)

The patch seems sane to me, I agree that we don't want guests running with
MSR_ME=0 and kvmppc_set_msr_hv already ensures that the transactional state is
sane so this is another sanity-enforcement in the same sort of vein.

All up:
Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

> +
>  	/*
>  	 * Check for illegal transactional state bit combination
>  	 * and if we find it, force the TS field to a safe state.
> -- 
> 2.23.0

^ permalink raw reply

* [PATCH V2 1/2] powerpc/perf: Infrastructure to support checking of attr.config*
From: Madhavan Srinivasan @ 2021-02-26  6:50 UTC (permalink / raw)
  To: mpe; +Cc: Madhavan Srinivasan, linuxppc-dev

Introduce code to support the checking of attr.config* for
values which are reserved for a given platform.
Performance Monitoring Unit (PMU) configuration registers
have fields that are reserved and specific value to bit field
as reserved. For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
and value of 0b11 to this field is reserved.

Writing a non-zero values in these fields or writing invalid
value to bit fields will have unknown behaviours.

Patch adds a generic call-back function "check_attr_config"
in "struct power_pmu", to be called in event_init to
check for attr.config* values for a given platform.
"check_attr_config" is valid only for raw event type.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
Changelog v1:
-Fixed commit message and in-code comments

 arch/powerpc/include/asm/perf_event_server.h |  6 ++++++
 arch/powerpc/perf/core-book3s.c              | 14 ++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 00e7e671bb4b..dde97d7d9253 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -67,6 +67,12 @@ struct power_pmu {
 	 * the pmu supports extended perf regs capability
 	 */
 	int		capabilities;
+	/*
+	 * Function to check event code for values which are
+	 * reserved. Function takes struct perf_event as input,
+	 * since event code could be spread in attr.config*
+	 */
+	int		(*check_attr_config)(struct perf_event *ev);
 };
 
 /*
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 6817331e22ff..c6eeb4fdc5fd 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1958,6 +1958,20 @@ static int power_pmu_event_init(struct perf_event *event)
 
 		if (ppmu->blacklist_ev && is_event_blacklisted(ev))
 			return -EINVAL;
+		/*
+		 * PMU config registers have fields that are
+		 * reserved and specific value to bit field as reserved.
+		 * For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
+		 * and value of 0b11 to this field is reserved.
+		 *
+		 * This check is needed only for raw event type,
+		 * since tools like fuzzer use raw event type to
+		 * provide randomized event code values for test.
+		 *
+		 */
+		if (ppmu->check_attr_config &&
+		    ppmu->check_attr_config(event))
+			return -EINVAL;
 		break;
 	default:
 		return -ENOENT;
-- 
2.26.2


^ permalink raw reply related

* [PATCH V2 2/2] powerpc/perf: Add platform specific check_attr_config
From: Madhavan Srinivasan @ 2021-02-26  6:50 UTC (permalink / raw)
  To: mpe; +Cc: Madhavan Srinivasan, linuxppc-dev
In-Reply-To: <20210226065025.1254973-1-maddy@linux.ibm.com>

Add platform specific attr.config value checks. Patch
includes checks for both power9 and power10.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
Changelog v1:
- No changes.

 arch/powerpc/perf/isa207-common.c | 41 +++++++++++++++++++++++++++++++
 arch/powerpc/perf/isa207-common.h |  2 ++
 arch/powerpc/perf/power10-pmu.c   | 13 ++++++++++
 arch/powerpc/perf/power9-pmu.c    | 13 ++++++++++
 4 files changed, 69 insertions(+)

diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index e4f577da33d8..b255799f5b51 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -694,3 +694,44 @@ int isa207_get_alternatives(u64 event, u64 alt[], int size, unsigned int flags,
 
 	return num_alt;
 }
+
+int isa3_X_check_attr_config(struct perf_event *ev)
+{
+	u64 val, sample_mode;
+	u64 event = ev->attr.config;
+
+	val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+	sample_mode = val & 0x3;
+
+	/*
+	 * MMCRA[61:62] is Randome Sampling Mode (SM).
+	 * value of 0b11 is reserved.
+	 */
+	if (sample_mode == 0x3)
+		return -1;
+
+	/*
+	 * Check for all reserved value
+	 */
+	switch (val) {
+	case 0x5:
+	case 0x9:
+	case 0xD:
+	case 0x19:
+	case 0x1D:
+	case 0x1A:
+	case 0x1E:
+		return -1;
+	}
+
+	/*
+	 * MMCRA[48:51]/[52:55]) Threshold Start/Stop
+	 * Events Selection.
+	 * 0b11110000/0b00001111 is reserved.
+	 */
+	val = (event >> EVENT_THR_CTL_SHIFT) & EVENT_THR_CTL_MASK;
+	if (((val & 0xF0) == 0xF0) || ((val & 0xF) == 0xF))
+		return -1;
+
+	return 0;
+}
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index 1af0e8c97ac7..ae8eaf05efd1 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -280,4 +280,6 @@ void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
 							struct pt_regs *regs);
 void isa207_get_mem_weight(u64 *weight);
 
+int isa3_X_check_attr_config(struct perf_event *ev);
+
 #endif
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index a901c1348cad..bc64354cab6a 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -106,6 +106,18 @@ static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
 	return num_alt;
 }
 
+static int power10_check_attr_config(struct perf_event *ev)
+{
+	u64 val;
+	u64 event = ev->attr.config;
+
+	val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+	if (val == 0x10 || isa3_X_check_attr_config(ev))
+		return -1;
+
+	return 0;
+}
+
 GENERIC_EVENT_ATTR(cpu-cycles,			PM_RUN_CYC);
 GENERIC_EVENT_ATTR(instructions,		PM_RUN_INST_CMPL);
 GENERIC_EVENT_ATTR(branch-instructions,		PM_BR_CMPL);
@@ -559,6 +571,7 @@ static struct power_pmu power10_pmu = {
 	.attr_groups		= power10_pmu_attr_groups,
 	.bhrb_nr		= 32,
 	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
+	.check_attr_config	= power10_check_attr_config,
 };
 
 int init_power10_pmu(void)
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 2a57e93a79dc..b3b9b226d053 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -151,6 +151,18 @@ static int power9_get_alternatives(u64 event, unsigned int flags, u64 alt[])
 	return num_alt;
 }
 
+static int power9_check_attr_config(struct perf_event *ev)
+{
+	u64 val;
+	u64 event = ev->attr.config;
+
+	val = (event >> EVENT_SAMPLE_SHIFT) & EVENT_SAMPLE_MASK;
+	if (val == 0xC || isa3_X_check_attr_config(ev))
+		return -1;
+
+	return 0;
+}
+
 GENERIC_EVENT_ATTR(cpu-cycles,			PM_CYC);
 GENERIC_EVENT_ATTR(stalled-cycles-frontend,	PM_ICT_NOSLOT_CYC);
 GENERIC_EVENT_ATTR(stalled-cycles-backend,	PM_CMPLU_STALL);
@@ -437,6 +449,7 @@ static struct power_pmu power9_pmu = {
 	.attr_groups		= power9_pmu_attr_groups,
 	.bhrb_nr		= 32,
 	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
+	.check_attr_config	= power9_check_attr_config,
 };
 
 int init_power9_pmu(void)
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH next v3 12/15] printk: introduce a kmsg_dump iterator
From: John Ogness @ 2021-02-26  7:59 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-hyperv, Sergey Senozhatsky, Douglas Anderson,
	Paul Mackerras, Miquel Raynal, K. Y. Srinivasan, Thomas Meyer,
	Vignesh Raghavendra, Wei Liu, Madhavan Srinivasan,
	Stephen Hemminger, kernel test robot, Anton Vorontsov,
	clang-built-linux, Joel Stanley, Jason Wessel, Anton Ivanov,
	Wei Li, Haiyang Zhang, Ravi Bangoria, Kees Cook, Alistair Popple,
	Jeff Dike, Colin Cross, linux-um, Daniel Thompson, Steven Rostedt,
	Davidlohr Bueso, Nicholas Piggin, Oleg Nesterov, Thomas Gleixner,
	Andy Shevchenko, Jordan Niethe, Michael Kelley, Christophe Leroy,
	Tony Luck, kbuild-all, Pavel Tatashin, linux-kernel,
	Sergey Senozhatsky, Richard Weinberger, kgdb-bugreport, linux-mtd,
	linuxppc-dev, Mike Rapoport
In-Reply-To: <20210225202438.28985-13-john.ogness@linutronix.de>

Hello,

Thank you kernel test robot!

Despite all of my efforts to carefully construct and test this series,
somehome I managed to miss a compile test with CONFIG_MTD_OOPS. That
kmsg_dumper does require the dumper parameter so that it can use
container_of().

I will discuss this with the printk team. But most likely we will just
re-instate the dumper parameter in the callback.

I apologize for the lack of care on my part.

John Ogness

On 2021-02-26, kernel test robot <lkp@intel.com> wrote:
> Hi John,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on next-20210225]
>
> url:    https://github.com/0day-ci/linux/commits/John-Ogness/printk-remove-logbuf_lock/20210226-043457
> base:    7f206cf3ec2bee4621325cfacb2588e5085c07f5
> config: arm-randconfig-r024-20210225 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a921aaf789912d981cbb2036bdc91ad7289e1523)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install arm cross compiling tool for clang build
>         # apt-get install binutils-arm-linux-gnueabi
>         # https://github.com/0day-ci/linux/commit/fc7f655cded40fc98ba5304c200e3a01e8291fb4
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review John-Ogness/printk-remove-logbuf_lock/20210226-043457
>         git checkout fc7f655cded40fc98ba5304c200e3a01e8291fb4
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm 
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
>>> drivers/mtd/mtdoops.c:277:45: error: use of undeclared identifier 'dumper'
>            struct mtdoops_context *cxt = container_of(dumper,
>                                                       ^
>>> drivers/mtd/mtdoops.c:277:45: error: use of undeclared identifier 'dumper'
>>> drivers/mtd/mtdoops.c:277:45: error: use of undeclared identifier 'dumper'
>    3 errors generated.
>
>
> vim +/dumper +277 drivers/mtd/mtdoops.c
>
> 4b23aff083649e Richard Purdie 2007-05-29  274  
> fc7f655cded40f John Ogness    2021-02-25  275  static void mtdoops_do_dump(enum kmsg_dump_reason reason)
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  276  {
> 2e386e4bac9055 Simon Kagstrom 2009-11-03 @277  	struct mtdoops_context *cxt = container_of(dumper,
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  278  			struct mtdoops_context, dump);
> fc7f655cded40f John Ogness    2021-02-25  279  	struct kmsg_dump_iter iter;
> fc2d557c74dc58 Seiji Aguchi   2011-01-12  280  
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  281  	/* Only dump oopses if dump_oops is set */
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  282  	if (reason == KMSG_DUMP_OOPS && !dump_oops)
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  283  		return;
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  284  
> fc7f655cded40f John Ogness    2021-02-25  285  	kmsg_dump_rewind(&iter);
> fc7f655cded40f John Ogness    2021-02-25  286  
> df92cad8a03e83 John Ogness    2021-02-25  287  	if (test_and_set_bit(0, &cxt->oops_buf_busy))
> df92cad8a03e83 John Ogness    2021-02-25  288  		return;
> fc7f655cded40f John Ogness    2021-02-25  289  	kmsg_dump_get_buffer(&iter, true, cxt->oops_buf + MTDOOPS_HEADER_SIZE,
> e2ae715d66bf4b Kay Sievers    2012-06-15  290  			     record_size - MTDOOPS_HEADER_SIZE, NULL);
> df92cad8a03e83 John Ogness    2021-02-25  291  	clear_bit(0, &cxt->oops_buf_busy);
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  292  
> c1cf1d57d14922 Mark Tomlinson 2020-09-03  293  	if (reason != KMSG_DUMP_OOPS) {
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  294  		/* Panics must be written immediately */
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  295  		mtdoops_write(cxt, 1);
> c1cf1d57d14922 Mark Tomlinson 2020-09-03  296  	} else {
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  297  		/* For other cases, schedule work to write it "nicely" */
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  298  		schedule_work(&cxt->work_write);
> 2e386e4bac9055 Simon Kagstrom 2009-11-03  299  	}
> c1cf1d57d14922 Mark Tomlinson 2020-09-03  300  }
> 4b23aff083649e Richard Purdie 2007-05-29  301  
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply

* Re: [PATCH] perf bench numa: Fix the condition checks for max number of numa nodes
From: Srikar Dronamraju @ 2021-02-26  8:58 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: ravi.bangoria, maddy, peterz, linux-kernel, acme,
	linux-perf-users, jolsa, kjain, linuxppc-dev, kan.liang
In-Reply-To: <1614271802-1503-1-git-send-email-atrajeev@linux.vnet.ibm.com>

* Athira Rajeev <atrajeev@linux.vnet.ibm.com> [2021-02-25 11:50:02]:

> In systems having higher node numbers available like node
> 255, perf numa bench will fail with SIGABORT.
> 
> <<>>
> perf: bench/numa.c:1416: init: Assertion `!(g->p.nr_nodes > 64 || g->p.nr_nodes < 0)' failed.
> Aborted (core dumped)
> <<>>
> 

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply

* Re: [PATCH V2] powerpc/perf: Fix handling of privilege level checks in perf interrupt context
From: Peter Zijlstra @ 2021-02-26  9:35 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: maddy, omosnace, acme, jolsa, linuxppc-dev, kan.liang
In-Reply-To: <1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com>

On Thu, Feb 25, 2021 at 05:10:39AM -0500, Athira Rajeev wrote:
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 4b4319d8..c8be44c 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -222,7 +222,7 @@ static inline void perf_get_data_addr(struct perf_event *event, struct pt_regs *
>  	if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid)
>  		*addrp = mfspr(SPRN_SDAR);
>  
> -	if (is_kernel_addr(mfspr(SPRN_SDAR)) && perf_allow_kernel(&event->attr) != 0)
> +	if (is_kernel_addr(mfspr(SPRN_SDAR)) && event->attr.exclude_kernel)
>  		*addrp = 0;
>  }
>  
> @@ -507,7 +507,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>  			 * addresses, hence include a check before filtering code
>  			 */
>  			if (!(ppmu->flags & PPMU_ARCH_31) &&
> -				is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
> +			    is_kernel_addr(addr) && event->attr.exclude_kernel)
>  				continue;
>  
>  			/* Branches are read most recent first (ie. mfbhrb 0 is

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply

* Re: [PATCH v4 12/14] swiotlb: Add restricted DMA alloc/free support.
From: Claire Chang @ 2021-02-26  9:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: heikki.krogerus, peterz, grant.likely, paulus, Frank Rowand,
	mingo, Marek Szyprowski, sstabellini, Saravana Kannan,
	Joerg Roedel, Rafael J . Wysocki, Bartosz Golaszewski, xen-devel,
	Thierry Reding, linux-devicetree, Will Deacon,
	Konrad Rzeszutek Wilk, Dan Williams, linuxppc-dev, Rob Herring,
	boris.ostrovsky, Andy Shevchenko, jgross, Nicolas Boichat,
	Greg KH, Randy Dunlap, lkml, list@263.net:IOMMU DRIVERS,
	Jim Quinlan, xypron.glpk, Robin Murphy, bauerman
In-Reply-To: <20210226051740.GB2072@lst.de>

On Fri, Feb 26, 2021 at 1:17 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Fri, Feb 26, 2021 at 12:17:50PM +0800, Claire Chang wrote:
> > Do you think I should fix this and rebase on the latest linux-next
> > now? I wonder if there are more factor and clean up coming and I
> > should wait after that.
>
> Here is my preferred plan:
>
>  1) wait for my series to support the min alignment in swiotlb to
>     land in Linus tree
>  2) I'll resend my series with the further swiotlb cleanup and
>     refactoring, which includes a slightly rebased version of your
>     patch to add the io_tlb_mem structure
>  3) resend your series on top of that as a baseline
>
> This is my current WIP tree for 2:
>
>   http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-struct

Sounds good to me. Thanks!

^ permalink raw reply

* Re: [PATCH] powerpc/sstep: Fix VSX instruction emulation
From: Ravi Bangoria @ 2021-02-26 11:09 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: Ravi Bangoria, linuxppc-dev, bala24
In-Reply-To: <20210225031946.1458206-1-jniethe5@gmail.com>



On 2/25/21 8:49 AM, Jordan Niethe wrote:
> Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage
> access instructions") added loading and storing 32 word long data into
> adjacent VSRs. However the calculation used to determine if two VSRs
> needed to be loaded/stored inadvertently prevented the load/storing
> taking place for instructions with a data length less than 16 words.
> 
> This causes the emulation to not function correctly, which can be seen
> by the alignment_handler selftest:
> 
> $ ./alignment_handler
> [snip]
> test: test_alignment_handler_vsx_207
> tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
> VSX: 2.07B
>          Doing lxsspx:   PASSED
>          Doing lxsiwax:  FAILED: Wrong Data
>          Doing lxsiwzx:  PASSED
>          Doing stxsspx:  PASSED
>          Doing stxsiwx:  PASSED
> failure: test_alignment_handler_vsx_207
> test: test_alignment_handler_vsx_300
> tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
> VSX: 3.00B
>          Doing lxsd:     PASSED
>          Doing lxsibzx:  PASSED
>          Doing lxsihzx:  PASSED
>          Doing lxssp:    FAILED: Wrong Data
>          Doing lxv:      PASSED
>          Doing lxvb16x:  PASSED
>          Doing lxvh8x:   PASSED
>          Doing lxvx:     PASSED
>          Doing lxvwsx:   FAILED: Wrong Data
>          Doing lxvl:     PASSED
>          Doing lxvll:    PASSED
>          Doing stxsd:    PASSED
>          Doing stxsibx:  PASSED
>          Doing stxsihx:  PASSED
>          Doing stxssp:   PASSED
>          Doing stxv:     PASSED
>          Doing stxvb16x: PASSED
>          Doing stxvh8x:  PASSED
>          Doing stxvx:    PASSED
>          Doing stxvl:    PASSED
>          Doing stxvll:   PASSED
> failure: test_alignment_handler_vsx_300
> [snip]
> 
> Fix this by making sure all VSX instruction emulation correctly
> load/store from the VSRs.
> 
> Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>

Yikes!

Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

^ permalink raw reply

* Latest Git kernel doesn't compile because of the LINUX_VERSION_CODE issue
From: Christian Zigotzky @ 2021-02-26 12:34 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: Darren Stevens, R.T.Dickinson
In-Reply-To: <CAOSf1CHQ=QDwH=J4kLYqboe481poa7EdbC6gzq29W7KYHhn1YQ@mail.gmail.com>

Hello,

I tried to compile the latest Git kernel today. Unfortunately it doesn't 
compile.

Error messages:

   CC      arch/powerpc/kernel/udbg_16550.o
In file included from ./include/linux/stackprotector.h:10:0,
                  from arch/powerpc/kernel/smp.c:35:
./arch/powerpc/include/asm/stackprotector.h: In function 
‘boot_init_stack_canary’:
./arch/powerpc/include/asm/stackprotector.h:29:30: error: expected 
expression before ‘;’ token
   canary ^= LINUX_VERSION_CODE;
                               ^
scripts/Makefile.build:271: recipe for target 
'arch/powerpc/kernel/smp.o' failed
make[2]: *** [arch/powerpc/kernel/smp.o] Error 1

----

drivers/media/cec/core/cec-api.c: In function ‘cec_adap_g_caps’:
drivers/media/cec/core/cec-api.c:85:35: error: expected expression 
before ‘;’ token
   caps.version = LINUX_VERSION_CODE;

----

I have found the bad commit. It's "Merge tag 'kbuild-v5.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild" [1]

The changes in the Makefile (a/Makefile) are responsible for the 
compiling errors. [2]

I was able to revert this bad commit. After that it compiled without any 
problems.

Could you please compile the latest Git kernel and confirm this issue?

Thanks,
Christian

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/Makefile?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab

^ permalink raw reply

* Re: [PATCH V2 1/2] powerpc/perf: Infrastructure to support checking of attr.config*
From: Paul A. Clarke @ 2021-02-26 14:03 UTC (permalink / raw)
  To: Madhavan Srinivasan; +Cc: linuxppc-dev
In-Reply-To: <20210226065025.1254973-1-maddy@linux.ibm.com>

Another drive-by review... just some minor nits, below...

On Fri, Feb 26, 2021 at 12:20:24PM +0530, Madhavan Srinivasan wrote:
> Introduce code to support the checking of attr.config* for
> values which are reserved for a given platform.
> Performance Monitoring Unit (PMU) configuration registers
> have fields that are reserved and specific value to bit field

I'd reword to "some specific values for bit fields are reserved".

> as reserved. For ex., MMCRA[61:62] is Randome Sampling Mode (SM)

s/Randome/Random/
This occurs here, and below, and in patch 2/2.

> and value of 0b11 to this field is reserved.

s/to/for/

> Writing a non-zero values in these fields or writing invalid
> value to bit fields will have unknown behaviours.

Suggest: Writing non-zero or invalid values in these fields
will have unknown behaviors. (or "behaviours" ;-)

PC

> Patch adds a generic call-back function "check_attr_config"
> in "struct power_pmu", to be called in event_init to
> check for attr.config* values for a given platform.
> "check_attr_config" is valid only for raw event type.
> 
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
> Changelog v1:
> -Fixed commit message and in-code comments
> 
>  arch/powerpc/include/asm/perf_event_server.h |  6 ++++++
>  arch/powerpc/perf/core-book3s.c              | 14 ++++++++++++++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 00e7e671bb4b..dde97d7d9253 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -67,6 +67,12 @@ struct power_pmu {
>  	 * the pmu supports extended perf regs capability
>  	 */
>  	int		capabilities;
> +	/*
> +	 * Function to check event code for values which are
> +	 * reserved. Function takes struct perf_event as input,
> +	 * since event code could be spread in attr.config*
> +	 */
> +	int		(*check_attr_config)(struct perf_event *ev);
>  };
> 
>  /*
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 6817331e22ff..c6eeb4fdc5fd 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1958,6 +1958,20 @@ static int power_pmu_event_init(struct perf_event *event)
> 
>  		if (ppmu->blacklist_ev && is_event_blacklisted(ev))
>  			return -EINVAL;
> +		/*
> +		 * PMU config registers have fields that are
> +		 * reserved and specific value to bit field as reserved.
> +		 * For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
> +		 * and value of 0b11 to this field is reserved.
> +		 *
> +		 * This check is needed only for raw event type,
> +		 * since tools like fuzzer use raw event type to
> +		 * provide randomized event code values for test.
> +		 *
> +		 */
> +		if (ppmu->check_attr_config &&
> +		    ppmu->check_attr_config(event))
> +			return -EINVAL;
>  		break;
>  	default:
>  		return -ENOENT;
> -- 
> 2.26.2
> 

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Jason Gunthorpe @ 2021-02-26 14:42 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linuxppc-dev, linux-kernel, linux-s390
In-Reply-To: <20210225205820.GC2858050@casper.infradead.org>

On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote:

> I'd like to hear better ideas than this.

You didn't like my suggestion to put a sleepable lock around the
freeing of page tables during flushing?

I still don't see how you convert the sleepable page walkers to use
rcu??

Jason

^ permalink raw reply

* Re: [PATCH v2 13/37] KVM: PPC: Book3S HV P9: Move radix MMU switching instructions together
From: Fabiano Rosas @ 2021-02-26 15:56 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210225134652.2127648-14-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> Switching the MMU from radix<->radix mode is tricky particularly as the
> MMU can remain enabled and requires a certain sequence of SPR updates.
> Move these together into their own functions.
>
> This also includes the radix TLB check / flush because it's tied in to
> MMU switching due to tlbiel getting LPID from LPIDR.
>
> (XXX: isync / hwsync synchronisation TBD)

I see bot mtlpidr and mtlpcr requiring a CSI in the ISA. Do you say we
might need more than an isync?

Regardless, I'd expect that to be separate from the refactoring here,
so:

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 55 +++++++++++++++++++++---------------
>  1 file changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 895090636295..23d6dc04b0e9 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3440,12 +3440,38 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>  	trace_kvmppc_run_core(vc, 1);
>  }
>
> +static void switch_mmu_to_guest_radix(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
> +{
> +	struct kvmppc_vcore *vc = vcpu->arch.vcore;
> +	struct kvm_nested_guest *nested = vcpu->arch.nested;
> +	u32 lpid;
> +
> +	lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
> +
> +	mtspr(SPRN_LPID, lpid);
> +	mtspr(SPRN_LPCR, lpcr);
> +	mtspr(SPRN_PID, vcpu->arch.pid);
> +	isync();
> +
> +	/* TLBIEL must have LPIDR set, so set guest LPID before flushing. */
> +	kvmppc_check_need_tlb_flush(kvm, vc->pcpu, nested);
> +}
> +
> +static void switch_mmu_to_host_radix(struct kvm *kvm, u32 pid)
> +{
> +	mtspr(SPRN_PID, pid);
> +	mtspr(SPRN_LPID, kvm->arch.host_lpid);
> +	mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
> +	isync();
> +}
> +
>  /*
>   * Load up hypervisor-mode registers on P9.
>   */
>  static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  				     unsigned long lpcr)
>  {
> +	struct kvm *kvm = vcpu->kvm;
>  	struct kvmppc_vcore *vc = vcpu->arch.vcore;
>  	s64 hdec;
>  	u64 tb, purr, spurr;
> @@ -3468,12 +3494,12 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	 * P8 and P9 suppress the HDEC exception when LPCR[HDICE] = 0,
>  	 * so set HDICE before writing HDEC.
>  	 */
> -	mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr | LPCR_HDICE);
> +	mtspr(SPRN_LPCR, kvm->arch.host_lpcr | LPCR_HDICE);
>  	isync();
>
>  	hdec = time_limit - mftb();
>  	if (hdec < 0) {
> -		mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr);
> +		mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
>  		isync();
>  		return BOOK3S_INTERRUPT_HV_DECREMENTER;
>  	}
> @@ -3508,7 +3534,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	}
>  	mtspr(SPRN_CIABR, vcpu->arch.ciabr);
>  	mtspr(SPRN_IC, vcpu->arch.ic);
> -	mtspr(SPRN_PID, vcpu->arch.pid);
>
>  	mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
>  	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
> @@ -3522,8 +3547,7 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>
>  	mtspr(SPRN_AMOR, ~0UL);
>
> -	mtspr(SPRN_LPCR, lpcr);
> -	isync();
> +	switch_mmu_to_guest_radix(kvm, vcpu, lpcr);
>
>  	kvmppc_xive_push_vcpu(vcpu);
>
> @@ -3562,7 +3586,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  		mtspr(SPRN_DAWR1, host_dawr1);
>  		mtspr(SPRN_DAWRX1, host_dawrx1);
>  	}
> -	mtspr(SPRN_PID, host_pidr);
>
>  	/*
>  	 * Since this is radix, do a eieio; tlbsync; ptesync sequence in
> @@ -3577,9 +3600,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	if (cpu_has_feature(CPU_FTR_ARCH_31))
>  		asm volatile(PPC_CP_ABORT);
>
> -	mtspr(SPRN_LPID, vcpu->kvm->arch.host_lpid);	/* restore host LPID */
> -	isync();
> -
>  	vc->dpdes = mfspr(SPRN_DPDES);
>  	vc->vtb = mfspr(SPRN_VTB);
>  	mtspr(SPRN_DPDES, 0);
> @@ -3596,7 +3616,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  	}
>
>  	mtspr(SPRN_HDEC, 0x7fffffff);
> -	mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr);
> +
> +	switch_mmu_to_host_radix(kvm, host_pidr);
>
>  	return trap;
>  }
> @@ -4130,7 +4151,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>  {
>  	struct kvm_run *run = vcpu->run;
>  	int trap, r, pcpu;
> -	int srcu_idx, lpid;
> +	int srcu_idx;
>  	struct kvmppc_vcore *vc;
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_nested_guest *nested = vcpu->arch.nested;
> @@ -4204,13 +4225,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>  	vc->vcore_state = VCORE_RUNNING;
>  	trace_kvmppc_run_core(vc, 0);
>
> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
> -		lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
> -		mtspr(SPRN_LPID, lpid);
> -		isync();
> -		kvmppc_check_need_tlb_flush(kvm, pcpu, nested);
> -	}
> -
>  	guest_enter_irqoff();
>
>  	srcu_idx = srcu_read_lock(&kvm->srcu);
> @@ -4229,11 +4243,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>
>  	srcu_read_unlock(&kvm->srcu, srcu_idx);
>
> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
> -		mtspr(SPRN_LPID, kvm->arch.host_lpid);
> -		isync();
> -	}
> -
>  	set_irq_happened(trap);
>
>  	kvmppc_set_host_core(pcpu);

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Matthew Wilcox @ 2021-02-26 16:03 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-mm, linuxppc-dev, linux-kernel, linux-s390
In-Reply-To: <20210226144200.GV2643399@ziepe.ca>

On Fri, Feb 26, 2021 at 10:42:00AM -0400, Jason Gunthorpe wrote:
> On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote:
> 
> > I'd like to hear better ideas than this.
> 
> You didn't like my suggestion to put a sleepable lock around the
> freeing of page tables during flushing?
> 
> I still don't see how you convert the sleepable page walkers to use
> rcu??

I don't want to convert the sleepable ones to use RCU ... I want to
convert the non-sleeping ones to use RCU.  A page_table_free_lock might
work, but it might have its own problems later (eg a sleeping lock can't
be acquired under RCU or spinlock, and it can't be a spinlock because
it'd have to be held while we wait for IPIs).

I think it would solve my immediate problem, and I wonder if it might
solve some other problems ...

^ permalink raw reply

* Re: Latest Git kernel doesn't compile because of the LINUX_VERSION_CODE issue
From: Christophe Leroy @ 2021-02-26 16:10 UTC (permalink / raw)
  To: Christian Zigotzky, Michael Ellerman, linuxppc-dev
  Cc: Darren Stevens, R.T.Dickinson
In-Reply-To: <99f6d05a-d431-7444-bb0a-180c042c2afd@xenosoft.de>



Le 26/02/2021 à 13:34, Christian Zigotzky a écrit :
> Hello,
> 
> I tried to compile the latest Git kernel today. Unfortunately it doesn't compile.

I have no such problem with latest git kernel.

Christophe

> 
> Error messages:
> 
>    CC      arch/powerpc/kernel/udbg_16550.o
> In file included from ./include/linux/stackprotector.h:10:0,
>                   from arch/powerpc/kernel/smp.c:35:
> ./arch/powerpc/include/asm/stackprotector.h: In function ‘boot_init_stack_canary’:
> ./arch/powerpc/include/asm/stackprotector.h:29:30: error: expected expression before ‘;’ token
>    canary ^= LINUX_VERSION_CODE;
>                                ^
> scripts/Makefile.build:271: recipe for target 'arch/powerpc/kernel/smp.o' failed
> make[2]: *** [arch/powerpc/kernel/smp.o] Error 1
> 
> ----
> 
> drivers/media/cec/core/cec-api.c: In function ‘cec_adap_g_caps’:
> drivers/media/cec/core/cec-api.c:85:35: error: expected expression before ‘;’ token
>    caps.version = LINUX_VERSION_CODE;
> 
> ----
> 
> I have found the bad commit. It's "Merge tag 'kbuild-v5.12' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild" [1]
> 
> The changes in the Makefile (a/Makefile) are responsible for the compiling errors. [2]
> 
> I was able to revert this bad commit. After that it compiled without any problems.
> 
> Could you please compile the latest Git kernel and confirm this issue?
> 
> Thanks,
> Christian
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab 
> 
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/Makefile?id=6fbd6cf85a3be127454a1ad58525a3adcf8612ab 
> 

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Gerald Schaefer @ 2021-02-26 16:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-s390, Janosch Frank, Christian Borntraeger, Heiko Carstens,
	linux-kernel, linux-mm, Alexander Gordeev, Claudio Imbrenda,
	linuxppc-dev
In-Reply-To: <20210225205820.GC2858050@casper.infradead.org>

On Thu, 25 Feb 2021 20:58:20 +0000
Matthew Wilcox <willy@infradead.org> wrote:

> In order to walk the page tables without the mmap semaphore, it must
> be possible to prevent them from being freed and reused (eg if munmap()
> races with viewing /proc/$pid/smaps).
> 
> There is various commentary within the mm on how to prevent this.  One way
> is to disable interrupts, relying on that to block rcu_sched or IPIs.
> I don't think the RT people are terribly happy about reading a proc file
> disabling interrupts, and it doesn't work for architectures that free
> page tables directly instead of batching them into an rcu_sched (because
> the IPI may not be sent to this CPU if the task has never run on it).
> 
> See "Fast GUP" in mm/gup.c
> 
> Ideally, I'd like rcu_read_lock() to delay page table reuse.  This is
> close to trivial for architectures which use entire pages or multiple
> pages for levels of their page tables as we can use the rcu_head embedded
> in struct page to queue the page for RCU.
> 
> s390 and powerpc are the only two architectures I know of that have
> levels of their page table that are smaller than their PAGE_SIZE.
> I'd like to discuss options.  There may be a complicated scheme that
> allows partial pages to be freed via RCU, but I have something simpler
> in mind.  For powerpc in particular, it can have a PAGE_SIZE of 64kB
> and then the MMU wants to see 4kB entries in the PMD.  I suggest that
> instead of allocating each 4kB entry individually, we allocate a 64kB
> page and fill in 16 consecutive PMDs.  This could cost a bit more memory
> (although if you've asked for a CONFIG_PAGE_SIZE of 64kB, you presumably
> don't care too much about it), but it'll make future page faults cheaper
> (as the PMDs will already be present, assuming you have good locality
> of reference).
> 
> I'd like to hear better ideas than this.

Some background on the situation for s390: The architecture defines
an 8 bit pagetable index, so we have 256 entries in a 2 KB pagetable,
but PAGE_SIZE is 4 KB. pte_alloc(_one) will use alloc_page() to allocate
a full 4 KB page, and then do some housekeeping to maintain a per-mm list
of such 4 KB pages, which will contain either one or two 2 KB pagetable
fragments.

This is also the reason why pgtable_t on s390 is not pointing to the
struct page of the (4 KB) page containing a 2 KB pagetable fragment, but
rather to the 2 KB pagetable itself.

I see at least two issues here, with using rcu_head embedded in the
struct page (for a 4 KB page):

1) There might be two 2 KB pagetables present in that 4 KB page,
and the rcu_head would affect both. Not sure if this would really be
a problem, because we already have a similar situation with the split
ptlock embedded in struct page, which also might lock two 2 KB pagetables,
i.e. more than necessary. It still is far less "over-locking" than
using mm->page_table_lock, and the move_pte() code e.g. takes care
to avoid a deadlock if src and dst ptlocks happen to be on the
same page.

So, a similar "over-locking" might also be possible and acceptable
for the rcu_head approach, but I do not really understand if that could
have some deadlock or other unwanted side-effects.

2) The "housekeeping" of our 2 KB pagetable fragments uses page->lru
to maintain the per-mm list. It also (mis)uses page->_refcount to mark
which 2 KB half is used/free, but that should not be an issue I guess.
Using page->lru will be an issue though. IIUC, then page->rcu_head will
overlay with page->lru, so using page->rcu_head for pagetable pages
on s390 would conflict with our page->lru usage for such pagetable pages.

I do not really see how that could be fixed, maybe we could find and
re-use other struct page members for our 2 KB fragment list. Also, for
kvm, there seem to be even more users of page->lru for pagetable pages,
in arch/s390/mm/gmap.c. Not sure though if those would actually also
affect "regular" pagetable walks, or if they are somehow independent.
But if we'd find some new list home for the 2 KB fragments, then that
could probably also be used for the gmap stuff.

^ permalink raw reply

* Re: Freeing page tables through RCU
From: Jason Gunthorpe @ 2021-02-26 16:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, linuxppc-dev, linux-kernel, linux-s390
In-Reply-To: <20210226160354.GE2723601@casper.infradead.org>

On Fri, Feb 26, 2021 at 04:03:54PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 26, 2021 at 10:42:00AM -0400, Jason Gunthorpe wrote:
> > On Thu, Feb 25, 2021 at 08:58:20PM +0000, Matthew Wilcox wrote:
> > 
> > > I'd like to hear better ideas than this.
> > 
> > You didn't like my suggestion to put a sleepable lock around the
> > freeing of page tables during flushing?
> > 
> > I still don't see how you convert the sleepable page walkers to use
> > rcu??
> 
> I don't want to convert the sleepable ones to use RCU ... I want to
> convert the non-sleeping ones to use RCU.  

Why? Convert them to use the normal page table spinlocks?

It makes everything much more understandable if it is locked properly.

The lockless walks should be reserved for special places because they
are actually hard to do properly.

> A page_table_free_lock might work, but it might have its own
> problems later (eg a sleeping lock can't be acquired under RCU or
> spinlock, and it can't be a spinlock because it'd have to be held
> while we wait for IPIs).

The mmap_sem today is serving the function of the page_table_free_lock
idea, so I think a sleepable lock is already proven to work?

Jason

^ permalink raw reply

* Re: [PATCH v4 2/5] ibmvfc: fix invalid sub-CRQ handles after hard reset
From: Brian King @ 2021-02-26 16:39 UTC (permalink / raw)
  To: Tyrel Datwyler, james.bottomley
  Cc: brking, linuxppc-dev, linux-scsi, martin.petersen, linux-kernel
In-Reply-To: <20210225215057.23020-3-tyreld@linux.ibm.com>

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v2 17/37] KVM: PPC: Book3S HV P9: Move setting HDEC after switching to guest LPCR
From: Fabiano Rosas @ 2021-02-26 16:38 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210225134652.2127648-18-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> LPCR[HDICE]=0 suppresses hypervisor decrementer exceptions on some
> processors, so it must be enabled before HDEC is set.
>
> Rather than set it in the host LPCR then setting HDEC, move the HDEC
> update to after the guest MMU context (including LPCR) is loaded.
> There shouldn't be much concern with delaying HDEC by some 10s or 100s
> of nanoseconds by setting it a bit later.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 24 ++++++++++--------------
>  1 file changed, 10 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index d4770b222d7e..63cc92c45c5d 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3490,23 +3490,13 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>  		host_dawrx1 = mfspr(SPRN_DAWRX1);
>  	}
>
> -	/*
> -	 * P8 and P9 suppress the HDEC exception when LPCR[HDICE] = 0,
> -	 * so set HDICE before writing HDEC.
> -	 */
> -	mtspr(SPRN_LPCR, kvm->arch.host_lpcr | LPCR_HDICE);
> -	isync();
> -
> -	hdec = time_limit - mftb();

Would it be possible to leave the mftb() in this patch and then replace
them all at once in patch 20/37 - "KVM: PPC: Book3S HV P9: Reduce mftb
per guest entry/exit"?

> -	if (hdec < 0) {
> -		mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
> -		isync();
> +	tb = mftb();
> +	hdec = time_limit - tb;
> +	if (hdec < 0)
>  		return BOOK3S_INTERRUPT_HV_DECREMENTER;
> -	}
> -	mtspr(SPRN_HDEC, hdec);
>
>  	if (vc->tb_offset) {
> -		u64 new_tb = mftb() + vc->tb_offset;
> +		u64 new_tb = tb + vc->tb_offset;
>  		mtspr(SPRN_TBU40, new_tb);
>  		tb = mftb();
>  		if ((tb & 0xffffff) < (new_tb & 0xffffff))
> @@ -3549,6 +3539,12 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>
>  	switch_mmu_to_guest_radix(kvm, vcpu, lpcr);
>
> +	/*
> +	 * P9 suppresses the HDEC exception when LPCR[HDICE] = 0,
> +	 * so set guest LPCR (with HDICE) before writing HDEC.
> +	 */
> +	mtspr(SPRN_HDEC, hdec);
> +
>  	mtspr(SPRN_SRR0, vcpu->arch.shregs.srr0);
>  	mtspr(SPRN_SRR1, vcpu->arch.shregs.srr1);

^ permalink raw reply

* Re: [PATCH v4 5/5] ibmvfc: reinitialize sub-CRQs and perform channel enquiry after LPM
From: Brian King @ 2021-02-26 16:43 UTC (permalink / raw)
  To: Tyrel Datwyler, james.bottomley
  Cc: brking, linuxppc-dev, linux-scsi, martin.petersen, linux-kernel
In-Reply-To: <20210225215057.23020-6-tyreld@linux.ibm.com>

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v2 23/37] KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C
From: Fabiano Rosas @ 2021-02-26 22:37 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210225134652.2127648-24-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

Hi, thanks for this. It helped me clarify a bunch of details that I
haven't understood while reading the asm code.

Some comments below:

> Almost all logic is moved to C, by introducing a new in_guest mode that
> selects and branches very early in the interrupt handler to the P9 exit
> code.
>
> The remaining assembly is only about 160 lines of low level stack setup,
> with VCPU vs host register save and restore, plus a small shim to the
> legacy paths in the interrupt handler.
>
> There are two motivations for this, the first is just make the code more
> maintainable being in C. The second is to reduce the amount of code
> running in a special KVM mode, "realmode". I put that in quotes because
> with radix it is no longer necessarily real-mode in the MMU, but it
> still has to be treated specially because it may be in real-mode, and
> has various important registers like PID, DEC, TB, etc set to guest.
> This is hostile to the rest of Linux and can't use arbitrary kernel
> functionality or be instrumented well.
>
> This initial patch is a reasonably faithful conversion of the asm code.
> It does lack any loop to return quickly back into the guest without
> switching out of realmode in the case of unimportant or easily handled
> interrupts, as explained in the previous change, handling HV interrupts
> in real mode is not so important for P9.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---

<snip>

> diff --git a/arch/powerpc/kvm/book3s_64_entry.S b/arch/powerpc/kvm/book3s_64_entry.S
> index f826c8dc2e19..cc7b76865a16 100644
> --- a/arch/powerpc/kvm/book3s_64_entry.S
> +++ b/arch/powerpc/kvm/book3s_64_entry.S
> @@ -1,10 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
>  #include <asm/asm-offsets.h>
>  #include <asm/cache.h>
> +#include <asm/code-patching-asm.h>
>  #include <asm/exception-64s.h>
> +#include <asm/export.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_book3s_asm.h>
>  #include <asm/ppc_asm.h>
>  #include <asm/reg.h>
> +#include <asm/ultravisor-api.h>
>
>  /*
>   * These are branched to from interrupt handlers in exception-64s.S which set
> @@ -13,13 +17,24 @@
>  .global	kvmppc_hcall
>  .balign IFETCH_ALIGN_BYTES
>  kvmppc_hcall:
> +	lbz	r10,HSTATE_IN_GUEST(r13)
> +	cmpwi	r10,KVM_GUEST_MODE_GUEST_HV_FAST
> +	beq	kvmppc_p9_exit_hcall
>  	ld	r10,PACA_EXGEN+EX_R13(r13)
>  	SET_SCRATCH0(r10)
>  	li	r10,0xc00
> +	li	r11,PACA_EXGEN
> +	b	1f
>
>  .global	kvmppc_interrupt
>  .balign IFETCH_ALIGN_BYTES
>  kvmppc_interrupt:
> +	std	r10,HSTATE_SCRATCH0(r13)
> +	lbz	r10,HSTATE_IN_GUEST(r13)
> +	cmpwi	r10,KVM_GUEST_MODE_GUEST_HV_FAST
> +	beq	kvmppc_p9_exit_interrupt
> +	ld	r10,HSTATE_SCRATCH0(r13)
> +	lbz	r11,HSTATE_IN_GUEST(r13)
>  	li	r11,PACA_EXGEN
>  	cmpdi	r10,0x200
>  	bgt+	1f
> @@ -114,3 +129,169 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	GET_SCRATCH0(r13)
>  	HRFI_TO_KERNEL
>  #endif
> +
> +/* Stack frame offsets for kvmppc_hv_entry */
> +#define SFS			208
> +#define STACK_SLOT_VCPU		(SFS-8)
> +#define STACK_SLOT_NVGPRS	(SFS-152)	/* 18 gprs */
> +
> +/*
> + * void kvmppc_p9_enter_guest(struct vcpu *vcpu);
> + *
> + * Enter the guest on a ISAv3.0 or later system where we have exactly
> + * one vcpu per vcore, and both the host and guest are radix, and threads
> + * are set to "indepdent mode".
> + */
> +.balign	IFETCH_ALIGN_BYTES
> +_GLOBAL(kvmppc_p9_enter_guest)
> +EXPORT_SYMBOL_GPL(kvmppc_p9_enter_guest)
> +	mflr	r0
> +	std	r0,PPC_LR_STKOFF(r1)
> +	stdu	r1,-SFS(r1)
> +
> +	std	r1,HSTATE_HOST_R1(r13)
> +	std	r3,STACK_SLOT_VCPU(r1)

The vcpu was stored on the paca previously. I don't get the change,
could you explain?

> +
> +	mfcr	r4
> +	stw	r4,SFS+8(r1)
> +
> +	reg = 14
> +	.rept	18
> +	std	reg,STACK_SLOT_NVGPRS + ((reg - 14) * 8)(r1)
> +	reg = reg + 1
> +	.endr
> +
> +	ld	r4,VCPU_LR(r3)
> +	mtlr	r4
> +	ld	r4,VCPU_CTR(r3)
> +	mtctr	r4
> +	ld	r4,VCPU_XER(r3)
> +	mtspr	SPRN_XER,r4
> +
> +	ld	r1,VCPU_CR(r3)
> +
> +BEGIN_FTR_SECTION
> +	ld	r4,VCPU_CFAR(r3)
> +	mtspr	SPRN_CFAR,r4
> +END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
> +BEGIN_FTR_SECTION
> +	ld	r4,VCPU_PPR(r3)
> +	mtspr	SPRN_PPR,r4
> +END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
> +
> +	reg = 4
> +	.rept	28
> +	ld	reg,__VCPU_GPR(reg)(r3)
> +	reg = reg + 1
> +	.endr
> +
> +	ld	r4,VCPU_KVM(r3)
> +	lbz	r4,KVM_SECURE_GUEST(r4)
> +	cmpdi	r4,0
> +	ld	r4,VCPU_GPR(R4)(r3)
> +	bne	.Lret_to_ultra
> +
> +	mtcr	r1
> +
> +	ld	r0,VCPU_GPR(R0)(r3)
> +	ld	r1,VCPU_GPR(R1)(r3)
> +	ld	r2,VCPU_GPR(R2)(r3)
> +	ld	r3,VCPU_GPR(R3)(r3)
> +
> +	HRFI_TO_GUEST
> +	b	.
> +
> +	/*
> +	 * Use UV_RETURN ultracall to return control back to the Ultravisor
> +	 * after processing an hypercall or interrupt that was forwarded
> +	 * (a.k.a. reflected) to the Hypervisor.
> +	 *
> +	 * All registers have already been reloaded except the ucall requires:
> +	 *   R0 = hcall result
> +	 *   R2 = SRR1, so UV can detect a synthesized interrupt (if any)
> +	 *   R3 = UV_RETURN
> +	 */
> +.Lret_to_ultra:
> +	mtcr	r1
> +	ld	r1,VCPU_GPR(R1)(r3)
> +
> +	ld	r0,VCPU_GPR(R3)(r3)
> +	mfspr	r2,SPRN_SRR1
> +	LOAD_REG_IMMEDIATE(r3, UV_RETURN)
> +	sc	2
> +
> +/*
> + * kvmppc_p9_exit_hcall and kvmppc_p9_exit_interrupt are branched to from
> + * above if the interrupt was taken for a guest that was entered via
> + * kvmppc_p9_enter_guest().
> + *
> + * This code recovers the host stack and vcpu pointer, saves all GPRs and
> + * CR, LR, CTR, XER as well as guest MSR and NIA into the VCPU, then re-
> + * establishes the host stack and registers to return from  the
> + * kvmppc_p9_enter_guest() function.
> + */
> +.balign	IFETCH_ALIGN_BYTES
> +kvmppc_p9_exit_hcall:
> +	mfspr	r11,SPRN_SRR0
> +	mfspr	r12,SPRN_SRR1
> +	li	r10,0xc00
> +	std	r10,HSTATE_SCRATCH0(r13)
> +
> +.balign	IFETCH_ALIGN_BYTES
> +kvmppc_p9_exit_interrupt:
> +	std     r1,HSTATE_SCRATCH1(r13)
> +	std     r3,HSTATE_SCRATCH2(r13)
> +	ld	r1,HSTATE_HOST_R1(r13)
> +	ld	r3,STACK_SLOT_VCPU(r1)
> +
> +	std	r9,VCPU_CR(r3)
> +
> +1:
> +	std	r11,VCPU_PC(r3)
> +	std	r12,VCPU_MSR(r3)
> +
> +	reg = 14
> +	.rept	18
> +	std	reg,__VCPU_GPR(reg)(r3)
> +	reg = reg + 1
> +	.endr
> +
> +	/* r1, r3, r9-r13 are saved to vcpu by C code */

If we just saved r1 and r3, why don't we put them in the vcpu right now?
That would avoid having the C code knowing about scratch areas.

> +	std	r0,VCPU_GPR(R0)(r3)
> +	std	r2,VCPU_GPR(R2)(r3)
> +	reg = 4
> +	.rept	5
> +	std	reg,__VCPU_GPR(reg)(r3)
> +	reg = reg + 1
> +	.endr
> +
> +	ld	r2,PACATOC(r13)
> +
> +	mflr	r4
> +	std	r4,VCPU_LR(r3)
> +	mfspr	r4,SPRN_XER
> +	std	r4,VCPU_XER(r3)
> +
> +	reg = 14
> +	.rept	18
> +	ld	reg,STACK_SLOT_NVGPRS + ((reg - 14) * 8)(r1)
> +	reg = reg + 1
> +	.endr
> +
> +	lwz	r4,SFS+8(r1)
> +	mtcr	r4
> +
> +	/*
> +	 * Flush the link stack here, before executing the first blr on the
> +	 * way out of the guest.
> +	 *
> +	 * The link stack won't match coming out of the guest anyway so the
> +	 * only cost is the flush itself. The call clobbers r0.
> +	 */
> +1:	nop
> +	patch_site 1b patch__call_kvm_flush_link_stack_2
> +
> +	addi	r1,r1,SFS
> +	ld	r0,PPC_LR_STKOFF(r1)
> +	mtlr	r0
> +	blr
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 1997cf347d3e..28a2761515e3 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1421,6 +1421,8 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
>  	 */
>  	case BOOK3S_INTERRUPT_H_DATA_STORAGE:
>  		r = RESUME_PAGE_FAULT;
> +		if (vcpu->arch.fault_dsisr == HDSISR_CANARY)
> +			r = RESUME_GUEST; /* Just retry if it's the canary */
>  		break;
>  	case BOOK3S_INTERRUPT_H_INST_STORAGE:
>  		vcpu->arch.fault_dar = kvmppc_get_pc(vcpu);
> @@ -3736,14 +3738,14 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>  		vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
>  		vcpu->arch.psscr = mfspr(SPRN_PSSCR_PR);
>  		mtspr(SPRN_PSSCR_PR, host_psscr);
> -
>  		/* H_CEDE has to be handled now, not later */
> -		if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
> +		if (trap == BOOK3S_INTERRUPT_SYSCALL &&
>  		    kvmppc_get_gpr(vcpu, 3) == H_CEDE) {
>  			kvmppc_cede(vcpu);
>  			kvmppc_set_gpr(vcpu, 3, 0);
>  			trap = 0;
>  		}
> +
>  	} else {
>  		kvmppc_xive_push_vcpu(vcpu);
>  		trap = kvmhv_load_hv_regs_and_go(vcpu, time_limit, lpcr);
> @@ -3768,9 +3770,10 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>  			}
>  		}
>  		kvmppc_xive_pull_vcpu(vcpu);
> +
> +		vcpu->arch.slb_max = 0;
>  	}
>
> -	vcpu->arch.slb_max = 0;
>  	dec = mfspr(SPRN_DEC);
>  	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
>  		dec = (s32) dec;
> @@ -4429,11 +4432,19 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
>  		else
>  			r = kvmppc_run_vcpu(vcpu);
>
> -		if (run->exit_reason == KVM_EXIT_PAPR_HCALL &&
> -		    !(vcpu->arch.shregs.msr & MSR_PR)) {
> -			trace_kvm_hcall_enter(vcpu);
> -			r = kvmppc_pseries_do_hcall(vcpu);
> -			trace_kvm_hcall_exit(vcpu, r);
> +		if (run->exit_reason == KVM_EXIT_PAPR_HCALL) {
> +			if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) {
> +				/*
> +				 * Guest userspace executed sc 1, reflect it
> +				 * back as a privileged program check interrupt.
> +				 */
> +				kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
> +				r = RESUME_GUEST;

This is in conflict with this snippet in kvmppc_handle_exit_hv:

	case BOOK3S_INTERRUPT_SYSCALL:
	{
		/* hcall - punt to userspace */
		int i;

		/* hypercall with MSR_PR has already been handled in rmode,
		 * and never reaches here.
		 */

That function already queues some 0x700s so maybe we could move this one
in there as well.

> +			} else {
> +				trace_kvm_hcall_enter(vcpu);
> +				r = kvmppc_pseries_do_hcall(vcpu);
> +				trace_kvm_hcall_exit(vcpu, r);
> +			}
>  			kvmppc_core_prepare_to_enter(vcpu);
>  		} else if (r == RESUME_PAGE_FAULT) {
>  			srcu_idx = srcu_read_lock(&kvm->srcu);
> diff --git a/arch/powerpc/kvm/book3s_hv_interrupt.c b/arch/powerpc/kvm/book3s_hv_interrupt.c
> new file mode 100644
> index 000000000000..5a7b036c447f
> --- /dev/null
> +++ b/arch/powerpc/kvm/book3s_hv_interrupt.c
> @@ -0,0 +1,221 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/kernel.h>
> +#include <linux/kvm_host.h>
> +#include <asm/asm-prototypes.h>
> +#include <asm/dbell.h>
> +#include <asm/kvm_ppc.h>
> +
> +#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
> +static void __start_timing(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
> +{
> +	struct kvmppc_vcore *vc = vcpu->arch.vcore;
> +	u64 tb = mftb() - vc->tb_offset_applied;
> +
> +	vcpu->arch.cur_activity = next;
> +	vcpu->arch.cur_tb_start = tb;
> +}
> +
> +static void __accumulate_time(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
> +{
> +	struct kvmppc_vcore *vc = vcpu->arch.vcore;
> +	struct kvmhv_tb_accumulator *curr;
> +	u64 tb = mftb() - vc->tb_offset_applied;
> +	u64 prev_tb;
> +	u64 delta;
> +	u64 seq;
> +
> +	curr = vcpu->arch.cur_activity;
> +	vcpu->arch.cur_activity = next;
> +	prev_tb = vcpu->arch.cur_tb_start;
> +	vcpu->arch.cur_tb_start = tb;
> +
> +	if (!curr)
> +		return;
> +
> +	delta = tb - prev_tb;
> +
> +	seq = curr->seqcount;
> +	curr->seqcount = seq + 1;
> +	smp_wmb();
> +	curr->tb_total += delta;
> +	if (seq == 0 || delta < curr->tb_min)
> +		curr->tb_min = delta;
> +	if (delta > curr->tb_max)
> +		curr->tb_max = delta;
> +	smp_wmb();
> +	curr->seqcount = seq + 2;
> +}
> +
> +#define start_timing(vcpu, next) __start_timing(vcpu, next)
> +#define end_timing(vcpu) __start_timing(vcpu, NULL)
> +#define accumulate_time(vcpu, next) __accumulate_time(vcpu, next)
> +#else
> +#define start_timing(vcpu, next) do {} while (0)
> +#define end_timing(vcpu) do {} while (0)
> +#define accumulate_time(vcpu, next) do {} while (0)
> +#endif
> +
> +static inline void mfslb(unsigned int idx, u64 *slbee, u64 *slbev)
> +{
> +	asm volatile("slbmfev  %0,%1" : "=r" (*slbev) : "r" (idx));
> +	asm volatile("slbmfee  %0,%1" : "=r" (*slbee) : "r" (idx));
> +}
> +
> +static inline void mtslb(unsigned int idx, u64 slbee, u64 slbev)
> +{
> +	BUG_ON((slbee & 0xfff) != idx);
> +
> +	asm volatile("slbmte %0,%1" :: "r" (slbev), "r" (slbee));
> +}
> +
> +static inline void slb_invalidate(unsigned int ih)
> +{
> +	asm volatile("slbia %0" :: "i"(ih));
> +}
> +
> +/*
> + * Malicious or buggy radix guests may have inserted SLB entries
> + * (only 0..3 because radix always runs with UPRT=1), so these must
> + * be cleared here to avoid side-channels. slbmte is used rather
> + * than slbia, as it won't clear cached translations.
> + */
> +static void radix_clear_slb(void)
> +{
> +	u64 slbee, slbev;
> +	int i;
> +
> +	for (i = 0; i < 4; i++) {
> +		mfslb(i, &slbee, &slbev);
> +		if (unlikely(slbee || slbev)) {
> +			slbee = i;
> +			slbev = 0;
> +			mtslb(i, slbee, slbev);
> +		}
> +	}
> +}
> +
> +int __kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu)
> +{
> +	u64 *exsave;
> +	unsigned long msr = mfmsr();
> +	int trap;
> +
> +	start_timing(vcpu, &vcpu->arch.rm_entry);
> +
> +	vcpu->arch.ceded = 0;
> +
> +	WARN_ON_ONCE(vcpu->arch.shregs.msr & MSR_HV);
> +	WARN_ON_ONCE(!(vcpu->arch.shregs.msr & MSR_ME));
> +
> +	mtspr(SPRN_HSRR0, vcpu->arch.regs.nip);
> +	mtspr(SPRN_HSRR1, (vcpu->arch.shregs.msr & ~MSR_HV) | MSR_ME);
> +
> +	/*
> +	 * On POWER9 DD2.1 and below, sometimes on a Hypervisor Data Storage
> +	 * Interrupt (HDSI) the HDSISR is not be updated at all.

s/be //

> +	 *
> +	 * To work around this we put a canary value into the HDSISR before
> +	 * returning to a guest and then check for this canary when we take a
> +	 * HDSI. If we find the canary on a HDSI, we know the hardware didn't
> +	 * update the HDSISR. In this case we return to the guest to retake the
> +	 * HDSI which should correctly update the HDSISR the second time HDSI
> +	 * entry.
> +	 *
> +	 * Just do this on all p9 processors for now.
> +	 */
> +	mtspr(SPRN_HDSISR, HDSISR_CANARY);
> +
> +	accumulate_time(vcpu, &vcpu->arch.guest_time);
> +
> +	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_GUEST_HV_FAST;
> +	kvmppc_p9_enter_guest(vcpu);
> +	// Radix host and guest means host never runs with guest MMU state
> +	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
> +
> +	accumulate_time(vcpu, &vcpu->arch.rm_intr);
> +
> +	/* Get these from r11/12 and paca exsave */
> +	vcpu->arch.shregs.srr0 = mfspr(SPRN_SRR0);
> +	vcpu->arch.shregs.srr1 = mfspr(SPRN_SRR1);
> +	vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
> +	vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
> +
> +	trap = local_paca->kvm_hstate.scratch0 & ~0x2;

Couldn't we return the trap from kvmppc_p9_enter_guest? Seems like a
nice pattern to have and avoids exposing the IVEC+0x2 magic which is hidden
in asm already.

> +	if (likely(trap > BOOK3S_INTERRUPT_MACHINE_CHECK)) {
> +		exsave = local_paca->exgen;
> +	} else if (trap == BOOK3S_INTERRUPT_SYSTEM_RESET) {
> +		exsave = local_paca->exnmi;
> +	} else { /* trap == 0x200 */
> +		exsave = local_paca->exmc;
> +	}
> +
> +	vcpu->arch.regs.gpr[1] = local_paca->kvm_hstate.scratch1;
> +	vcpu->arch.regs.gpr[3] = local_paca->kvm_hstate.scratch2;
> +	vcpu->arch.regs.gpr[9] = exsave[EX_R9/sizeof(u64)];
> +	vcpu->arch.regs.gpr[10] = exsave[EX_R10/sizeof(u64)];
> +	vcpu->arch.regs.gpr[11] = exsave[EX_R11/sizeof(u64)];
> +	vcpu->arch.regs.gpr[12] = exsave[EX_R12/sizeof(u64)];
> +	vcpu->arch.regs.gpr[13] = exsave[EX_R13/sizeof(u64)];
> +	vcpu->arch.ppr = exsave[EX_PPR/sizeof(u64)];
> +	vcpu->arch.cfar = exsave[EX_CFAR/sizeof(u64)];
> +	vcpu->arch.regs.ctr = exsave[EX_CTR/sizeof(u64)];
> +
> +	vcpu->arch.last_inst = KVM_INST_FETCH_FAILED;
> +
> +	if (unlikely(trap == BOOK3S_INTERRUPT_MACHINE_CHECK)) {
> +		vcpu->arch.fault_dar = exsave[EX_DAR/sizeof(u64)];
> +		vcpu->arch.fault_dsisr = exsave[EX_DSISR/sizeof(u64)];
> +		kvmppc_realmode_machine_check(vcpu);
> +
> +	} else if (unlikely(trap == BOOK3S_INTERRUPT_HMI)) {
> +		kvmppc_realmode_hmi_handler();
> +
> +	} else if (trap == BOOK3S_INTERRUPT_H_EMUL_ASSIST) {
> +		vcpu->arch.emul_inst = mfspr(SPRN_HEIR);
> +
> +	} else if (trap == BOOK3S_INTERRUPT_H_DATA_STORAGE) {
> +		vcpu->arch.fault_dar = exsave[EX_DAR/sizeof(u64)];
> +		vcpu->arch.fault_dsisr = exsave[EX_DSISR/sizeof(u64)];
> +		vcpu->arch.fault_gpa = mfspr(SPRN_ASDR);
> +
> +	} else if (trap == BOOK3S_INTERRUPT_H_INST_STORAGE) {
> +		vcpu->arch.fault_gpa = mfspr(SPRN_ASDR);
> +
> +	} else if (trap == BOOK3S_INTERRUPT_H_FAC_UNAVAIL) {
> +		vcpu->arch.hfscr = mfspr(SPRN_HFSCR);
> +
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +	/*
> +	 * Softpatch interrupt for transactional memory emulation cases
> +	 * on POWER9 DD2.2.  This is early in the guest exit path - we
> +	 * haven't saved registers or done a treclaim yet.
> +	 */
> +	} else if (trap == BOOK3S_INTERRUPT_HV_SOFTPATCH) {
> +		vcpu->arch.emul_inst = mfspr(SPRN_HEIR);
> +
> +		/*
> +		 * The cases we want to handle here are those where the guest
> +		 * is in real suspend mode and is trying to transition to
> +		 * transactional mode.
> +		 */
> +		if (local_paca->kvm_hstate.fake_suspend &&
> +				(vcpu->arch.shregs.msr & MSR_TS_S)) {
> +			if (kvmhv_p9_tm_emulation_early(vcpu)) {
> +				/* Prevent it being handled again. */
> +				trap = 0;
> +			}
> +		}
> +#endif
> +	}
> +
> +	radix_clear_slb();
> +
> +	__mtmsrd(msr, 0);

Isn't this the same as mtmsr(msr) ?

> +
> +	accumulate_time(vcpu, &vcpu->arch.rm_exit);
> +
> +	end_timing(vcpu);
> +
> +	return trap;
> +}
> +EXPORT_SYMBOL_GPL(__kvmhv_vcpu_entry_p9);

<snip>

^ permalink raw reply

* Re: [PATCH v2 01/37] KVM: PPC: Book3S 64: remove unused kvmppc_h_protect argument
From: Nicholas Piggin @ 2021-02-26 23:50 UTC (permalink / raw)
  To: Daniel Axtens, kvm-ppc; +Cc: linuxppc-dev
In-Reply-To: <878s7ba0cm.fsf@linkitivity.dja.id.au>

Excerpts from Daniel Axtens's message of February 26, 2021 3:01 pm:
> Hi Nick,
> 
>> The va argument is not used in the function or set by its asm caller,
>> so remove it to be safe.
> 
> Huh, so it isn't. I tracked the original implementation down to commit
> a8606e20e41a ("KVM: PPC: Handle some PAPR hcalls in the kernel") where
> paulus first added the ability to handle it in the kernel - there it
> takes a va argument but even then doesn't do anything with it.
> 
> ajd also pointed out that we don't pass a va when linux is running as a
> guest, and LoPAR does not mention va as an argument.

Yeah interesting, maybe it was from a pre-release version of PAPR? Who
knows.

> One small nit: checkpatch is complaining about spaces vs tabs:
> ERROR: code indent should use tabs where possible
> #25: FILE: arch/powerpc/include/asm/kvm_ppc.h:770:
> +                      unsigned long pte_index, unsigned long avpn);$
> 
> WARNING: please, no spaces at the start of a line
> #25: FILE: arch/powerpc/include/asm/kvm_ppc.h:770:
> +                      unsigned long pte_index, unsigned long avpn);$

All the declarations are using the same style in this file so I think
I'll leave it for someone to do a cleanup patch on. Okay?

> 
> Once that is resolved,
>   Reviewed-by: Daniel Axtens <dja@axtens.net>

Thanks,
Nick

> 
> Kind regards,
> Daniel Axtens
> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  arch/powerpc/include/asm/kvm_ppc.h  | 3 +--
>>  arch/powerpc/kvm/book3s_hv_rm_mmu.c | 3 +--
>>  2 files changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
>> index 8aacd76bb702..9531b1c1b190 100644
>> --- a/arch/powerpc/include/asm/kvm_ppc.h
>> +++ b/arch/powerpc/include/asm/kvm_ppc.h
>> @@ -767,8 +767,7 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
>>                       unsigned long pte_index, unsigned long avpn);
>>  long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu);
>>  long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
>> -                      unsigned long pte_index, unsigned long avpn,
>> -                      unsigned long va);
>> +                      unsigned long pte_index, unsigned long avpn);
>>  long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
>>                     unsigned long pte_index);
>>  long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
>> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
>> index 88da2764c1bb..7af7c70f1468 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
>> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
>> @@ -673,8 +673,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
>>  }
>>  
>>  long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
>> -		      unsigned long pte_index, unsigned long avpn,
>> -		      unsigned long va)
>> +		      unsigned long pte_index, unsigned long avpn)
>>  {
>>  	struct kvm *kvm = vcpu->kvm;
>>  	__be64 *hpte;
>> -- 
>> 2.23.0
> 

^ permalink raw reply

* Re: [PATCH v2 04/37] powerpc/64s: remove KVM SKIP test from instruction breakpoint handler
From: Nicholas Piggin @ 2021-02-26 23:51 UTC (permalink / raw)
  To: Daniel Axtens, kvm-ppc; +Cc: linuxppc-dev, Fabiano Rosas
In-Reply-To: <8735xj9yd4.fsf@linkitivity.dja.id.au>

Excerpts from Daniel Axtens's message of February 26, 2021 3:44 pm:
> Nicholas Piggin <npiggin@gmail.com> writes:
> 
>> The code being executed in KVM_GUEST_MODE_SKIP is hypervisor code with
>> MSR[IR]=0, so the faults of concern are the d-side ones caused by access
>> to guest context by the hypervisor.
>>
>> Instruction breakpoint interrupts are not a concern here. It's unlikely
>> any good would come of causing breaks in this code, but skipping the
>> instruction that caused it won't help matters (e.g., skip the mtmsr that
>> sets MSR[DR]=0 or clears KVM_GUEST_MODE_SKIP).
> 
> I'm not entirely clear on the example here, but the patch makes sense
> and I can follow your logic for removing the IKVM_SKIP handler from the
> instruction breakpoint exception.

The example just means that a breakpoint interrupt on any instruction 
inside the guest mode skip region would be skipped, and skipping one of 
those (mtmsrd or store that gets us out of guest mode skip) would cause 
a crash.

Thanks,
Nick

> 
> On that basis:
> Reviewed-by: Daniel Axtens <dja@axtens.net>
> 
> Kind regards,
> Daniel
> 
>>
>> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  arch/powerpc/kernel/exceptions-64s.S | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index a027600beeb1..0097e0676ed7 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -2553,7 +2553,6 @@ EXC_VIRT_NONE(0x5200, 0x100)
>>  INT_DEFINE_BEGIN(instruction_breakpoint)
>>  	IVEC=0x1300
>>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>> -	IKVM_SKIP=1
>>  	IKVM_REAL=1
>>  #endif
>>  INT_DEFINE_END(instruction_breakpoint)
>> -- 
>> 2.23.0
> 

^ permalink raw reply

* Re: [PATCH v2 05/37] KVM: PPC: Book3S HV: Ensure MSR[ME] is always set in guest MSR
From: Nicholas Piggin @ 2021-02-26 23:55 UTC (permalink / raw)
  To: Daniel Axtens, kvm-ppc; +Cc: linuxppc-dev, Fabiano Rosas
In-Reply-To: <87zgzr8is2.fsf@linkitivity.dja.id.au>

Excerpts from Daniel Axtens's message of February 26, 2021 4:06 pm:
> Hi Nick,
> 
>>  void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
>>  {
>> +	/*
>> +	 * Guest must always run with machine check interrupt
>> +	 * enabled.
>> +	 */
>> +	if (!(msr & MSR_ME))
>> +		msr |= MSR_ME;
> 
> This 'if' is technically redundant but you mention a future patch warning
> on !(msr & MSR_ME) so I'm holding off on any judgement about the 'if' until
> I get to that patch :)

That's true. The warning is actually further down when we're setting up 
the msr to run in guest mode. At this point the MSR I think comes from
qemu (and arguably the guest setup code shouldn't need to know about HV
specific MSR bits) so a warning here wouldn't be appropriate.

I could remove the if, although the compiler might already do that.

> 
> The patch seems sane to me, I agree that we don't want guests running with
> MSR_ME=0 and kvmppc_set_msr_hv already ensures that the transactional state is
> sane so this is another sanity-enforcement in the same sort of vein.
> 
> All up:
> Reviewed-by: Daniel Axtens <dja@axtens.net>

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH v2 13/37] KVM: PPC: Book3S HV P9: Move radix MMU switching instructions together
From: Nicholas Piggin @ 2021-02-26 23:57 UTC (permalink / raw)
  To: Fabiano Rosas, kvm-ppc; +Cc: linuxppc-dev
In-Reply-To: <875z2eyg9r.fsf@linux.ibm.com>

Excerpts from Fabiano Rosas's message of February 27, 2021 1:56 am:
> Nicholas Piggin <npiggin@gmail.com> writes:
> 
>> Switching the MMU from radix<->radix mode is tricky particularly as the
>> MMU can remain enabled and requires a certain sequence of SPR updates.
>> Move these together into their own functions.
>>
>> This also includes the radix TLB check / flush because it's tied in to
>> MMU switching due to tlbiel getting LPID from LPIDR.
>>
>> (XXX: isync / hwsync synchronisation TBD)
> 
> I see bot mtlpidr and mtlpcr requiring a CSI in the ISA. Do you say we
> might need more than an isync?

We might need a CSI before it, we might also need a hwsync before it. I
don't know whether we need isyncs between any of them (I don't think we
should because they're all mtsprs).


> 
> Regardless, I'd expect that to be separate from the refactoring here,
> so:
> 
> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

Thanks,
Nick
> 
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  arch/powerpc/kvm/book3s_hv.c | 55 +++++++++++++++++++++---------------
>>  1 file changed, 32 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 895090636295..23d6dc04b0e9 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -3440,12 +3440,38 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>>  	trace_kvmppc_run_core(vc, 1);
>>  }
>>
>> +static void switch_mmu_to_guest_radix(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
>> +{
>> +	struct kvmppc_vcore *vc = vcpu->arch.vcore;
>> +	struct kvm_nested_guest *nested = vcpu->arch.nested;
>> +	u32 lpid;
>> +
>> +	lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
>> +
>> +	mtspr(SPRN_LPID, lpid);
>> +	mtspr(SPRN_LPCR, lpcr);
>> +	mtspr(SPRN_PID, vcpu->arch.pid);
>> +	isync();
>> +
>> +	/* TLBIEL must have LPIDR set, so set guest LPID before flushing. */
>> +	kvmppc_check_need_tlb_flush(kvm, vc->pcpu, nested);
>> +}
>> +
>> +static void switch_mmu_to_host_radix(struct kvm *kvm, u32 pid)
>> +{
>> +	mtspr(SPRN_PID, pid);
>> +	mtspr(SPRN_LPID, kvm->arch.host_lpid);
>> +	mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
>> +	isync();
>> +}
>> +
>>  /*
>>   * Load up hypervisor-mode registers on P9.
>>   */
>>  static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>  				     unsigned long lpcr)
>>  {
>> +	struct kvm *kvm = vcpu->kvm;
>>  	struct kvmppc_vcore *vc = vcpu->arch.vcore;
>>  	s64 hdec;
>>  	u64 tb, purr, spurr;
>> @@ -3468,12 +3494,12 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>  	 * P8 and P9 suppress the HDEC exception when LPCR[HDICE] = 0,
>>  	 * so set HDICE before writing HDEC.
>>  	 */
>> -	mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr | LPCR_HDICE);
>> +	mtspr(SPRN_LPCR, kvm->arch.host_lpcr | LPCR_HDICE);
>>  	isync();
>>
>>  	hdec = time_limit - mftb();
>>  	if (hdec < 0) {
>> -		mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr);
>> +		mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
>>  		isync();
>>  		return BOOK3S_INTERRUPT_HV_DECREMENTER;
>>  	}
>> @@ -3508,7 +3534,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>  	}
>>  	mtspr(SPRN_CIABR, vcpu->arch.ciabr);
>>  	mtspr(SPRN_IC, vcpu->arch.ic);
>> -	mtspr(SPRN_PID, vcpu->arch.pid);
>>
>>  	mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
>>  	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
>> @@ -3522,8 +3547,7 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>
>>  	mtspr(SPRN_AMOR, ~0UL);
>>
>> -	mtspr(SPRN_LPCR, lpcr);
>> -	isync();
>> +	switch_mmu_to_guest_radix(kvm, vcpu, lpcr);
>>
>>  	kvmppc_xive_push_vcpu(vcpu);
>>
>> @@ -3562,7 +3586,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>  		mtspr(SPRN_DAWR1, host_dawr1);
>>  		mtspr(SPRN_DAWRX1, host_dawrx1);
>>  	}
>> -	mtspr(SPRN_PID, host_pidr);
>>
>>  	/*
>>  	 * Since this is radix, do a eieio; tlbsync; ptesync sequence in
>> @@ -3577,9 +3600,6 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>  	if (cpu_has_feature(CPU_FTR_ARCH_31))
>>  		asm volatile(PPC_CP_ABORT);
>>
>> -	mtspr(SPRN_LPID, vcpu->kvm->arch.host_lpid);	/* restore host LPID */
>> -	isync();
>> -
>>  	vc->dpdes = mfspr(SPRN_DPDES);
>>  	vc->vtb = mfspr(SPRN_VTB);
>>  	mtspr(SPRN_DPDES, 0);
>> @@ -3596,7 +3616,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
>>  	}
>>
>>  	mtspr(SPRN_HDEC, 0x7fffffff);
>> -	mtspr(SPRN_LPCR, vcpu->kvm->arch.host_lpcr);
>> +
>> +	switch_mmu_to_host_radix(kvm, host_pidr);
>>
>>  	return trap;
>>  }
>> @@ -4130,7 +4151,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>  {
>>  	struct kvm_run *run = vcpu->run;
>>  	int trap, r, pcpu;
>> -	int srcu_idx, lpid;
>> +	int srcu_idx;
>>  	struct kvmppc_vcore *vc;
>>  	struct kvm *kvm = vcpu->kvm;
>>  	struct kvm_nested_guest *nested = vcpu->arch.nested;
>> @@ -4204,13 +4225,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>  	vc->vcore_state = VCORE_RUNNING;
>>  	trace_kvmppc_run_core(vc, 0);
>>
>> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
>> -		lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
>> -		mtspr(SPRN_LPID, lpid);
>> -		isync();
>> -		kvmppc_check_need_tlb_flush(kvm, pcpu, nested);
>> -	}
>> -
>>  	guest_enter_irqoff();
>>
>>  	srcu_idx = srcu_read_lock(&kvm->srcu);
>> @@ -4229,11 +4243,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>
>>  	srcu_read_unlock(&kvm->srcu, srcu_idx);
>>
>> -	if (cpu_has_feature(CPU_FTR_HVMODE)) {
>> -		mtspr(SPRN_LPID, kvm->arch.host_lpid);
>> -		isync();
>> -	}
>> -
>>  	set_irq_happened(trap);
>>
>>  	kvmppc_set_host_core(pcpu);
> 

^ permalink raw reply

* Re: [PATCH v2 16/37] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
From: Nicholas Piggin @ 2021-02-26 23:59 UTC (permalink / raw)
  To: Cédric Le Goater, kvm-ppc; +Cc: linuxppc-dev
In-Reply-To: <47ae7b2f-9356-6cff-da38-142eaea773ca@kaod.org>

Excerpts from Cédric Le Goater's message of February 26, 2021 12:51 am:
> On 2/25/21 2:46 PM, Nicholas Piggin wrote:
>> In the interest of minimising the amount of code that is run in
>> "real-mode", don't handle hcalls in real mode in the P9 path.
>> 
>> POWER8 and earlier are much more expensive to exit from HV real mode
>> and switch to host mode, because on those processors HV interrupts get
>> to the hypervisor with the MMU off, and the other threads in the core
>> need to be pulled out of the guest, and SLBs all need to be saved,
>> ERATs invalidated, and host SLB reloaded before the MMU is re-enabled
>> in host mode. Hash guests also require a lot of hcalls to run. The
>> XICS interrupt controller requires hcalls to run.
>> 
>> By contrast, POWER9 has independent thread switching, and in radix mode
>> the hypervisor is already in a host virtual memory mode when the HV
>> interrupt is taken. Radix + xive guests don't need hcalls to handle
>> interrupts or manage translations.
>> 
>> So it's much less important to handle hcalls in real mode in P9.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  arch/powerpc/include/asm/kvm_ppc.h      |  5 +++++
>>  arch/powerpc/kvm/book3s_hv.c            | 25 ++++++++++++++++++++++---
>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |  5 +++++
>>  arch/powerpc/kvm/book3s_xive.c          | 25 +++++++++++++++++++++++++
>>  4 files changed, 57 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
>> index 73b1ca5a6471..db6646c2ade2 100644
>> --- a/arch/powerpc/include/asm/kvm_ppc.h
>> +++ b/arch/powerpc/include/asm/kvm_ppc.h
>> @@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm);
>>  extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall);
>>  extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
>>  extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd);
>> +extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req);
>>  extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu);
>>  extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
>>  extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
>> @@ -639,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
>>  static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
>>  static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
>>  	{ return 0; }
>> +static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req)
>> +	{ return 0; }
>>  #endif
>>  
>>  #ifdef CONFIG_KVM_XIVE
>> @@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
>>  			       int level, bool line_status);
>>  extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
>>  extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu);
>> +extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu);
> 
> I can not find this routine. Is it missing or coming later in the patchset ? 

Yeah it leaked into a later patch but it belongs here. I'll fix it.

Thanks,
Nick

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox