* [RFCv2 1/4] pseries: Add hypercall wrappers for hash page table resizing
2016-01-11 5:52 [RFCv2 0/4] Prototype PAPR hash page table resizing (guest side) David Gibson
@ 2016-01-11 5:52 ` David Gibson
2016-01-11 5:52 ` [RFCv2 2/4] pseries: Add support for hash " David Gibson
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2016-01-11 5:52 UTC (permalink / raw)
To: paulus, benh, michael, bharata; +Cc: thuth, lvivier, linuxppc-dev, David Gibson
This adds the hypercall numbers and wrapper functions for the hash page
table resizing hypercalls.
These are experimental "platform specific" values for now, until we have a
formal PAPR update.
It also adds a new firmware feature flag to track the presence of the
HPT resizing calls.
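As a rough illustration of how a feature flag like this is derived from a firmware-advertised name, here is a minimal user-space sketch. The two flag values and name strings are copied from this patch; the table type, the helper names (fw_feature_lookup, fw_features_from_names, fw_has_feature) and the exact-match lookup are assumptions made for the example, not the kernel's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values as they appear in this patch. */
#define FW_FEATURE_HPT_RESIZE 0x0000000001000000ULL
#define FW_FEATURE_VPHN       0x0000000004000000ULL

/* Hypothetical table entry: one feature bit per advertised name. */
struct fw_feature_entry {
	uint64_t    flag;
	const char *name;
};

static const struct fw_feature_entry fw_feature_table[] = {
	{ FW_FEATURE_VPHN,       "hcall-vphn" },
	{ FW_FEATURE_HPT_RESIZE, "hcall-hpt-resize" },
};

/* Return the feature bit for an advertised name, or 0 if it is unknown.
 * Assumption: exact string match only. */
static uint64_t fw_feature_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(fw_feature_table) / sizeof(fw_feature_table[0]); i++)
		if (strcmp(fw_feature_table[i].name, name) == 0)
			return fw_feature_table[i].flag;
	return 0;
}

/* OR together the bits for every recognised name in the list. */
static uint64_t fw_features_from_names(const char *const *names, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= fw_feature_lookup(names[i]);
	return mask;
}

/* Non-zero if the given feature bit is set in the mask. */
static int fw_has_feature(uint64_t features, uint64_t flag)
{
	return (features & flag) != 0;
}
```

A caller would build the mask once from the names the firmware reports and then gate the resize path on fw_has_feature(mask, FW_FEATURE_HPT_RESIZE).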
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/include/asm/firmware.h | 5 +++--
arch/powerpc/include/asm/hvcall.h | 2 ++
arch/powerpc/include/asm/plpar_wrappers.h | 12 ++++++++++++
arch/powerpc/platforms/pseries/firmware.c | 1 +
4 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
index e05808a..339f71d 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -42,7 +42,7 @@
#define FW_FEATURE_SPLPAR ASM_CONST(0x0000000000100000)
#define FW_FEATURE_LPAR ASM_CONST(0x0000000000400000)
#define FW_FEATURE_PS3_LV1 ASM_CONST(0x0000000000800000)
-/* Free ASM_CONST(0x0000000001000000) */
+#define FW_FEATURE_HPT_RESIZE ASM_CONST(0x0000000001000000)
#define FW_FEATURE_CMO ASM_CONST(0x0000000002000000)
#define FW_FEATURE_VPHN ASM_CONST(0x0000000004000000)
#define FW_FEATURE_XCMO ASM_CONST(0x0000000008000000)
@@ -68,7 +68,8 @@ enum {
FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
- FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
+ FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
+ FW_FEATURE_HPT_RESIZE,
FW_FEATURE_PSERIES_ALWAYS = 0,
FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_OPALv2 |
FW_FEATURE_OPALv3,
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 85bc8c0..ae1fcb7 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -273,6 +273,8 @@
/* Platform specific hcalls, used by KVM */
#define H_RTAS 0xf000
+#define H_RESIZE_HPT_PREPARE 0xf003
+#define H_RESIZE_HPT_COMMIT 0xf004
/* "Platform specific hcalls", provided by PHYP */
#define H_GET_24X7_CATALOG_PAGE 0xF078
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 67859ed..8f1d8fe 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -225,6 +225,18 @@ static inline long plpar_pte_protect(unsigned long flags, unsigned long ptex,
return plpar_hcall_norets(H_PROTECT, flags, ptex, avpn);
}
+static inline long plpar_resize_hpt_prepare(unsigned long flags,
+ unsigned long shift)
+{
+ return plpar_hcall_norets(H_RESIZE_HPT_PREPARE, flags, shift);
+}
+
+static inline long plpar_resize_hpt_commit(unsigned long flags,
+ unsigned long shift)
+{
+ return plpar_hcall_norets(H_RESIZE_HPT_COMMIT, flags, shift);
+}
+
static inline long plpar_tce_get(unsigned long liobn, unsigned long ioba,
unsigned long *tce_ret)
{
diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c
index 8c80588..7b287be 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -63,6 +63,7 @@ hypertas_fw_features_table[] = {
{FW_FEATURE_VPHN, "hcall-vphn"},
{FW_FEATURE_SET_MODE, "hcall-set-mode"},
{FW_FEATURE_BEST_ENERGY, "hcall-best-energy-1*"},
+ {FW_FEATURE_HPT_RESIZE, "hcall-hpt-resize"},
};
/* Build up the firmware features bitmask using the contents of
--
2.5.0
* [RFCv2 2/4] pseries: Add support for hash table resizing
2016-01-11 5:52 [RFCv2 0/4] Prototype PAPR hash page table resizing (guest side) David Gibson
2016-01-11 5:52 ` [RFCv2 1/4] pseries: Add hypercall wrappers for hash page table resizing David Gibson
@ 2016-01-11 5:52 ` David Gibson
2016-01-11 5:52 ` [RFCv2 3/4] pseries: debugfs hook to trigger a hash page table resize David Gibson
2016-01-11 5:52 ` [RFCv2 4/4] pseries: Advertise HPT resizing support via CAS David Gibson
3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2016-01-11 5:52 UTC (permalink / raw)
To: paulus, benh, michael, bharata; +Cc: thuth, lvivier, linuxppc-dev, David Gibson
This adds support for using experimental hypercalls to change the size
of the main hash page table while running as a PAPR guest. For now these
hypercalls are only in experimental qemu versions.
The interface has two parts. First, H_RESIZE_HPT_PREPARE is used to allocate
and prepare the new hash table. This may be slow, but can be done
asynchronously. Then, H_RESIZE_HPT_COMMIT is used to switch to the new
hash table. This requires that no CPUs be concurrently updating the HPT,
and so must be run under stop_machine().
This patch only supplies a function to execute the hash table change;
nothing calls it yet.
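The prepare/commit sequence described above can be sketched in user space against a mock hypervisor. Everything here is hypothetical: the mock_* functions, the MOCK_* return codes, the shift limits and the "three busy returns" behaviour are invented for illustration and are not the real hcall semantics. The sketch only mirrors the control flow: retry prepare while busy, cancel with shift 0 on timeout, then commit.

```c
/* Hypothetical return codes for the mock; not the real hcall values. */
enum { MOCK_SUCCESS = 0, MOCK_BUSY_10MS = 1, MOCK_PARAMETER = -4 };

/* Hypothetical error codes returned by the sketch itself. */
enum { SKETCH_OK = 0, SKETCH_EINVAL = -1, SKETCH_ETIMEDOUT = -2 };

struct mock_hv {
	unsigned long current_shift; /* size of the active table */
	unsigned long pending_shift; /* table being prepared, 0 if none */
	int work_left;               /* busy returns still to come */
};

/* Mock prepare: shift 0 cancels; otherwise report busy three times
 * before the new table is ready. Limits 18..46 are arbitrary here. */
static long mock_prepare(struct mock_hv *hv, unsigned long shift)
{
	if (shift == 0) {
		hv->pending_shift = 0;
		hv->work_left = 0;
		return MOCK_SUCCESS;
	}
	if (shift < 18 || shift > 46)
		return MOCK_PARAMETER;
	if (hv->pending_shift != shift) {
		hv->pending_shift = shift;
		hv->work_left = 3;
	}
	if (hv->work_left > 0) {
		hv->work_left--;
		return MOCK_BUSY_10MS;
	}
	return MOCK_SUCCESS;
}

/* Mock commit: only valid once a matching prepare has completed. */
static long mock_commit(struct mock_hv *hv, unsigned long shift)
{
	if (hv->pending_shift != shift || hv->work_left > 0)
		return MOCK_PARAMETER;
	hv->current_shift = shift;
	hv->pending_shift = 0;
	return MOCK_SUCCESS;
}

static int resize_hpt_sketch(struct mock_hv *hv, unsigned long shift,
			     unsigned int timeout_ms)
{
	unsigned int waited = 0;
	long rc = mock_prepare(hv, shift);

	while (rc == MOCK_BUSY_10MS) {
		waited += 10;
		if (waited > timeout_ms) {
			mock_prepare(hv, 0); /* cancel the pending resize */
			return SKETCH_ETIMEDOUT;
		}
		/* Real code would sleep for the suggested delay here. */
		rc = mock_prepare(hv, shift);
	}
	if (rc != MOCK_SUCCESS)
		return SKETCH_EINVAL;

	/* Real code runs the commit with all other CPUs stopped. */
	return mock_commit(hv, shift) == MOCK_SUCCESS ? SKETCH_OK : SKETCH_EINVAL;
}
```

The split keeps the slow allocation outside the window in which every CPU is held, so only the final switch pays the stop_machine() cost.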
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/platforms/pseries/lpar.c | 109 ++++++++++++++++++++++++++++++++++
1 file changed, 109 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index b7a67e3..f6e7af5 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -27,6 +27,8 @@
#include <linux/console.h>
#include <linux/export.h>
#include <linux/jump_label.h>
+#include <linux/delay.h>
+#include <linux/stop_machine.h>
#include <asm/processor.h>
#include <asm/mmu.h>
#include <asm/page.h>
@@ -794,3 +796,110 @@ int h_get_mpp_x(struct hvcall_mpp_x_data *mpp_x_data)
return rc;
}
+
+#define HPT_RESIZE_TIMEOUT 10000 /* ms */
+
+struct hpt_resize_state {
+ unsigned long shift;
+ int commit_rc;
+};
+
+static int pseries_lpar_resize_hpt_commit(void *data)
+{
+ struct hpt_resize_state *state = data;
+
+ state->commit_rc = plpar_resize_hpt_commit(0, state->shift);
+ if (state->commit_rc != H_SUCCESS)
+ return -EIO;
+
+ /* Hypervisor has transitioned the HTAB, update our globals */
+ ppc64_pft_size = state->shift;
+ htab_size_bytes = 1UL << ppc64_pft_size;
+ htab_hash_mask = (htab_size_bytes >> 7) - 1;
+
+ return 0;
+}
+
+/* Must be called in user context */
+int pseries_lpar_resize_hpt(unsigned long shift)
+{
+ struct hpt_resize_state state = {
+ .shift = shift,
+ .commit_rc = H_FUNCTION,
+ };
+ unsigned int delay, total_delay = 0;
+ int rc;
+ ktime_t t0, t1, t2;
+
+ might_sleep();
+
+ if (!firmware_has_feature(FW_FEATURE_HPT_RESIZE))
+ return -ENODEV;
+
+ printk(KERN_INFO "lpar: Attempting to resize HPT to shift %lu\n",
+ shift);
+
+ t0 = ktime_get();
+
+ rc = plpar_resize_hpt_prepare(0, shift);
+ while (H_IS_LONG_BUSY(rc)) {
+ delay = get_longbusy_msecs(rc);
+ total_delay += delay;
+ if (total_delay > HPT_RESIZE_TIMEOUT) {
+ /* prepare call with shift==0 cancels an
+ * in-progress resize */
+ rc = plpar_resize_hpt_prepare(0, 0);
+ if (rc != H_SUCCESS)
+ printk(KERN_WARNING
+ "lpar: Unexpected error %d cancelling timed out HPT resize\n",
+ rc);
+ return -ETIMEDOUT;
+ }
+ msleep(delay);
+ rc = plpar_resize_hpt_prepare(0, shift);
+ };
+
+ switch (rc) {
+ case H_SUCCESS:
+ /* Continue on */
+ break;
+
+ case H_PARAMETER:
+ return -EINVAL;
+ case H_RESOURCE:
+ return -EPERM;
+ default:
+ printk(KERN_WARNING
+ "lpar: Unexpected error %d from H_RESIZE_HPT_PREPARE\n",
+ rc);
+ return -EIO;
+ }
+
+ t1 = ktime_get();
+
+ rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL);
+
+ t2 = ktime_get();
+
+ if (rc != 0) {
+ switch (state.commit_rc) {
+ case H_PTEG_FULL:
+ printk(KERN_WARNING
+ "lpar: Hash collision while resizing HPT\n");
+ return -ENOSPC;
+
+ default:
+ printk(KERN_WARNING
+ "lpar: Unexpected error %d from H_RESIZE_HPT_COMMIT\n",
+ state.commit_rc);
+ return -EIO;
+ };
+ }
+
+ printk(KERN_INFO
+ "lpar: HPT resize to shift %lu complete (%lld ms / %lld ms)\n",
+ shift, (long long) ktime_ms_delta(t1, t0),
+ (long long) ktime_ms_delta(t2, t1));
+
+ return 0;
+}
--
2.5.0
* [RFCv2 3/4] pseries: debugfs hook to trigger a hash page table resize
2016-01-11 5:52 [RFCv2 0/4] Prototype PAPR hash page table resizing (guest side) David Gibson
2016-01-11 5:52 ` [RFCv2 1/4] pseries: Add hypercall wrappers for hash page table resizing David Gibson
2016-01-11 5:52 ` [RFCv2 2/4] pseries: Add support for hash " David Gibson
@ 2016-01-11 5:52 ` David Gibson
2016-01-11 5:52 ` [RFCv2 4/4] pseries: Advertise HPT resizing support via CAS David Gibson
3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2016-01-11 5:52 UTC (permalink / raw)
To: paulus, benh, michael, bharata; +Cc: thuth, lvivier, linuxppc-dev, David Gibson
This patch adds a special file /sys/kernel/debug/powerpc/pft-size
which can be used to view the current size of the hash page table (as
a bit shift) and to trigger a resize of the hash table on PAPR guests.
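To make the "bit shift" representation concrete, here is a small sketch that converts a shift value into the table size in bytes, the number of PTE groups, and the hash mask. The arithmetic follows the globals updated in patch 2/4 (size = 1 << shift, mask = (size >> 7) - 1, with 128-byte PTE groups); the helper names are made up for this example.

```c
#include <stdint.h>

/* A PTE group is 128 bytes, hence the shift by 7. */
#define PTEG_BYTES_SHIFT 7

/* Total size of the hash page table in bytes. */
static uint64_t hpt_size_bytes(unsigned int shift)
{
	return UINT64_C(1) << shift;
}

/* Number of PTE groups the table holds. */
static uint64_t hpt_pteg_count(unsigned int shift)
{
	return hpt_size_bytes(shift) >> PTEG_BYTES_SHIFT;
}

/* Mask applied to a hash value to select a PTE group. */
static uint64_t hpt_hash_mask(unsigned int shift)
{
	return hpt_pteg_count(shift) - 1;
}
```

For instance, a shift of 24 corresponds to a 16 MiB table with 131072 PTE groups and a hash mask of 0x1ffff; writing 25 to the file would request a table twice that size.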
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/platforms/pseries/lpar.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index f6e7af5..dba9644 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -29,6 +29,7 @@
#include <linux/jump_label.h>
#include <linux/delay.h>
#include <linux/stop_machine.h>
+#include <linux/debugfs.h>
#include <asm/processor.h>
#include <asm/mmu.h>
#include <asm/page.h>
@@ -903,3 +904,28 @@ int pseries_lpar_resize_hpt(unsigned long shift)
return 0;
}
+
+static int ppc64_pft_size_get(void *data, u64 *val)
+{
+ *val = ppc64_pft_size;
+ return 0;
+}
+
+static int ppc64_pft_size_set(void *data, u64 val)
+{
+ return pseries_lpar_resize_hpt(val);
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_ppc64_pft_size,
+ ppc64_pft_size_get, ppc64_pft_size_set, "%llu\n");
+
+static int __init pseries_lpar_debugfs(void)
+{
+ if (!debugfs_create_file("pft-size", 0600, powerpc_debugfs_root,
+ NULL, &fops_ppc64_pft_size)) {
+ pr_err("lpar: unable to create ppc64_pft_size debugfs file\n");
+ }
+
+ return 0;
+}
+machine_device_initcall(pseries, pseries_lpar_debugfs);
--
2.5.0
* [RFCv2 4/4] pseries: Advertise HPT resizing support via CAS
2016-01-11 5:52 [RFCv2 0/4] Prototype PAPR hash page table resizing (guest side) David Gibson
` (2 preceding siblings ...)
2016-01-11 5:52 ` [RFCv2 3/4] pseries: debugfs hook to trigger a hash page table resize David Gibson
@ 2016-01-11 5:52 ` David Gibson
3 siblings, 0 replies; 5+ messages in thread
From: David Gibson @ 2016-01-11 5:52 UTC (permalink / raw)
To: paulus, benh, michael, bharata; +Cc: thuth, lvivier, linuxppc-dev, David Gibson
The hypervisor needs to know a guest is capable of using the HPT resizing
PAPR extension in order to take full advantage of it for memory hotplug.
If the hypervisor knows the guest is HPT resize aware, it can size the
initial HPT based on the initial guest RAM size, relying on the guest to
resize the HPT when more memory is hot-added. Without this, the hypervisor
must size the HPT for the maximum possible guest RAM, which can lead to
a huge waste of space if the guest never actually expands to that maximum
size.
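To give a feel for the scale of that waste, here is a worked sketch. It assumes, purely for illustration, that the hypervisor sizes the HPT at 1/128 of RAM, rounded up to a power of two, with a floor of shift 18 and a ceiling of shift 46; the ratio, the limits and the function name are assumptions of this example and are not stated in the patch.

```c
#include <stdint.h>

/* Illustrative sizing policy only: HPT is about 1/128 of RAM. */
#define RAM_TO_HPT_SHIFT 7
#define HPT_MIN_SHIFT    18
#define HPT_MAX_SHIFT    46

/* Smallest shift whose table covers ram_bytes / 128, within the limits. */
static unsigned int hpt_shift_for_ram(uint64_t ram_bytes)
{
	uint64_t target = ram_bytes >> RAM_TO_HPT_SHIFT;
	unsigned int shift = HPT_MIN_SHIFT;

	while (shift < HPT_MAX_SHIFT && (UINT64_C(1) << shift) < target)
		shift++;
	return shift;
}
```

Under these assumptions a guest booted with 4 GiB gets shift 25 (32 MiB), while sizing for a 1 TiB maximum gives shift 33 (8 GiB), so nearly 8 GiB would sit idle if the guest never grows.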
This patch advertises the guest's support for HPT resizing via the
ibm,client-architecture-support OF interface. Obviously, the actual
encoding in the CAS vector is tentative until the extension is officially
incorporated into PAPR. For now we use bit 0 of (previously unused) byte 8
of option vector 5.
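The encoding can be read off the OV5_HPT_RESIZE constant added below: the high byte selects the byte within option vector 5 and the low byte is the bit mask, where "bit 0" in big-endian bit numbering is the most significant bit (0x80). The sketch below decodes the constants from prom.h shown in this patch; the OV5_INDX/OV5_FEAT definitions are written out here on the assumption that they split the value this way, and ov5_set/ov5_test are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Constants as they appear in this patch: 0xIIMM, byte index II, mask MM. */
#define OV5_TYPE1_AFFINITY 0x0580
#define OV5_PRRN           0x0540
#define OV5_HPT_RESIZE     0x0880

/* Assumed decoding, written out for the example. */
#define OV5_INDX(x) ((x) >> 8)   /* byte offset within vector 5 */
#define OV5_FEAT(x) ((x) & 0xff) /* bit mask within that byte */

/* Set a feature bit in a vector-5 byte array (hypothetical helper). */
static void ov5_set(uint8_t *vec, size_t len, unsigned int feature)
{
	size_t idx = OV5_INDX(feature);

	if (idx < len)
		vec[idx] |= OV5_FEAT(feature);
}

/* Non-zero if the feature bit is set (hypothetical helper). */
static int ov5_test(const uint8_t *vec, size_t len, unsigned int feature)
{
	size_t idx = OV5_INDX(feature);

	return idx < len && (vec[idx] & OV5_FEAT(feature)) != 0;
}
```

With this decoding, OV5_HPT_RESIZE lands in byte 8 with mask 0x80, matching the "bit 0 of byte 8" description, while OV5_TYPE1_AFFINITY and OV5_PRRN share byte 5.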
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
arch/powerpc/include/asm/prom.h | 1 +
arch/powerpc/kernel/prom_init.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7f436ba..7a57b77 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -151,6 +151,7 @@ struct of_drconf_cell {
#define OV5_XCMO 0x0440 /* Page Coalescing */
#define OV5_TYPE1_AFFINITY 0x0580 /* Type 1 NUMA affinity */
#define OV5_PRRN 0x0540 /* Platform Resource Reassignment */
+#define OV5_HPT_RESIZE 0x880 /* Hash Page Table resizing */
#define OV5_PFO_HW_RNG 0x0E80 /* PFO Random Number Generator */
#define OV5_PFO_HW_842 0x0E40 /* PFO Compression Accelerator */
#define OV5_PFO_HW_ENCR 0x0E20 /* PFO Encryption Accelerator */
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 92dea8d..d82b883 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -712,7 +712,7 @@ unsigned char ibm_architecture_vec[] = {
OV5_FEAT(OV5_TYPE1_AFFINITY) | OV5_FEAT(OV5_PRRN),
0,
0,
- 0,
+ OV5_FEAT(OV5_HPT_RESIZE),
/* WARNING: The offset of the "number of cores" field below
* must match by the macro below. Update the definition if
* the structure layout changes.
--
2.5.0