From: Nathan Lynch <nathanl@linux.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: tyreld@linux.ibm.com, ldufour@linux.ibm.com,
aneesh.kumar@linux.ibm.com, danielhb413@gmail.com
Subject: [PATCH v2 1/4] powerpc/pseries/cpuhp: cache node corrections
Date: Mon, 27 Sep 2021 15:19:30 -0500 [thread overview]
Message-ID: <20210927201933.76786-2-nathanl@linux.ibm.com> (raw)
In-Reply-To: <20210927201933.76786-1-nathanl@linux.ibm.com>
On pseries, cache nodes in the device tree can be added and removed by the
CPU DLPAR code as well as the partition migration (mobility) code. PowerVM
partitions in dedicated processor mode typically have L2 and L3 cache
nodes.
The CPU DLPAR code has the following shortcomings:
* Cache nodes returned as siblings of a new CPU node by
ibm,configure-connector are silently discarded; only the CPU node is
added to the device tree.
* Cache nodes which become unreferenced in the processor removal path are
not removed from the device tree. This can lead to duplicate nodes when
the post-migration device tree update code replaces cache nodes.
This is long-standing behavior. Presumably it has gone mostly unnoticed
because the two bugs have the property of obscuring each other in common
simple scenarios (e.g. remove a CPU and add it back). Likely you'd notice
only if you cared to inspect the device tree or the sysfs cacheinfo
information.
Booted with two processors:
$ pwd
/sys/firmware/devicetree/base/cpus
$ ls -1d */
l2-cache@2010/
l2-cache@2011/
l3-cache@3110/
l3-cache@3111/
PowerPC,POWER9@0/
PowerPC,POWER9@8/
$ lsprop */l2-cache
l2-cache@2010/l2-cache
00003110 (12560)
l2-cache@2011/l2-cache
00003111 (12561)
PowerPC,POWER9@0/l2-cache
00002010 (8208)
PowerPC,POWER9@8/l2-cache
00002011 (8209)
$ ls /sys/devices/system/cpu/cpu0/cache/
index0 index1 index2 index3
After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache
nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is
missing a cache level in its sched domain hierarchy:
$ ls -1d */
l2-cache@2010/
l2-cache@2011/
l3-cache@3110/
l3-cache@3111/
PowerPC,POWER9@0/
PowerPC,POWER9@10/
PowerPC,POWER9@8/
$ lsprop PowerPC\,POWER9@10/l2-cache
PowerPC,POWER9@10/l2-cache
00002012 (8210)
$ ls /sys/devices/system/cpu/cpu16/cache/
index0 index1
$ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name
/sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
/sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
/sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
/sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT
/sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE
/sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE
/sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT
/sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE
When removing PowerPC,POWER9@8, we see that its cache nodes are left
behind:
$ ls -1d */
l2-cache@2010/
l2-cache@2011/
l3-cache@3110/
l3-cache@3111/
PowerPC,POWER9@0/
When DLPAR is combined with VM migration, we can get duplicate nodes. E.g.
removing one processor, then migrating, adding a processor, and then
migrating again can result in warnings from the OF core during
post-migration device tree updates:
Duplicate name in cpus, renamed to "l2-cache@2011#1"
Duplicate name in cpus, renamed to "l3-cache@3111#1"
and nodes with duplicated phandles in the tree, making lookup behavior
unpredictable:
$ lsprop l[23]-cache@*/ibm,phandle
l2-cache@2010/ibm,phandle
00002010 (8208)
l2-cache@2011#1/ibm,phandle
00002011 (8209)
l2-cache@2011/ibm,phandle
00002011 (8209)
l3-cache@3110/ibm,phandle
00003110 (12560)
l3-cache@3111#1/ibm,phandle
00003111 (12561)
l3-cache@3111/ibm,phandle
00003111 (12561)
Address these issues by:
* Correctly processing siblings of the node returned from
dlpar_configure_connector().
* Removing cache nodes in the CPU remove path when it can be determined
that they are not associated with other CPUs or caches.
Use the of_changeset API in both cases, which allows us to keep the error
handling in this code from becoming more complex while ensuring that the
device tree cannot become inconsistent.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: ac71380 ("powerpc/pseries: Add CPU dlpar remove functionality")
Fixes: 90edf18 ("powerpc/pseries: Add CPU dlpar add functionality")
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
arch/powerpc/platforms/pseries/hotplug-cpu.c | 75 ++++++++++++++++++--
1 file changed, 71 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index d646c22e94ab..00ac7d7e63e5 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -521,6 +521,27 @@ static bool valid_cpu_drc_index(struct device_node *parent, u32 drc_index)
return found;
}
+static int pseries_cpuhp_attach_nodes(struct device_node *dn)
+{
+ struct of_changeset cs;
+ int ret;
+
+ /*
+ * This device node is unattached but may have siblings; open-code the
+ * traversal.
+ */
+ for (of_changeset_init(&cs); dn != NULL; dn = dn->sibling) {
+ ret = of_changeset_attach_node(&cs, dn);
+ if (ret)
+ goto out;
+ }
+
+ ret = of_changeset_apply(&cs);
+out:
+ of_changeset_destroy(&cs);
+ return ret;
+}
+
static ssize_t dlpar_cpu_add(u32 drc_index)
{
struct device_node *dn, *parent;
@@ -563,7 +584,7 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
return -EINVAL;
}
- rc = dlpar_attach_node(dn, parent);
+ rc = pseries_cpuhp_attach_nodes(dn);
/* Regardless we are done with parent now */
of_node_put(parent);
@@ -600,6 +621,53 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
return rc;
}
+static unsigned int pseries_cpuhp_cache_use_count(const struct device_node *cachedn)
+{
+ unsigned int use_count = 0;
+ struct device_node *dn;
+
+ WARN_ON(!of_node_is_type(cachedn, "cache"));
+
+ for_each_of_cpu_node(dn) {
+ if (of_find_next_cache_node(dn) == cachedn)
+ use_count++;
+ }
+
+ for_each_node_by_type(dn, "cache") {
+ if (of_find_next_cache_node(dn) == cachedn)
+ use_count++;
+ }
+
+ return use_count;
+}
+
+static int pseries_cpuhp_detach_nodes(struct device_node *cpudn)
+{
+ struct device_node *dn;
+ struct of_changeset cs;
+ int ret = 0;
+
+ of_changeset_init(&cs);
+ ret = of_changeset_detach_node(&cs, cpudn);
+ if (ret)
+ goto out;
+
+ dn = cpudn;
+ while ((dn = of_find_next_cache_node(dn))) {
+ if (pseries_cpuhp_cache_use_count(dn) > 1)
+ break;
+
+ ret = of_changeset_detach_node(&cs, dn);
+ if (ret)
+ goto out;
+ }
+
+ ret = of_changeset_apply(&cs);
+out:
+ of_changeset_destroy(&cs);
+ return ret;
+}
+
static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
{
int rc;
@@ -621,7 +689,7 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
return rc;
}
- rc = dlpar_detach_node(dn);
+ rc = pseries_cpuhp_detach_nodes(dn);
if (rc) {
int saved_rc = rc;
@@ -885,10 +953,9 @@ static int dlpar_cpu_add_by_count(u32 cpus_to_add)
int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
{
- u32 count, drc_index;
+ u32 drc_index;
int rc;
- count = hp_elog->_drc_u.drc_count;
drc_index = hp_elog->_drc_u.drc_index;
lock_device_hotplug();
--
2.31.1
next prev parent reply other threads:[~2021-09-27 20:20 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-27 20:19 [PATCH v2 0/4] CPU DLPAR/hotplug for v5.16 Nathan Lynch
2021-09-27 20:19 ` Nathan Lynch [this message]
2021-09-27 20:19 ` [PATCH v2 2/4] powerpc/cpuhp: BUG -> WARN conversion in offline path Nathan Lynch
2021-09-27 20:19 ` [PATCH v2 3/4] powerpc/pseries/cpuhp: delete add/remove_by_count code Nathan Lynch
2021-09-27 20:19 ` [PATCH v2 4/4] powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die Nathan Lynch
2021-09-29 0:14 ` Daniel Henrique Barboza
2021-09-29 5:45 ` Michael Ellerman
2021-09-29 12:06 ` Nathan Lynch
2021-10-11 12:34 ` Michael Ellerman
2021-10-11 12:06 ` [PATCH v2 0/4] CPU DLPAR/hotplug for v5.16 Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210927201933.76786-2-nathanl@linux.ibm.com \
--to=nathanl@linux.ibm.com \
--cc=aneesh.kumar@linux.ibm.com \
--cc=danielhb413@gmail.com \
--cc=ldufour@linux.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=tyreld@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).