* [PATCH 0/3] Unisolate LMBs DRC on removal error + cleanups
@ 2021-04-30 12:09 Daniel Henrique Barboza
2021-04-30 12:09 ` [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error Daniel Henrique Barboza
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Daniel Henrique Barboza @ 2021-04-30 12:09 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Daniel Henrique Barboza, david
Hi,
This is a follow-up of the work done in dlpar_cpu_remove() to
report CPU removal error by unisolating the DRC. This time I'm
doing it for LMBs. Patch 01 handles this.
Patches 2 and 3 are cleanups I consider worth posting.
Daniel Henrique Barboza (3):
powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error
hotplug-memory.c: enhance dlpar_memory_remove* LMB checks
pseries/hotplug-memory.c: adding dlpar_memory_remove_lmbs_common()
.../platforms/pseries/hotplug-memory.c | 162 ++++++++----------
1 file changed, 69 insertions(+), 93 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 10+ messages in thread* [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error 2021-04-30 12:09 [PATCH 0/3] Unisolate LMBs DRC on removal error + cleanups Daniel Henrique Barboza @ 2021-04-30 12:09 ` Daniel Henrique Barboza 2021-05-04 0:45 ` David Gibson 2021-04-30 12:09 ` [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks Daniel Henrique Barboza 2021-04-30 12:09 ` [PATCH 3/3] pseries/hotplug-memory.c: adding dlpar_memory_remove_lmbs_common() Daniel Henrique Barboza 2 siblings, 1 reply; 10+ messages in thread From: Daniel Henrique Barboza @ 2021-04-30 12:09 UTC (permalink / raw) To: linuxppc-dev; +Cc: Daniel Henrique Barboza, david As previously done in dlpar_cpu_remove() for CPUs, this patch changes dlpar_memory_remove_by_ic() to unisolate the LMB DRC when the LMB is failed to be removed. The hypervisor, seeing a LMB DRC that was supposed to be removed being unisolated instead, can do error recovery on its side. This change is done in dlpar_memory_remove_by_ic() only because, as of today, only QEMU is using this code path for error recovery (via the PSERIES_HP_ELOG_ID_DRC_IC event). phyp treats it as a no-op. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- arch/powerpc/platforms/pseries/hotplug-memory.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 8377f1f7c78e..bb98574a84a2 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -551,6 +551,13 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) if (!drmem_lmb_reserved(lmb)) continue; + /* + * Setting the isolation state of an UNISOLATED/CONFIGURED + * device to UNISOLATE is a no-op, but the hypervisor can + * use it as a hint that the LMB removal failed. + */ + dlpar_unisolate_drc(lmb->drc_index); + rc = dlpar_add_lmb(lmb); if (rc) pr_err("Failed to add LMB, drc index %x\n", -- 2.30.2 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error 2021-04-30 12:09 ` [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error Daniel Henrique Barboza @ 2021-05-04 0:45 ` David Gibson 0 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2021-05-04 0:45 UTC (permalink / raw) To: Daniel Henrique Barboza; +Cc: linuxppc-dev [-- Attachment #1: Type: text/plain, Size: 1811 bytes --] On Fri, Apr 30, 2021 at 09:09:15AM -0300, Daniel Henrique Barboza wrote: > As previously done in dlpar_cpu_remove() for CPUs, this patch changes > dlpar_memory_remove_by_ic() to unisolate the LMB DRC when the LMB is > failed to be removed. The hypervisor, seeing a LMB DRC that was supposed > to be removed being unisolated instead, can do error recovery on its > side. > > This change is done in dlpar_memory_remove_by_ic() only because, as of > today, only QEMU is using this code path for error recovery (via the > PSERIES_HP_ELOG_ID_DRC_IC event). phyp treats it as a no-op. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > arch/powerpc/platforms/pseries/hotplug-memory.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c > index 8377f1f7c78e..bb98574a84a2 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c > @@ -551,6 +551,13 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) > if (!drmem_lmb_reserved(lmb)) > continue; > > + /* > + * Setting the isolation state of an UNISOLATED/CONFIGURED > + * device to UNISOLATE is a no-op, but the hypervisor can > + * use it as a hint that the LMB removal failed. > + */ > + dlpar_unisolate_drc(lmb->drc_index); > + > rc = dlpar_add_lmb(lmb); > if (rc) > pr_err("Failed to add LMB, drc index %x\n", -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks 2021-04-30 12:09 [PATCH 0/3] Unisolate LMBs DRC on removal error + cleanups Daniel Henrique Barboza 2021-04-30 12:09 ` [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error Daniel Henrique Barboza @ 2021-04-30 12:09 ` Daniel Henrique Barboza 2021-05-04 1:02 ` David Gibson 2021-04-30 12:09 ` [PATCH 3/3] pseries/hotplug-memory.c: adding dlpar_memory_remove_lmbs_common() Daniel Henrique Barboza 2 siblings, 1 reply; 10+ messages in thread From: Daniel Henrique Barboza @ 2021-04-30 12:09 UTC (permalink / raw) To: linuxppc-dev; +Cc: Daniel Henrique Barboza, david dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed by checking !DRCONF_MEM_RESERVED, and in the following loop before dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being removed. The function will end up not removing all 'lmbs_to_remove' LMBs while also not reporting any errors. Comparing it to dlpar_memory_remove_by_count(), the validation is done via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump constraints. No additional check is made afterwards, and DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The function doesn't have the same 'check A for validation, then B for removal' issue as remove_by_ic(), but it's not checking if the LMB is reserved. There is no reason for these functions to validate the same operation in two different manners. This patch addresses that by changing lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a lmb is removable, making dlpar_memory_remove_by_count() take the reservation state into account when counting the LMBs. lmb_is_removable() is then used in the validation step of dlpar_memory_remove_by_ic(), which is already checking for both states but in different stages, to avoid counting a LMB that is not assigned as eligible for removal. We can then skip the check before dlpar_remove_lmb() since we're validating all LMBs beforehand. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index bb98574a84a2..4e6d162c3f1a 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np) static bool lmb_is_removable(struct drmem_lmb *lmb) { - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) + if ((lmb->flags & DRCONF_MEM_RESERVED) || + !(lmb->flags & DRCONF_MEM_ASSIGNED)) return false; #ifdef CONFIG_FA_DUMP @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) /* Validate that there are enough LMBs to satisfy the request */ for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { - if (lmb->flags & DRCONF_MEM_RESERVED) + if (!lmb_is_removable(lmb)) break; lmbs_available++; @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) return -EINVAL; for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) - continue; - rc = dlpar_remove_lmb(lmb); if (rc) break; -- 2.30.2 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks 2021-04-30 12:09 ` [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks Daniel Henrique Barboza @ 2021-05-04 1:02 ` David Gibson 2021-05-07 16:36 ` Daniel Henrique Barboza 2021-05-12 20:35 ` Daniel Henrique Barboza 0 siblings, 2 replies; 10+ messages in thread From: David Gibson @ 2021-05-04 1:02 UTC (permalink / raw) To: Daniel Henrique Barboza; +Cc: linuxppc-dev [-- Attachment #1: Type: text/plain, Size: 4217 bytes --] On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote: > dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed > by checking !DRCONF_MEM_RESERVED, and in the following loop before > dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before > removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and > !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being > removed. The function will end up not removing all 'lmbs_to_remove' > LMBs while also not reporting any errors. > > Comparing it to dlpar_memory_remove_by_count(), the validation is done > via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump > constraints. No additional check is made afterwards, and > DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The > function doesn't have the same 'check A for validation, then B for > removal' issue as remove_by_ic(), but it's not checking if the LMB is > reserved. > > There is no reason for these functions to validate the same operation in > two different manners. Actually, I think there is: remove_by_ic() is handling a request to remove a specific range of LMBs. If any are reserved, they can't be removed and so this needs to fail. But if they are !ASSIGNED, that essentially means they're *already* removed (or never added), so "removing" them is, correctly, a no-op. remove_by_count(), in contrast, is being asked to remove a fixed number of LMBs from wherever they can be found, and for that it needs to find LMBs that haven't already been removed. Basically remove_by_ic() is an absolute request: "make this set of LMBs be not-plugged", whereas remove_by_count() is a relative request "make N less LMBs be plugged". So I think remove_by_ic()s existing handling is correct. I'm less sure if remove_by_count() ignoring RESERVED is correct - I couldn't quickly find under what circumstances RESERVED gets set. > This patch addresses that by changing > lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a > lmb is removable, making dlpar_memory_remove_by_count() take the > reservation state into account when counting the LMBs. > lmb_is_removable() is then used in the validation step of > dlpar_memory_remove_by_ic(), which is already checking for both states > but in different stages, to avoid counting a LMB that is not assigned as > eligible for removal. We can then skip the check before > dlpar_remove_lmb() since we're validating all LMBs beforehand. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > --- > arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c > index bb98574a84a2..4e6d162c3f1a 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c > @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np) > > static bool lmb_is_removable(struct drmem_lmb *lmb) > { > - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) > + if ((lmb->flags & DRCONF_MEM_RESERVED) || > + !(lmb->flags & DRCONF_MEM_ASSIGNED)) > return false; > > #ifdef CONFIG_FA_DUMP > @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) > > /* Validate that there are enough LMBs to satisfy the request */ > for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { > - if (lmb->flags & DRCONF_MEM_RESERVED) > + if (!lmb_is_removable(lmb)) > break; > > lmbs_available++; > @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) > return -EINVAL; > > for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { > - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) > - continue; > - > rc = dlpar_remove_lmb(lmb); > if (rc) > break; -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks 2021-05-04 1:02 ` David Gibson @ 2021-05-07 16:36 ` Daniel Henrique Barboza 2021-05-09 8:43 ` David Gibson 2021-05-12 20:35 ` Daniel Henrique Barboza 1 sibling, 1 reply; 10+ messages in thread From: Daniel Henrique Barboza @ 2021-05-07 16:36 UTC (permalink / raw) To: David Gibson; +Cc: linuxppc-dev On 5/3/21 10:02 PM, David Gibson wrote: > On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote: >> dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed >> by checking !DRCONF_MEM_RESERVED, and in the following loop before >> dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before >> removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and >> !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being >> removed. The function will end up not removing all 'lmbs_to_remove' >> LMBs while also not reporting any errors. >> >> Comparing it to dlpar_memory_remove_by_count(), the validation is done >> via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump >> constraints. No additional check is made afterwards, and >> DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The >> function doesn't have the same 'check A for validation, then B for >> removal' issue as remove_by_ic(), but it's not checking if the LMB is >> reserved. >> >> There is no reason for these functions to validate the same operation in >> two different manners. > > Actually, I think there is: remove_by_ic() is handling a request to > remove a specific range of LMBs. If any are reserved, they can't be > removed and so this needs to fail. But if they are !ASSIGNED, that > essentially means they're *already* removed (or never added), so > "removing" them is, correctly, a no-op. I guess that makes sense. Although I am not aware of any situation, at least thinking about how QEMU adds/removes LMBs, where some LMBs would be removed 'ad-hoc' in the middle of a LMB range that maps to a QEMU DIMM, I can't say that this wouldn't never happen either. It is sensible to make remove_by_ic() resilient to this situation. I'll re-send this patch just with the remove_by_count() change. Thanks, Daniel > > remove_by_count(), in contrast, is being asked to remove a fixed > number of LMBs from wherever they can be found, and for that it needs > to find LMBs that haven't already been removed. > > Basically remove_by_ic() is an absolute request: "make this set of > LMBs be not-plugged", whereas remove_by_count() is a relative request > "make N less LMBs be plugged". > > > So I think remove_by_ic()s existing handling is correct. I'm less > sure if remove_by_count() ignoring RESERVED is correct - I couldn't > quickly find under what circumstances RESERVED gets set. > > >> This patch addresses that by changing >> lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a >> lmb is removable, making dlpar_memory_remove_by_count() take the >> reservation state into account when counting the LMBs. >> lmb_is_removable() is then used in the validation step of >> dlpar_memory_remove_by_ic(), which is already checking for both states >> but in different stages, to avoid counting a LMB that is not assigned as >> eligible for removal. We can then skip the check before >> dlpar_remove_lmb() since we're validating all LMBs beforehand. >> >> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >> --- >> arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++----- >> 1 file changed, 3 insertions(+), 5 deletions(-) >> >> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c >> index bb98574a84a2..4e6d162c3f1a 100644 >> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c >> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c >> @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np) >> >> static bool lmb_is_removable(struct drmem_lmb *lmb) >> { >> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) >> + if ((lmb->flags & DRCONF_MEM_RESERVED) || >> + !(lmb->flags & DRCONF_MEM_ASSIGNED)) >> return false; >> >> #ifdef CONFIG_FA_DUMP >> @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) >> >> /* Validate that there are enough LMBs to satisfy the request */ >> for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { >> - if (lmb->flags & DRCONF_MEM_RESERVED) >> + if (!lmb_is_removable(lmb)) >> break; >> >> lmbs_available++; >> @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) >> return -EINVAL; >> >> for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { >> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) >> - continue; >> - >> rc = dlpar_remove_lmb(lmb); >> if (rc) >> break; > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks 2021-05-07 16:36 ` Daniel Henrique Barboza @ 2021-05-09 8:43 ` David Gibson 0 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2021-05-09 8:43 UTC (permalink / raw) To: Daniel Henrique Barboza; +Cc: linuxppc-dev [-- Attachment #1: Type: text/plain, Size: 5383 bytes --] On Fri, May 07, 2021 at 01:36:06PM -0300, Daniel Henrique Barboza wrote: > > > On 5/3/21 10:02 PM, David Gibson wrote: > > On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote: > > > dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed > > > by checking !DRCONF_MEM_RESERVED, and in the following loop before > > > dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before > > > removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and > > > !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being > > > removed. The function will end up not removing all 'lmbs_to_remove' > > > LMBs while also not reporting any errors. > > > > > > Comparing it to dlpar_memory_remove_by_count(), the validation is done > > > via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump > > > constraints. No additional check is made afterwards, and > > > DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The > > > function doesn't have the same 'check A for validation, then B for > > > removal' issue as remove_by_ic(), but it's not checking if the LMB is > > > reserved. > > > > > > There is no reason for these functions to validate the same operation in > > > two different manners. > > > > Actually, I think there is: remove_by_ic() is handling a request to > > remove a specific range of LMBs. If any are reserved, they can't be > > removed and so this needs to fail. But if they are !ASSIGNED, that > > essentially means they're *already* removed (or never added), so > > "removing" them is, correctly, a no-op. > > I guess that makes sense. Although I am not aware of any situation, at least > thinking about how QEMU adds/removes LMBs, where some LMBs would be removed > 'ad-hoc' in the middle of a LMB range that maps to a QEMU DIMM, I can't say > that this wouldn't never happen either. Right. I believe a user could explicitly offline LMBs in the middle of a DIMM. There's not much reason to do so, but it's possible. There might also be situations involving memory errors where individual LMBs could get offlined. > It is sensible to make remove_by_ic() > resilient to this situation. > > I'll re-send this patch just with the remove_by_count() change. > > > Thanks, > > > Daniel > > > > > remove_by_count(), in contrast, is being asked to remove a fixed > > number of LMBs from wherever they can be found, and for that it needs > > to find LMBs that haven't already been removed. > > > > Basically remove_by_ic() is an absolute request: "make this set of > > LMBs be not-plugged", whereas remove_by_count() is a relative request > > "make N less LMBs be plugged". > > > > > > So I think remove_by_ic()s existing handling is correct. I'm less > > sure if remove_by_count() ignoring RESERVED is correct - I couldn't > > quickly find under what circumstances RESERVED gets set. > > > > > > > This patch addresses that by changing > > > lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a > > > lmb is removable, making dlpar_memory_remove_by_count() take the > > > reservation state into account when counting the LMBs. > > > lmb_is_removable() is then used in the validation step of > > > dlpar_memory_remove_by_ic(), which is already checking for both states > > > but in different stages, to avoid counting a LMB that is not assigned as > > > eligible for removal. We can then skip the check before > > > dlpar_remove_lmb() since we're validating all LMBs beforehand. > > > > > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > > > --- > > > arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++----- > > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c > > > index bb98574a84a2..4e6d162c3f1a 100644 > > > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c > > > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c > > > @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np) > > > static bool lmb_is_removable(struct drmem_lmb *lmb) > > > { > > > - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) > > > + if ((lmb->flags & DRCONF_MEM_RESERVED) || > > > + !(lmb->flags & DRCONF_MEM_ASSIGNED)) > > > return false; > > > #ifdef CONFIG_FA_DUMP > > > @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) > > > /* Validate that there are enough LMBs to satisfy the request */ > > > for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { > > > - if (lmb->flags & DRCONF_MEM_RESERVED) > > > + if (!lmb_is_removable(lmb)) > > > break; > > > lmbs_available++; > > > @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) > > > return -EINVAL; > > > for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { > > > - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) > > > - continue; > > > - > > > rc = dlpar_remove_lmb(lmb); > > > if (rc) > > > break; > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks 2021-05-04 1:02 ` David Gibson 2021-05-07 16:36 ` Daniel Henrique Barboza @ 2021-05-12 20:35 ` Daniel Henrique Barboza 2021-05-13 5:22 ` David Gibson 1 sibling, 1 reply; 10+ messages in thread From: Daniel Henrique Barboza @ 2021-05-12 20:35 UTC (permalink / raw) To: David Gibson; +Cc: linuxppc-dev On 5/3/21 10:02 PM, David Gibson wrote: > On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote: >> dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed >> by checking !DRCONF_MEM_RESERVED, and in the following loop before >> dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before >> removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and >> !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being >> removed. The function will end up not removing all 'lmbs_to_remove' >> LMBs while also not reporting any errors. >> >> Comparing it to dlpar_memory_remove_by_count(), the validation is done >> via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump >> constraints. No additional check is made afterwards, and >> DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The >> function doesn't have the same 'check A for validation, then B for >> removal' issue as remove_by_ic(), but it's not checking if the LMB is >> reserved. >> >> There is no reason for these functions to validate the same operation in >> two different manners. > > Actually, I think there is: remove_by_ic() is handling a request to > remove a specific range of LMBs. If any are reserved, they can't be > removed and so this needs to fail. But if they are !ASSIGNED, that > essentially means they're *already* removed (or never added), so > "removing" them is, correctly, a no-op. > > remove_by_count(), in contrast, is being asked to remove a fixed > number of LMBs from wherever they can be found, and for that it needs > to find LMBs that haven't already been removed. > > Basically remove_by_ic() is an absolute request: "make this set of > LMBs be not-plugged", whereas remove_by_count() is a relative request > "make N less LMBs be plugged". > > > So I think remove_by_ic()s existing handling is correct. I'm less > sure if remove_by_count() ignoring RESERVED is correct - I couldn't > quickly find under what circumstances RESERVED gets set. RESERVED is never set by the kernel. It is written in the DT by the firmware/hypervisor and the kernel just checks its value. QEMU sets it in spapr_dt_dynamic_memory() with the following comment: /* * LMB information for RMA, boot time RAM and gap b/n RAM and * device memory region -- all these are marked as reserved * and as having no valid DRC. */ dynamic_memory[0] = cpu_to_be32(addr >> 32); dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff); dynamic_memory[2] = cpu_to_be32(0); dynamic_memory[3] = cpu_to_be32(0); /* reserved */ dynamic_memory[4] = cpu_to_be32(-1); dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED | SPAPR_LMB_FLAGS_DRC_INVALID); The flag is formally described in LOPAR section 4.2.8, "Reserved Memory": "Memory nodes marked with the special value of the “status” property of “reserved” is not to be used or altered by the base OS." This makes me confident that we should check DRCONF_MEM_RESERVED in remove_by_count() as well, since phyp needs do adhere to these semantics and shouldn't be able to remove a LMB marked as RESERVED. Thanks, Daniel > > >> This patch addresses that by changing >> lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a >> lmb is removable, making dlpar_memory_remove_by_count() take the >> reservation state into account when counting the LMBs. >> lmb_is_removable() is then used in the validation step of >> dlpar_memory_remove_by_ic(), which is already checking for both states >> but in different stages, to avoid counting a LMB that is not assigned as >> eligible for removal. We can then skip the check before >> dlpar_remove_lmb() since we're validating all LMBs beforehand. >> >> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >> --- >> arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++----- >> 1 file changed, 3 insertions(+), 5 deletions(-) >> >> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c >> index bb98574a84a2..4e6d162c3f1a 100644 >> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c >> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c >> @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np) >> >> static bool lmb_is_removable(struct drmem_lmb *lmb) >> { >> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) >> + if ((lmb->flags & DRCONF_MEM_RESERVED) || >> + !(lmb->flags & DRCONF_MEM_ASSIGNED)) >> return false; >> >> #ifdef CONFIG_FA_DUMP >> @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) >> >> /* Validate that there are enough LMBs to satisfy the request */ >> for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { >> - if (lmb->flags & DRCONF_MEM_RESERVED) >> + if (!lmb_is_removable(lmb)) >> break; >> >> lmbs_available++; >> @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) >> return -EINVAL; >> >> for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { >> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED)) >> - continue; >> - >> rc = dlpar_remove_lmb(lmb); >> if (rc) >> break; > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks 2021-05-12 20:35 ` Daniel Henrique Barboza @ 2021-05-13 5:22 ` David Gibson 0 siblings, 0 replies; 10+ messages in thread From: David Gibson @ 2021-05-13 5:22 UTC (permalink / raw) To: Daniel Henrique Barboza; +Cc: linuxppc-dev [-- Attachment #1: Type: text/plain, Size: 3982 bytes --] On Wed, May 12, 2021 at 05:35:39PM -0300, Daniel Henrique Barboza wrote: > > On 5/3/21 10:02 PM, David Gibson wrote: > > On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote: > > > dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed > > > by checking !DRCONF_MEM_RESERVED, and in the following loop before > > > dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before > > > removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and > > > !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being > > > removed. The function will end up not removing all 'lmbs_to_remove' > > > LMBs while also not reporting any errors. > > > > > > Comparing it to dlpar_memory_remove_by_count(), the validation is done > > > via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump > > > constraints. No additional check is made afterwards, and > > > DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The > > > function doesn't have the same 'check A for validation, then B for > > > removal' issue as remove_by_ic(), but it's not checking if the LMB is > > > reserved. > > > > > > There is no reason for these functions to validate the same operation in > > > two different manners. > > > > Actually, I think there is: remove_by_ic() is handling a request to > > remove a specific range of LMBs. If any are reserved, they can't be > > removed and so this needs to fail. But if they are !ASSIGNED, that > > essentially means they're *already* removed (or never added), so > > "removing" them is, correctly, a no-op. > > > > remove_by_count(), in contrast, is being asked to remove a fixed > > number of LMBs from wherever they can be found, and for that it needs > > to find LMBs that haven't already been removed. > > > > Basically remove_by_ic() is an absolute request: "make this set of > > LMBs be not-plugged", whereas remove_by_count() is a relative request > > "make N less LMBs be plugged". > > > > > > So I think remove_by_ic()s existing handling is correct. I'm less > > sure if remove_by_count() ignoring RESERVED is correct - I couldn't > > quickly find under what circumstances RESERVED gets set. > > RESERVED is never set by the kernel. It is written in the DT by the > firmware/hypervisor and the kernel just checks its value. QEMU sets it in > spapr_dt_dynamic_memory() with the following comment: > > > /* > * LMB information for RMA, boot time RAM and gap b/n RAM and > * device memory region -- all these are marked as reserved > * and as having no valid DRC. > */ > dynamic_memory[0] = cpu_to_be32(addr >> 32); > dynamic_memory[1] = cpu_to_be32(addr & 0xffffffff); > dynamic_memory[2] = cpu_to_be32(0); > dynamic_memory[3] = cpu_to_be32(0); /* reserved */ > dynamic_memory[4] = cpu_to_be32(-1); > dynamic_memory[5] = cpu_to_be32(SPAPR_LMB_FLAGS_RESERVED | > SPAPR_LMB_FLAGS_DRC_INVALID); > > > The flag is formally described in LOPAR section 4.2.8, "Reserved Memory": > > "Memory nodes marked with the special value of the “status” property of > “reserved” is not to be used or altered by the base OS." > > > This makes me confident that we should check DRCONF_MEM_RESERVED in > remove_by_count() as well, since phyp needs do adhere to these semantics and > shouldn't be able to remove a LMB marked as RESERVED. Right. I doubt it would have caused a problem in practice, because I'm pretty sure we should never get an LMB which is RESERVED && ASSIGNED, but it's probably safer to make it explicit. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] pseries/hotplug-memory.c: adding dlpar_memory_remove_lmbs_common() 2021-04-30 12:09 [PATCH 0/3] Unisolate LMBs DRC on removal error + cleanups Daniel Henrique Barboza 2021-04-30 12:09 ` [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error Daniel Henrique Barboza 2021-04-30 12:09 ` [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks Daniel Henrique Barboza @ 2021-04-30 12:09 ` Daniel Henrique Barboza 2 siblings, 0 replies; 10+ messages in thread From: Daniel Henrique Barboza @ 2021-04-30 12:09 UTC (permalink / raw) To: linuxppc-dev; +Cc: Daniel Henrique Barboza, david One difference between dlpar_memory_remove_by_count() and dlpar_memory_remove_by_ic() is that the latter, added in commit 753843471cbb, removes the LMBs in a contiguous block. This was done because QEMU works with DIMMs, which is nothing more than a set of LMBs, that must be added or removed together. Failing to remove one LMB must fail the removal of the entire set of LMBs. Another difference is that dlpar_memory_remove_by_ic() is going to set the LMB DRC to unisolate in case of a removal error, which is a no-op for the hypervisors that doesn't care about this error handling knob, and could be called by remove_by_count() without issues. Aside from that, the logic is the same for both functions, and yet we keep them separated and having to duplicate LMB removal logic in both. This patch introduces a helper called dlpar_memory_remove_lmbs_common() to be used by both functions. The helper handles the case of block removal of remove_by_ic() by failing earlier in the validation and removal steps if the helper was called by remove_by_ic() (i.e. it was called with a drc_index) while preserving the more relaxed behavior of remove_by_count() if drc_index is 0. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- .../platforms/pseries/hotplug-memory.c | 163 +++++++----------- 1 file changed, 67 insertions(+), 96 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 4e6d162c3f1a..a031993725ca 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -399,25 +399,43 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb) return 0; } -static int dlpar_memory_remove_by_count(u32 lmbs_to_remove) +static int dlpar_memory_remove_lmbs_common(u32 lmbs_to_remove, u32 drc_index) { - struct drmem_lmb *lmb; - int lmbs_removed = 0; + struct drmem_lmb *lmb, *start_lmb, *end_lmb; int lmbs_available = 0; + int lmbs_removed = 0; int rc; - - pr_info("Attempting to hot-remove %d LMB(s)\n", lmbs_to_remove); + /* + * dlpar_memory_remove_by_ic() wants to remove all + * 'lmbs_to_remove' LMBs, starting from drc_index, in a + * contiguous block. + */ + bool block_removal; if (lmbs_to_remove == 0) return -EINVAL; + block_removal = drc_index != 0; + + if (block_removal) { + rc = get_lmb_range(drc_index, lmbs_to_remove, &start_lmb, &end_lmb); + if (rc) + return -EINVAL; + } else { + start_lmb = &drmem_info->lmbs[0]; + end_lmb = &drmem_info->lmbs[drmem_info->n_lmbs]; + } + /* Validate that there are enough LMBs to satisfy the request */ - for_each_drmem_lmb(lmb) { - if (lmb_is_removable(lmb)) + for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { + if (lmb_is_removable(lmb)) { lmbs_available++; - if (lmbs_available == lmbs_to_remove) + if (lmbs_available == lmbs_to_remove) + break; + } else if (block_removal) { break; + } } if (lmbs_available < lmbs_to_remove) { @@ -426,28 +444,40 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove) return -EINVAL; } - for_each_drmem_lmb(lmb) { + for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { rc = dlpar_remove_lmb(lmb); - if (rc) - continue; - /* Mark this lmb so we can add it later if all of the - * requested LMBs cannot be removed. - */ - drmem_mark_lmb_reserved(lmb); + if (!rc) { + /* Mark this lmb so we can add it later if all of the + * requested LMBs cannot be removed. + */ + drmem_mark_lmb_reserved(lmb); - lmbs_removed++; - if (lmbs_removed == lmbs_to_remove) + lmbs_removed++; + if (lmbs_removed == lmbs_to_remove) + break; + } else if (block_removal) { break; + } } if (lmbs_removed != lmbs_to_remove) { - pr_err("Memory hot-remove failed, adding LMB's back\n"); + if (block_removal) + pr_err("Memory indexed-count-remove failed, adding any removed LMBs\n"); + else + pr_err("Memory hot-remove failed, adding LMB's back\n"); - for_each_drmem_lmb(lmb) { + for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { if (!drmem_lmb_reserved(lmb)) continue; + /* + * Setting the isolation state of an UNISOLATED/CONFIGURED + * device to UNISOLATE is a no-op, but the hypervisor can + * use it as a hint that the LMB removal failed. + */ + dlpar_unisolate_drc(lmb->drc_index); + rc = dlpar_add_lmb(lmb); if (rc) pr_err("Failed to add LMB back, drc index %x\n", @@ -458,13 +488,13 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove) rc = -EINVAL; } else { - for_each_drmem_lmb(lmb) { + for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { if (!drmem_lmb_reserved(lmb)) continue; dlpar_release_drc(lmb->drc_index); - pr_info("Memory at %llx was hot-removed\n", - lmb->base_addr); + pr_info("Memory at %llx (drc index %x) was hot-removed\n", + lmb->base_addr, lmb->drc_index); drmem_remove_lmb_reservation(lmb); } @@ -474,6 +504,21 @@ static int dlpar_memory_remove_by_count(u32 lmbs_to_remove) return rc; } +static int dlpar_memory_remove_by_count(u32 lmbs_to_remove) +{ + pr_info("Attempting to hot-remove %d LMB(s)\n", lmbs_to_remove); + + return dlpar_memory_remove_lmbs_common(lmbs_to_remove, 0); +} + +static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) +{ + pr_info("Attempting to hot-remove %u LMB(s) at %x\n", + lmbs_to_remove, drc_index); + + return dlpar_memory_remove_lmbs_common(lmbs_to_remove, drc_index); +} + static int dlpar_memory_remove_by_index(u32 drc_index) { struct drmem_lmb *lmb; @@ -506,80 +551,6 @@ static int dlpar_memory_remove_by_index(u32 drc_index) return rc; } -static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index) -{ - struct drmem_lmb *lmb, *start_lmb, *end_lmb; - int lmbs_available = 0; - int rc; - - pr_info("Attempting to hot-remove %u LMB(s) at %x\n", - lmbs_to_remove, drc_index); - - if (lmbs_to_remove == 0) - return -EINVAL; - - rc = get_lmb_range(drc_index, lmbs_to_remove, &start_lmb, &end_lmb); - if (rc) - return -EINVAL; - - /* Validate that there are enough LMBs to satisfy the request */ - for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { - if (!lmb_is_removable(lmb)) - break; - - lmbs_available++; - } - - if (lmbs_available < lmbs_to_remove) - return -EINVAL; - - for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { - rc = dlpar_remove_lmb(lmb); - if (rc) - break; - - drmem_mark_lmb_reserved(lmb); - } - - if (rc) { - pr_err("Memory indexed-count-remove failed, adding any removed LMBs\n"); - - - for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { - if (!drmem_lmb_reserved(lmb)) - continue; - - /* - * Setting the isolation state of an UNISOLATED/CONFIGURED - * device to UNISOLATE is a no-op, but the hypervisor can - * use it as a hint that the LMB removal failed. - */ - dlpar_unisolate_drc(lmb->drc_index); - - rc = dlpar_add_lmb(lmb); - if (rc) - pr_err("Failed to add LMB, drc index %x\n", - lmb->drc_index); - - drmem_remove_lmb_reservation(lmb); - } - rc = -EINVAL; - } else { - for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { - if (!drmem_lmb_reserved(lmb)) - continue; - - dlpar_release_drc(lmb->drc_index); - pr_info("Memory at %llx (drc index %x) was hot-removed\n", - lmb->base_addr, lmb->drc_index); - - drmem_remove_lmb_reservation(lmb); - } - } - - return rc; -} - #else static inline int pseries_remove_memblock(unsigned long base, unsigned long memblock_size) -- 2.30.2 ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-05-13 5:23 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-04-30 12:09 [PATCH 0/3] Unisolate LMBs DRC on removal error + cleanups Daniel Henrique Barboza 2021-04-30 12:09 ` [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error Daniel Henrique Barboza 2021-05-04 0:45 ` David Gibson 2021-04-30 12:09 ` [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks Daniel Henrique Barboza 2021-05-04 1:02 ` David Gibson 2021-05-07 16:36 ` Daniel Henrique Barboza 2021-05-09 8:43 ` David Gibson 2021-05-12 20:35 ` Daniel Henrique Barboza 2021-05-13 5:22 ` David Gibson 2021-04-30 12:09 ` [PATCH 3/3] pseries/hotplug-memory.c: adding dlpar_memory_remove_lmbs_common() Daniel Henrique Barboza
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).