* [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-22 1:52 ` Pingfan Liu
0 siblings, 0 replies; 20+ messages in thread
From: Pingfan Liu @ 2020-07-22 1:52 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nathan Lynch, kexec, Hari Bathini, Pingfan Liu
A bug is observed on pseries by taking the following steps on rhel:
-1. drmgr -c mem -r -q 5
-2. echo c > /proc/sysrq-trigger
And then, the failure looks like:
kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
[ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
[ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
[ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
[ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
[ 44.338548] Core dump to |/bin/false pipe failed
/lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
kdump: saving vmcore failed
* Root cause *
After analyzing, it turns out that in the current implementation,
when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
the code __remove_memory() comes before drmem_update_dt().
So in kdump kernel, when read_from_oldmem() resorts to
pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
can be observed "Bus error"
From a viewpoint of listener and publisher, the publisher notifies the
listener before data is ready. This introduces a problem where udev
launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
updating. And in capture kernel, makedumpfile will access the memory based
on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
* Fix *
In order to fix this issue, update dt before __remove_memory(), and
accordingly the same rule in hot-add path.
This will introduce extra dt updating payload for each involved lmb when hotplug.
But it should be fine since drmem_update_dt() is memory based operation and
hotplug is not a hot path.
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: kexec@lists.infradead.org
---
v2 -> v3: rebase onto ppc next-test branch
---
arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 1a3ac3b..def8cb3f 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
invalidate_lmb_associativity_index(lmb);
lmb_clear_nid(lmb);
lmb->flags &= ~DRCONF_MEM_ASSIGNED;
+ drmem_update_dt();
__remove_memory(nid, base_addr, block_sz);
@@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
lmb_set_nid(lmb);
lmb->flags |= DRCONF_MEM_ASSIGNED;
+ drmem_update_dt();
block_sz = memory_block_size_bytes();
@@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
invalidate_lmb_associativity_index(lmb);
lmb_clear_nid(lmb);
lmb->flags &= ~DRCONF_MEM_ASSIGNED;
+ drmem_update_dt();
__remove_memory(nid, base_addr, block_sz);
}
@@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
break;
}
- if (!rc)
- rc = drmem_update_dt();
-
unlock_device_hotplug();
return rc;
}
--
2.7.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
2020-07-22 1:52 ` Pingfan Liu
@ 2020-07-22 4:57 ` Michael Ellerman
-1 siblings, 0 replies; 20+ messages in thread
From: Michael Ellerman @ 2020-07-22 4:57 UTC (permalink / raw)
To: Pingfan Liu, linuxppc-dev; +Cc: Nathan Lynch, kexec, Hari Bathini, Pingfan Liu
Pingfan Liu <kernelfans@gmail.com> writes:
> A bug is observed on pseries by taking the following steps on rhel:
^
RHEL
I assume it happens on mainline too?
> -1. drmgr -c mem -r -q 5
> -2. echo c > /proc/sysrq-trigger
>
> And then, the failure looks like:
> kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
> [ 44.338548] Core dump to |/bin/false pipe failed
> /lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> kdump: saving vmcore failed
>
> * Root cause *
> After analyzing, it turns out that in the current implementation,
> when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
> the code __remove_memory() comes before drmem_update_dt().
> So in kdump kernel, when read_from_oldmem() resorts to
> pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
> non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
> can be observed "Bus error"
>
> From a viewpoint of listener and publisher, the publisher notifies the
> listener before data is ready. This introduces a problem where udev
> launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> updating. And in capture kernel, makedumpfile will access the memory based
> on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
>
> * Fix *
> In order to fix this issue, update dt before __remove_memory(), and
> accordingly the same rule in hot-add path.
>
> This will introduce extra dt updating payload for each involved lmb when hotplug.
> But it should be fine since drmem_update_dt() is memory based operation and
> hotplug is not a hot path.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> To: linuxppc-dev@lists.ozlabs.org
> Cc: kexec@lists.infradead.org
> ---
> v2 -> v3: rebase onto ppc next-test branch
> ---
> arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 1a3ac3b..def8cb3f 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
> invalidate_lmb_associativity_index(lmb);
> lmb_clear_nid(lmb);
> lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> + drmem_update_dt();
No error checking?
> __remove_memory(nid, base_addr, block_sz);
>
> @@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>
> lmb_set_nid(lmb);
> lmb->flags |= DRCONF_MEM_ASSIGNED;
> + drmem_update_dt();
And here ..
>
> block_sz = memory_block_size_bytes();
>
> @@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> invalidate_lmb_associativity_index(lmb);
> lmb_clear_nid(lmb);
> lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> + drmem_update_dt();
And here ..
> __remove_memory(nid, base_addr, block_sz);
> }
> @@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
> break;
> }
>
> - if (!rc)
> - rc = drmem_update_dt();
> -
> unlock_device_hotplug();
> return rc;
Whereas previously we did check it.
cheers
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-22 4:57 ` Michael Ellerman
0 siblings, 0 replies; 20+ messages in thread
From: Michael Ellerman @ 2020-07-22 4:57 UTC (permalink / raw)
To: Pingfan Liu, linuxppc-dev; +Cc: Nathan Lynch, kexec, Hari Bathini, Pingfan Liu
Pingfan Liu <kernelfans@gmail.com> writes:
> A bug is observed on pseries by taking the following steps on rhel:
^
RHEL
I assume it happens on mainline too?
> -1. drmgr -c mem -r -q 5
> -2. echo c > /proc/sysrq-trigger
>
> And then, the failure looks like:
> kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
> [ 44.338548] Core dump to |/bin/false pipe failed
> /lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> kdump: saving vmcore failed
>
> * Root cause *
> After analyzing, it turns out that in the current implementation,
> when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
> the code __remove_memory() comes before drmem_update_dt().
> So in kdump kernel, when read_from_oldmem() resorts to
> pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
> non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
> can be observed "Bus error"
>
> From a viewpoint of listener and publisher, the publisher notifies the
> listener before data is ready. This introduces a problem where udev
> launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> updating. And in capture kernel, makedumpfile will access the memory based
> on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
>
> * Fix *
> In order to fix this issue, update dt before __remove_memory(), and
> accordingly the same rule in hot-add path.
>
> This will introduce extra dt updating payload for each involved lmb when hotplug.
> But it should be fine since drmem_update_dt() is memory based operation and
> hotplug is not a hot path.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> To: linuxppc-dev@lists.ozlabs.org
> Cc: kexec@lists.infradead.org
> ---
> v2 -> v3: rebase onto ppc next-test branch
> ---
> arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 1a3ac3b..def8cb3f 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
> invalidate_lmb_associativity_index(lmb);
> lmb_clear_nid(lmb);
> lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> + drmem_update_dt();
No error checking?
> __remove_memory(nid, base_addr, block_sz);
>
> @@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>
> lmb_set_nid(lmb);
> lmb->flags |= DRCONF_MEM_ASSIGNED;
> + drmem_update_dt();
And here ..
>
> block_sz = memory_block_size_bytes();
>
> @@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> invalidate_lmb_associativity_index(lmb);
> lmb_clear_nid(lmb);
> lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> + drmem_update_dt();
And here ..
> __remove_memory(nid, base_addr, block_sz);
> }
> @@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
> break;
> }
>
> - if (!rc)
> - rc = drmem_update_dt();
> -
> unlock_device_hotplug();
> return rc;
Whereas previously we did check it.
cheers
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
2020-07-22 4:57 ` Michael Ellerman
@ 2020-07-23 2:27 ` Pingfan Liu
-1 siblings, 0 replies; 20+ messages in thread
From: Pingfan Liu @ 2020-07-23 2:27 UTC (permalink / raw)
To: Michael Ellerman
Cc: Nathan Lynch, Kexec Mailing List, linuxppc-dev, Hari Bathini
On Wed, Jul 22, 2020 at 12:57 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Pingfan Liu <kernelfans@gmail.com> writes:
> > A bug is observed on pseries by taking the following steps on rhel:
> ^
> RHEL
>
> I assume it happens on mainline too?
Yes, it does.
>
[...]
> > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> > index 1a3ac3b..def8cb3f 100644
> > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> > @@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
> > invalidate_lmb_associativity_index(lmb);
> > lmb_clear_nid(lmb);
> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> > + drmem_update_dt();
>
> No error checking?
Hmm, here should be a more careful design. Please see the comment at the end.
>
> > __remove_memory(nid, base_addr, block_sz);
> >
> > @@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> >
> > lmb_set_nid(lmb);
> > lmb->flags |= DRCONF_MEM_ASSIGNED;
> > + drmem_update_dt();
>
> And here ..
> >
> > block_sz = memory_block_size_bytes();
> >
> > @@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> > invalidate_lmb_associativity_index(lmb);
> > lmb_clear_nid(lmb);
> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> > + drmem_update_dt();
>
>
> And here ..
>
> > __remove_memory(nid, base_addr, block_sz);
> > }
> > @@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
> > break;
> > }
> >
> > - if (!rc)
> > - rc = drmem_update_dt();
> > -
> > unlock_device_hotplug();
> > return rc;
>
> Whereas previously we did check it.
drmem_update_dt() fails iff allocating memory fail. And in the failed
case, even the original code does not roll back the effect of
__add_memory()/__remove_memory().
And I plan to do the following in V4: if drmem_update_dt() fails in
dlpar_add_lmb(), then bails out immediately.
Thanks,
Pingfan
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-23 2:27 ` Pingfan Liu
0 siblings, 0 replies; 20+ messages in thread
From: Pingfan Liu @ 2020-07-23 2:27 UTC (permalink / raw)
To: Michael Ellerman
Cc: Nathan Lynch, Kexec Mailing List, linuxppc-dev, Hari Bathini
On Wed, Jul 22, 2020 at 12:57 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Pingfan Liu <kernelfans@gmail.com> writes:
> > A bug is observed on pseries by taking the following steps on rhel:
> ^
> RHEL
>
> I assume it happens on mainline too?
Yes, it does.
>
[...]
> > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> > index 1a3ac3b..def8cb3f 100644
> > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> > @@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
> > invalidate_lmb_associativity_index(lmb);
> > lmb_clear_nid(lmb);
> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> > + drmem_update_dt();
>
> No error checking?
Hmm, here should be a more careful design. Please see the comment at the end.
>
> > __remove_memory(nid, base_addr, block_sz);
> >
> > @@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> >
> > lmb_set_nid(lmb);
> > lmb->flags |= DRCONF_MEM_ASSIGNED;
> > + drmem_update_dt();
>
> And here ..
> >
> > block_sz = memory_block_size_bytes();
> >
> > @@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
> > invalidate_lmb_associativity_index(lmb);
> > lmb_clear_nid(lmb);
> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
> > + drmem_update_dt();
>
>
> And here ..
>
> > __remove_memory(nid, base_addr, block_sz);
> > }
> > @@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
> > break;
> > }
> >
> > - if (!rc)
> > - rc = drmem_update_dt();
> > -
> > unlock_device_hotplug();
> > return rc;
>
> Whereas previously we did check it.
drmem_update_dt() fails iff allocating memory fail. And in the failed
case, even the original code does not roll back the effect of
__add_memory()/__remove_memory().
And I plan to do the following in V4: if drmem_update_dt() fails in
dlpar_add_lmb(), then bails out immediately.
Thanks,
Pingfan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
2020-07-23 2:27 ` Pingfan Liu
@ 2020-07-23 6:41 ` Michael Ellerman
-1 siblings, 0 replies; 20+ messages in thread
From: Michael Ellerman @ 2020-07-23 6:41 UTC (permalink / raw)
To: Pingfan Liu; +Cc: Nathan Lynch, Kexec Mailing List, linuxppc-dev, Hari Bathini
Pingfan Liu <kernelfans@gmail.com> writes:
> On Wed, Jul 22, 2020 at 12:57 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Pingfan Liu <kernelfans@gmail.com> writes:
>> > A bug is observed on pseries by taking the following steps on rhel:
>> ^
>> RHEL
>>
>> I assume it happens on mainline too?
> Yes, it does.
>>
> [...]
>> > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> > index 1a3ac3b..def8cb3f 100644
>> > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> > @@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
>> > invalidate_lmb_associativity_index(lmb);
>> > lmb_clear_nid(lmb);
>> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
>> > + drmem_update_dt();
>>
>> No error checking?
> Hmm, here should be a more careful design. Please see the comment at the end.
>>
>> > __remove_memory(nid, base_addr, block_sz);
>> >
>> > @@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>> >
>> > lmb_set_nid(lmb);
>> > lmb->flags |= DRCONF_MEM_ASSIGNED;
>> > + drmem_update_dt();
>>
>> And here ..
>> >
>> > block_sz = memory_block_size_bytes();
>> >
>> > @@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>> > invalidate_lmb_associativity_index(lmb);
>> > lmb_clear_nid(lmb);
>> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
>> > + drmem_update_dt();
>>
>>
>> And here ..
>>
>> > __remove_memory(nid, base_addr, block_sz);
>> > }
>> > @@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
>> > break;
>> > }
>> >
>> > - if (!rc)
>> > - rc = drmem_update_dt();
>> > -
>> > unlock_device_hotplug();
>> > return rc;
>>
>> Whereas previously we did check it.
>
> drmem_update_dt() fails iff allocating memory fail.
That's true currently, but it might change in future.
> And in the failed case, even the original code does not roll back the
> effect of __add_memory()/__remove_memory().
Yeah hard to know what the desired behaviour is. If something fails we
at least need to print a message though, not silently swallow it.
> And I plan to do the following in V4: if drmem_update_dt() fails in
> dlpar_add_lmb(), then bails out immediately.
That sounds reasonable.
cheers
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-23 6:41 ` Michael Ellerman
0 siblings, 0 replies; 20+ messages in thread
From: Michael Ellerman @ 2020-07-23 6:41 UTC (permalink / raw)
To: Pingfan Liu; +Cc: Nathan Lynch, Kexec Mailing List, linuxppc-dev, Hari Bathini
Pingfan Liu <kernelfans@gmail.com> writes:
> On Wed, Jul 22, 2020 at 12:57 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Pingfan Liu <kernelfans@gmail.com> writes:
>> > A bug is observed on pseries by taking the following steps on rhel:
>> ^
>> RHEL
>>
>> I assume it happens on mainline too?
> Yes, it does.
>>
> [...]
>> > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> > index 1a3ac3b..def8cb3f 100644
>> > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> > @@ -372,6 +372,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
>> > invalidate_lmb_associativity_index(lmb);
>> > lmb_clear_nid(lmb);
>> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
>> > + drmem_update_dt();
>>
>> No error checking?
> Hmm, here should be a more careful design. Please see the comment at the end.
>>
>> > __remove_memory(nid, base_addr, block_sz);
>> >
>> > @@ -607,6 +608,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>> >
>> > lmb_set_nid(lmb);
>> > lmb->flags |= DRCONF_MEM_ASSIGNED;
>> > + drmem_update_dt();
>>
>> And here ..
>> >
>> > block_sz = memory_block_size_bytes();
>> >
>> > @@ -625,6 +627,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
>> > invalidate_lmb_associativity_index(lmb);
>> > lmb_clear_nid(lmb);
>> > lmb->flags &= ~DRCONF_MEM_ASSIGNED;
>> > + drmem_update_dt();
>>
>>
>> And here ..
>>
>> > __remove_memory(nid, base_addr, block_sz);
>> > }
>> > @@ -877,9 +880,6 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
>> > break;
>> > }
>> >
>> > - if (!rc)
>> > - rc = drmem_update_dt();
>> > -
>> > unlock_device_hotplug();
>> > return rc;
>>
>> Whereas previously we did check it.
>
> drmem_update_dt() fails iff allocating memory fail.
That's true currently, but it might change in future.
> And in the failed case, even the original code does not roll back the
> effect of __add_memory()/__remove_memory().
Yeah hard to know what the desired behaviour is. If something fails we
at least need to print a message though, not silently swallow it.
> And I plan to do the following in V4: if drmem_update_dt() fails in
> dlpar_add_lmb(), then bails out immediately.
That sounds reasonable.
cheers
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
2020-07-22 1:52 ` Pingfan Liu
@ 2020-07-23 13:27 ` Nathan Lynch
-1 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2020-07-23 13:27 UTC (permalink / raw)
To: Pingfan Liu
Cc: Michael Ellerman, cheloha, kexec, ldufour, linuxppc-dev,
Hari Bathini
Pingfan Liu <kernelfans@gmail.com> writes:
> A bug is observed on pseries by taking the following steps on rhel:
> -1. drmgr -c mem -r -q 5
> -2. echo c > /proc/sysrq-trigger
>
> And then, the failure looks like:
> kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
> [ 44.338548] Core dump to |/bin/false pipe failed
> /lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> kdump: saving vmcore failed
>
> * Root cause *
> After analyzing, it turns out that in the current implementation,
> when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
> the code __remove_memory() comes before drmem_update_dt().
> So in kdump kernel, when read_from_oldmem() resorts to
> pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
> non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
> can be observed "Bus error"
>
> From a viewpoint of listener and publisher, the publisher notifies the
> listener before data is ready. This introduces a problem where udev
> launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> updating. And in capture kernel, makedumpfile will access the memory based
> on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
>
> * Fix *
> In order to fix this issue, update dt before __remove_memory(), and
> accordingly the same rule in hot-add path.
>
> This will introduce extra dt updating payload for each involved lmb when hotplug.
> But it should be fine since drmem_update_dt() is memory based operation and
> hotplug is not a hot path.
This is great analysis but the performance implications of the change
are grave. The add/remove paths here are already O(n) where n is the
quantity of memory assigned to the LP, this change would make it O(n^2):
dlpar_memory_add_by_count
for_each_drmem_lmb <--
dlpar_add_lmb
drmem_update_dt(_v1|_v2)
for_each_drmem_lmb <--
Memory add/remove isn't a hot path but quadratic runtime complexity
isn't acceptable. Its current performance is bad enough that I have
internal bugs open on it.
Not to mention we leak memory every time drmem_update_dt is called
because we can't safely free device tree properties :-(
Also note that this sort of reverts (fixes?) 063b8b1251fd
("powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR
request").
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-23 13:27 ` Nathan Lynch
0 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2020-07-23 13:27 UTC (permalink / raw)
To: Pingfan Liu; +Cc: cheloha, kexec, ldufour, linuxppc-dev, Hari Bathini
Pingfan Liu <kernelfans@gmail.com> writes:
> A bug is observed on pseries by taking the following steps on rhel:
> -1. drmgr -c mem -r -q 5
> -2. echo c > /proc/sysrq-trigger
>
> And then, the failure looks like:
> kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> kdump: saving vmcore-dmesg.txt
> kdump: saving vmcore-dmesg.txt complete
> kdump: saving vmcore
> Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> [ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> [ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
> [ 44.338548] Core dump to |/bin/false pipe failed
> /lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> kdump: saving vmcore failed
>
> * Root cause *
> After analyzing, it turns out that in the current implementation,
> when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
> the code __remove_memory() comes before drmem_update_dt().
> So in kdump kernel, when read_from_oldmem() resorts to
> pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
> non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
> can be observed "Bus error"
>
> From a viewpoint of listener and publisher, the publisher notifies the
> listener before data is ready. This introduces a problem where udev
> launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> updating. And in capture kernel, makedumpfile will access the memory based
> on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
>
> * Fix *
> In order to fix this issue, update dt before __remove_memory(), and
> accordingly the same rule in hot-add path.
>
> This will introduce extra dt updating payload for each involved lmb when hotplug.
> But it should be fine since drmem_update_dt() is memory based operation and
> hotplug is not a hot path.
This is great analysis but the performance implications of the change
are grave. The add/remove paths here are already O(n) where n is the
quantity of memory assigned to the LP, this change would make it O(n^2):
dlpar_memory_add_by_count
for_each_drmem_lmb <--
dlpar_add_lmb
drmem_update_dt(_v1|_v2)
for_each_drmem_lmb <--
Memory add/remove isn't a hot path but quadratic runtime complexity
isn't acceptable. Its current performance is bad enough that I have
internal bugs open on it.
Not to mention we leak memory every time drmem_update_dt is called
because we can't safely free device tree properties :-(
Also note that this sort of reverts (fixes?) 063b8b1251fd
("powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR
request").
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
2020-07-23 13:27 ` Nathan Lynch
@ 2020-07-24 12:24 ` Pingfan Liu
-1 siblings, 0 replies; 20+ messages in thread
From: Pingfan Liu @ 2020-07-24 12:24 UTC (permalink / raw)
To: Nathan Lynch
Cc: Michael Ellerman, cheloha, Kexec Mailing List, ldufour,
linuxppc-dev, Hari Bathini
On Thu, Jul 23, 2020 at 9:27 PM Nathan Lynch <nathanl@linux.ibm.com> wrote:
>
> Pingfan Liu <kernelfans@gmail.com> writes:
> > A bug is observed on pseries by taking the following steps on rhel:
> > -1. drmgr -c mem -r -q 5
> > -2. echo c > /proc/sysrq-trigger
> >
> > And then, the failure looks like:
> > kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> > kdump: saving vmcore-dmesg.txt
> > kdump: saving vmcore-dmesg.txt complete
> > kdump: saving vmcore
> > Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> > [ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> > [ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> > [ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> > [ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
> > [ 44.338548] Core dump to |/bin/false pipe failed
> > /lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> > kdump: saving vmcore failed
> >
> > * Root cause *
> > After analyzing, it turns out that in the current implementation,
> > when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
> > the code __remove_memory() comes before drmem_update_dt().
> > So in kdump kernel, when read_from_oldmem() resorts to
> > pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
> > non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
> > can be observed "Bus error"
> >
> > From a viewpoint of listener and publisher, the publisher notifies the
> > listener before data is ready. This introduces a problem where udev
> > launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> > updating. And in capture kernel, makedumpfile will access the memory based
> > on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
> >
> > * Fix *
> > In order to fix this issue, update dt before __remove_memory(), and
> > accordingly the same rule in hot-add path.
> >
> > This will introduce extra dt updating payload for each involved lmb when hotplug.
> > But it should be fine since drmem_update_dt() is memory based operation and
> > hotplug is not a hot path.
>
> This is great analysis but the performance implications of the change
> are grave. The add/remove paths here are already O(n) where n is the
> quantity of memory assigned to the LP, this change would make it O(n^2):
>
> dlpar_memory_add_by_count
> for_each_drmem_lmb <--
> dlpar_add_lmb
> drmem_update_dt(_v1|_v2)
> for_each_drmem_lmb <--
>
> Memory add/remove isn't a hot path but quadratic runtime complexity
> isn't acceptable. Its current performance is bad enough that I have
Yes, the quadratic runtime complexity sounds terrible.
And I am curious about the bug. Does the system have thousands of lmb?
> internal bugs open on it.
>
> Not to mention we leak memory every time drmem_update_dt is called
> because we can't safely free device tree properties :-(
Do you know what block us to free it?
>
> Also note that this sort of reverts (fixes?) 063b8b1251fd
> ("powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR
> request").
Yes. And now, I think I need to bring up another method to fix it.
Thanks,
Pingfan
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-24 12:24 ` Pingfan Liu
0 siblings, 0 replies; 20+ messages in thread
From: Pingfan Liu @ 2020-07-24 12:24 UTC (permalink / raw)
To: Nathan Lynch
Cc: cheloha, Kexec Mailing List, ldufour, linuxppc-dev, Hari Bathini
On Thu, Jul 23, 2020 at 9:27 PM Nathan Lynch <nathanl@linux.ibm.com> wrote:
>
> Pingfan Liu <kernelfans@gmail.com> writes:
> > A bug is observed on pseries by taking the following steps on rhel:
> > -1. drmgr -c mem -r -q 5
> > -2. echo c > /proc/sysrq-trigger
> >
> > And then, the failure looks like:
> > kdump: saving to /sysroot//var/crash/127.0.0.1-2020-01-16-02:06:14/
> > kdump: saving vmcore-dmesg.txt
> > kdump: saving vmcore-dmesg.txt complete
> > kdump: saving vmcore
> > Checking for memory holes : [ 0.0 %] / Checking for memory holes : [100.0 %] | Excluding unnecessary pages : [100.0 %] \ Copying data : [ 0.3 %] - eta: 38s[ 44.337636] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> > [ 44.337663] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> > [ 44.337677] hash-mmu: mm: Hashing failure ! EA=0x7fffba400000 access=0x8000000000000004 current=makedumpfile
> > [ 44.337692] hash-mmu: trap=0x300 vsid=0x13a109c ssize=1 base psize=2 psize 2 pte=0xc000000050000504
> > [ 44.337708] makedumpfile[469]: unhandled signal 7 at 00007fffba400000 nip 00007fffbbc4d7fc lr 000000011356ca3c code 2
> > [ 44.338548] Core dump to |/bin/false pipe failed
> > /lib/kdump-lib-initramfs.sh: line 98: 469 Bus error $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete
> > kdump: saving vmcore failed
> >
> > * Root cause *
> > After analyzing, it turns out that in the current implementation,
> > when hot-removing lmb, the KOBJ_REMOVE event ejects before the dt updating as
> > the code __remove_memory() comes before drmem_update_dt().
> > So in kdump kernel, when read_from_oldmem() resorts to
> > pSeries_lpar_hpte_insert() to install hpte, but fails with -2 due to
> > non-exist pfn. And finally, low_hash_fault() raise SIGBUS to process, as it
> > can be observed "Bus error"
> >
> > From a viewpoint of listener and publisher, the publisher notifies the
> > listener before data is ready. This introduces a problem where udev
> > launches kexec-tools (due to KOBJ_REMOVE) and loads a stale dt before
> > updating. And in capture kernel, makedumpfile will access the memory based
> > on the stale dt info, and hit a SIGBUS error due to an un-existed lmb.
> >
> > * Fix *
> > In order to fix this issue, update dt before __remove_memory(), and
> > accordingly the same rule in hot-add path.
> >
> > This will introduce extra dt updating payload for each involved lmb when hotplug.
> > But it should be fine since drmem_update_dt() is memory based operation and
> > hotplug is not a hot path.
>
> This is great analysis but the performance implications of the change
> are grave. The add/remove paths here are already O(n) where n is the
> quantity of memory assigned to the LP, this change would make it O(n^2):
>
> dlpar_memory_add_by_count
> for_each_drmem_lmb <--
> dlpar_add_lmb
> drmem_update_dt(_v1|_v2)
> for_each_drmem_lmb <--
>
> Memory add/remove isn't a hot path but quadratic runtime complexity
> isn't acceptable. Its current performance is bad enough that I have
Yes, the quadratic runtime complexity sounds terrible.
And I am curious about the bug. Does the system have thousands of lmb?
> internal bugs open on it.
>
> Not to mention we leak memory every time drmem_update_dt is called
> because we can't safely free device tree properties :-(
Do you know what block us to free it?
>
> Also note that this sort of reverts (fixes?) 063b8b1251fd
> ("powerpc/pseries/memory-hotplug: Only update DT once per memory DLPAR
> request").
Yes. And now, I think I need to bring up another method to fix it.
Thanks,
Pingfan
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
2020-07-24 12:24 ` Pingfan Liu
@ 2020-07-24 16:50 ` Nathan Lynch
-1 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2020-07-24 16:50 UTC (permalink / raw)
To: Pingfan Liu
Cc: Michael Ellerman, cheloha, Kexec Mailing List, ldufour,
linuxppc-dev, Hari Bathini
Pingfan Liu <kernelfans@gmail.com> writes:
> On Thu, Jul 23, 2020 at 9:27 PM Nathan Lynch <nathanl@linux.ibm.com> wrote:
>> Pingfan Liu <kernelfans@gmail.com> writes:
>> > This will introduce extra dt updating payload for each involved lmb when hotplug.
>> > But it should be fine since drmem_update_dt() is memory based operation and
>> > hotplug is not a hot path.
>>
>> This is great analysis but the performance implications of the change
>> are grave. The add/remove paths here are already O(n) where n is the
>> quantity of memory assigned to the LP, this change would make it O(n^2):
>>
>> dlpar_memory_add_by_count
>> for_each_drmem_lmb <--
>> dlpar_add_lmb
>> drmem_update_dt(_v1|_v2)
>> for_each_drmem_lmb <--
>>
>> Memory add/remove isn't a hot path but quadratic runtime complexity
>> isn't acceptable. Its current performance is bad enough that I have
> Yes, the quadratic runtime complexity sounds terrible.
> And I am curious about the bug. Does the system have thousands of lmb?
Yes.
>> Not to mention we leak memory every time drmem_update_dt is called
>> because we can't safely free device tree properties :-(
> Do you know what block us to free it?
It's a longstanding problem. References to device tree properties aren't
counted or tracked so there's no way to safely free them unless the node
itself is released. But the ibm,dynamic-reconfiguration-memory node does
not ever go away and its properties are only subject to updates.
Maybe there's a way to address the specific case of
ibm,dynamic-reconfiguration-memory and the ibm,dynamic-memory(-v2)
properties, instead of tackling the general problem.
Regardless of all that, the drmem code needs better data structures and
lookup functions.
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCHv3 2/2] powerpc/pseries: update device tree before ejecting hotplug uevents
@ 2020-07-24 16:50 ` Nathan Lynch
0 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2020-07-24 16:50 UTC (permalink / raw)
To: Pingfan Liu
Cc: cheloha, Kexec Mailing List, ldufour, linuxppc-dev, Hari Bathini
Pingfan Liu <kernelfans@gmail.com> writes:
> On Thu, Jul 23, 2020 at 9:27 PM Nathan Lynch <nathanl@linux.ibm.com> wrote:
>> Pingfan Liu <kernelfans@gmail.com> writes:
>> > This will introduce extra dt updating payload for each involved lmb when hotplug.
>> > But it should be fine since drmem_update_dt() is memory based operation and
>> > hotplug is not a hot path.
>>
>> This is great analysis but the performance implications of the change
>> are grave. The add/remove paths here are already O(n) where n is the
>> quantity of memory assigned to the LP, this change would make it O(n^2):
>>
>> dlpar_memory_add_by_count
>> for_each_drmem_lmb <--
>> dlpar_add_lmb
>> drmem_update_dt(_v1|_v2)
>> for_each_drmem_lmb <--
>>
>> Memory add/remove isn't a hot path but quadratic runtime complexity
>> isn't acceptable. Its current performance is bad enough that I have
> Yes, the quadratic runtime complexity sounds terrible.
> And I am curious about the bug. Does the system have thousands of lmb?
Yes.
>> Not to mention we leak memory every time drmem_update_dt is called
>> because we can't safely free device tree properties :-(
> Do you know what block us to free it?
It's a longstanding problem. References to device tree properties aren't
counted or tracked so there's no way to safely free them unless the node
itself is released. But the ibm,dynamic-reconfiguration-memory node does
not ever go away and its properties are only subject to updates.
Maybe there's a way to address the specific case of
ibm,dynamic-reconfiguration-memory and the ibm,dynamic-memory(-v2)
properties, instead of tackling the general problem.
Regardless of all that, the drmem code needs better data structures and
lookup functions.
^ permalink raw reply [flat|nested] 20+ messages in thread