* [PATCH v7 0/4] resource: Use list_head to link sibling resource
@ 2018-07-18 2:49 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw)
To: linux-kernel, akpm, robh+dt, dan.j.williams, nicolas.pitre, josh,
fengguang.wu, bp, andy.shevchenko
Cc: brijesh.singh, devicetree, airlied, linux-pci, richard.weiyang,
jcmvbkbc, baiyaowei, kys, frowand.list, lorenzo.pieralisi, sthemmin,
Baoquan He, linux-nvdimm, patrik.r.jakobsson, linux-input, gustavo,
dyoung, thomas.lendacky, haiyangz, maarten.lankhorst, jglisse,
seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris, monstr,
linux-parisc, gregkh, dmitry.torokhov, ebiederm, devel, linuxppc-dev,
davem
This patchset does the following:
1) Move reparent_resources() to kernel/resource.c to clean up duplicated
code in arch/microblaze/pci/pci-common.c and
arch/powerpc/kernel/pci-common.c.
2) Replace struct resource's sibling list, currently a singly linked
list, with list_head. This removes the open-coded pointer operations
on the singly linked list and makes the code more readable.
3) Based on the list_head replacement, add a new function
walk_system_ram_res_rev() which iterates over iomem_resource's
siblings in reverse order.
4) Change kexec_file loading to search system RAM top down for kernel
loading, using walk_system_ram_res_rev().
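To illustrate why a doubly linked sibling list makes the top-down search
in 3) and 4) cheap, here is a minimal userspace sketch. It is not the
code from this series: struct res, res_entry(), list_init(),
list_append(), find_bottom_up() and find_top_down() are hypothetical
names, and struct list_head is a simplified stand-in for the kernel type.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, trimmed-down resource: only what the sketch needs. */
struct res {
	unsigned long start, end;
	struct list_head sibling;
};

/* Recover the enclosing struct res from its embedded sibling node. */
#define res_entry(ptr) \
	((struct res *)((char *)(ptr) - offsetof(struct res, sibling)))

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Append at the tail, so the list stays sorted by address. */
static void list_append(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Bottom-up: first region, in ascending order, that can hold size bytes. */
static struct res *find_bottom_up(struct list_head *head, unsigned long size)
{
	struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (res_entry(pos)->end - res_entry(pos)->start + 1 >= size)
			return res_entry(pos);
	return NULL;
}

/* Top-down: same test, but following prev from the tail. */
static struct res *find_top_down(struct list_head *head, unsigned long size)
{
	struct list_head *pos;

	for (pos = head->prev; pos != head; pos = pos->prev)
		if (res_entry(pos)->end - res_entry(pos)->start + 1 >= size)
			return res_entry(pos);
	return NULL;
}

/* Three regions; only the lower two can hold 0x1000 bytes. */
unsigned long demo_top_down_start(void)
{
	struct list_head head;
	struct res low = { 0x1000, 0x1fff, { NULL, NULL } };
	struct res mid = { 0x4000, 0x7fff, { NULL, NULL } };
	struct res high = { 0x9000, 0x93ff, { NULL, NULL } };
	struct res *r;

	list_init(&head);
	list_append(&low.sibling, &head);
	list_append(&mid.sibling, &head);
	list_append(&high.sibling, &head);

	/* Bottom-up returns the lowest fit. */
	assert(find_bottom_up(&head, 0x1000) == &low);
	/* Top-down skips the too-small highest region. */
	r = find_top_down(&head, 0x1000);
	assert(r != NULL);
	return r->start;
}
```

With a singly linked list the same top-down search would need either a
full forward pass that remembers the last match or a reversal of the
list; with prev pointers it is simply the mirror image of the forward
walk.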
Note:
This patchset has only been tested on x86_64, with networking
enabled. The thing to pay attention to is that a root resource's
child member needs to be initialized explicitly, with LIST_HEAD_INIT()
if it is statically defined or INIT_LIST_HEAD() if it is dynamically
allocated, just as is done for iomem_resource/ioport_resource and in
the change to get_pci_domain_busn_res().
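A small userspace sketch of the two initialization paths mentioned in
the note. The macro and helper below mirror the kernel's
LIST_HEAD_INIT() and INIT_LIST_HEAD(), but struct res, static_root,
alloc_root() and roots_are_empty() are hypothetical names used only for
illustration, not code from this series.

```c
#include <assert.h>
#include <stdlib.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Userspace copies of the two kernel initializers named in the note. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical, trimmed-down root resource. */
struct res {
	const char *name;
	struct list_head child;
};

/* Statically defined root: the initializer names its own member. */
static struct res static_root = {
	.name = "static root",
	.child = LIST_HEAD_INIT(static_root.child),
};

/* Dynamically allocated root: zeroed memory is not a valid empty list. */
struct res *alloc_root(const char *name)
{
	struct res *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->name = name;
	INIT_LIST_HEAD(&r->child);
	return r;
}

/* Returns 1 when both roots report an empty child list. */
int roots_are_empty(void)
{
	struct res *dyn = alloc_root("dynamic root");
	int ok;

	assert(dyn != NULL);
	ok = list_empty(&static_root.child) && list_empty(&dyn->child);
	free(dyn);
	return ok;
}
```

Without the explicit initialization, a zero-filled child member has NULL
next/prev pointers, so an emptiness check would fail and any traversal
would dereference NULL, whereas the old singly linked layout treated a
NULL child pointer as "no children".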
v6:
http://lkml.kernel.org/r/20180704041038.8190-1-bhe@redhat.com
v5:
http://lkml.kernel.org/r/20180612032831.29747-1-bhe@redhat.com
v4:
http://lkml.kernel.org/r/20180507063224.24229-1-bhe@redhat.com
v3:
http://lkml.kernel.org/r/20180419001848.3041-1-bhe@redhat.com
v2:
http://lkml.kernel.org/r/20180408024724.16812-1-bhe@redhat.com
v1:
http://lkml.kernel.org/r/20180322033722.9279-1-bhe@redhat.com
Changelog:
v6->v7:
Fix code bugs that test robot reported on mips and ia64.
Add error code description in reparent_resources() according to
Andy's comment, and fix minor log typo.
v5->v6:
Fix code style problems in reparent_resources() and use existing
error codes, according to Andy's suggestion.
Fix bugs test robot reported.
v4->v5:
Add new patch 0001 to move the duplicated reparent_resources() to
kernel/resource.c so that it is shared by different architectures.
Fix several code bugs reported by test robot on ARCH powerpc and
microblaze.
v3->v4:
Fix several bugs test robot reported. Rewrite cover letter and patch
log according to reviewer's comment.
v2->v3:
Rename resource functions first_child() and sibling() to
resource_first_child() and resource_sibling(). Dan suggested this.
Move resource_first_child() and resource_sibling() to linux/ioport.h
and make them inline functions. Rob suggested this. Accordingly,
include linux/list.h in linux/ioport.h; please help review whether
this brings efficiency degradation or code redundancy.
The change to struct resource {} increases its size by two pointers;
mention this in the git log to make it more specific. Rob suggested
this.
v1->v2:
Use list_head instead to link resource siblings. This was suggested by
Andrew.
Rewrite walk_system_ram_res_rev() now that list_head is used to link
resource siblings.
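The "two pointers of size increase" noted for v2->v3 can be checked with
a short userspace sketch. The two structs below model only the link
fields; links_new is an assumed layout (one parent pointer plus a
list_head each for child and sibling), and links_old, links_new and
extra_pointers() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Link fields before the series: three plain pointers. */
struct links_old {
	struct links_old *parent, *sibling, *child;
};

/* Link fields after (assumed layout): one pointer plus two list_heads. */
struct links_new {
	struct links_new *parent;
	struct list_head child, sibling;
};

/* Size growth of the link fields, expressed in pointers. */
size_t extra_pointers(void)
{
	size_t diff = sizeof(struct links_new) - sizeof(struct links_old);

	assert(diff % sizeof(void *) == 0);
	return diff / sizeof(void *);
}
```

Three pointers become five, so on a 64-bit build each struct resource
would grow by 16 bytes under this assumed layout.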
Baoquan He (4):
resource: Move reparent_resources() to kernel/resource.c and make it
public
resource: Use list_head to link sibling resource
resource: add walk_system_ram_res_rev()
kexec_file: Load kernel at top of system RAM if required
arch/arm/plat-samsung/pm-check.c | 6 +-
arch/ia64/sn/kernel/io_init.c | 2 +-
arch/microblaze/pci/pci-common.c | 41 +----
arch/mips/pci/pci-rc32434.c | 12 +-
arch/powerpc/kernel/pci-common.c | 39 +---
arch/sparc/kernel/ioport.c | 2 +-
arch/xtensa/include/asm/pci-bridge.h | 4 +-
drivers/eisa/eisa-bus.c | 2 +
drivers/gpu/drm/drm_memory.c | 3 +-
drivers/gpu/drm/gma500/gtt.c | 5 +-
drivers/hv/vmbus_drv.c | 52 +++---
drivers/input/joystick/iforce/iforce-main.c | 4 +-
drivers/nvdimm/namespace_devs.c | 6 +-
drivers/nvdimm/nd.h | 5 +-
drivers/of/address.c | 4 +-
drivers/parisc/lba_pci.c | 4 +-
drivers/pci/controller/vmd.c | 8 +-
drivers/pci/probe.c | 2 +
drivers/pci/setup-bus.c | 2 +-
include/linux/ioport.h | 21 ++-
kernel/kexec_file.c | 2 +
kernel/resource.c | 266 ++++++++++++++++++----------
22 files changed, 260 insertions(+), 232 deletions(-)
--
2.13.6
* [PATCH v7 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
@ 2018-07-18 2:49 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw)
To: linux-kernel, akpm, robh+dt, dan.j.williams, nicolas.pitre, josh,
fengguang.wu, bp, andy.shevchenko
Cc: brijesh.singh, devicetree, airlied, linux-pci, richard.weiyang,
jcmvbkbc, Paul Mackerras, baiyaowei, kys, frowand.list,
lorenzo.pieralisi, sthemmin, Baoquan He, linux-nvdimm,
Michael Ellerman, patrik.r.jakobsson, linux-input, gustavo, dyoung,
thomas.lendacky, haiyangz, maarten.lankhorst, jglisse, seanpaul,
bhelgaas, tglx, yinghai, jonathan.derrick, chris, monstr,
linux-parisc, gregkh, dmitry.torokhov, Benjamin Herrenschmidt,
ebiederm, devel, linuxppc-dev, davem
reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
so that it's shared.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
---
arch/microblaze/pci/pci-common.c | 37 -----------------------------------
arch/powerpc/kernel/pci-common.c | 35 ---------------------------------
include/linux/ioport.h | 1 +
kernel/resource.c | 42 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 43 insertions(+), 72 deletions(-)
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index f34346d56095..7899bafab064 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -619,43 +619,6 @@ int pcibios_add_device(struct pci_dev *dev)
 EXPORT_SYMBOL(pcibios_add_device);
 
 /*
- * Reparent resource children of pr that conflict with res
- * under res, and make res replace those children.
- */
-static int __init reparent_resources(struct resource *parent,
-				     struct resource *res)
-{
-	struct resource *p, **pp;
-	struct resource **firstpp = NULL;
-
-	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
-		if (p->end < res->start)
-			continue;
-		if (res->end < p->start)
-			break;
-		if (p->start < res->start || p->end > res->end)
-			return -1;	/* not completely contained */
-		if (firstpp == NULL)
-			firstpp = pp;
-	}
-	if (firstpp == NULL)
-		return -1;	/* didn't find any conflicting entries? */
-	res->parent = parent;
-	res->child = *firstpp;
-	res->sibling = *pp;
-	*firstpp = res;
-	*pp = NULL;
-	for (p = res->child; p != NULL; p = p->sibling) {
-		p->parent = res;
-		pr_debug("PCI: Reparented %s [%llx..%llx] under %s\n",
-			 p->name,
-			 (unsigned long long)p->start,
-			 (unsigned long long)p->end, res->name);
-	}
-	return 0;
-}
-
-/*
  * Handle resources of PCI devices. If the world were perfect, we could
  * just allocate all the resource regions and do nothing more. It isn't.
  * On the other hand, we cannot just re-allocate all devices, as it would
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index fe9733ffffaa..926035bb378d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1088,41 +1088,6 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res,
 EXPORT_SYMBOL(pcibios_align_resource);
 
 /*
- * Reparent resource children of pr that conflict with res
- * under res, and make res replace those children.
- */
-static int reparent_resources(struct resource *parent,
-			      struct resource *res)
-{
-	struct resource *p, **pp;
-	struct resource **firstpp = NULL;
-
-	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
-		if (p->end < res->start)
-			continue;
-		if (res->end < p->start)
-			break;
-		if (p->start < res->start || p->end > res->end)
-			return -1;	/* not completely contained */
-		if (firstpp == NULL)
-			firstpp = pp;
-	}
-	if (firstpp == NULL)
-		return -1;	/* didn't find any conflicting entries? */
-	res->parent = parent;
-	res->child = *firstpp;
-	res->sibling = *pp;
-	*firstpp = res;
-	*pp = NULL;
-	for (p = res->child; p != NULL; p = p->sibling) {
-		p->parent = res;
-		pr_debug("PCI: Reparented %s %pR under %s\n",
-			 p->name, p, res->name);
-	}
-	return 0;
-}
-
-/*
  * Handle resources of PCI devices. If the world were perfect, we could
  * just allocate all the resource regions and do nothing more. It isn't.
  * On the other hand, we cannot just re-allocate all devices, as it would
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..dfdcd0bfe54e 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -192,6 +192,7 @@ extern int allocate_resource(struct resource *root, struct resource *new,
 struct resource *lookup_resource(struct resource *root, resource_size_t start);
 int adjust_resource(struct resource *res, resource_size_t start,
 		    resource_size_t size);
+int reparent_resources(struct resource *parent, struct resource *res);
 resource_size_t resource_alignment(struct resource *res);
 static inline resource_size_t resource_size(const struct resource *res)
 {
diff --git a/kernel/resource.c b/kernel/resource.c
index 30e1bc68503b..81ccd19c1d9f 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -983,6 +983,48 @@ int adjust_resource(struct resource *res, resource_size_t start,
 }
 EXPORT_SYMBOL(adjust_resource);
 
+/**
+ * reparent_resources - reparent resource children of parent that res covers
+ * @parent: parent resource descriptor
+ * @res: resource descriptor desired by caller
+ *
+ * Returns 0 on success, -ENOTSUPP if child resource is not completely
+ * contained by 'res', -ECANCELED if no any conflicting entry found.
+ *
+ * Reparent resource children of 'parent' that conflict with 'res'
+ * under 'res', and make 'res' replace those children.
+ */
+int reparent_resources(struct resource *parent, struct resource *res)
+{
+	struct resource *p, **pp;
+	struct resource **firstpp = NULL;
+
+	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
+		if (p->end < res->start)
+			continue;
+		if (res->end < p->start)
+			break;
+		if (p->start < res->start || p->end > res->end)
+			return -ENOTSUPP;	/* not completely contained */
+		if (firstpp == NULL)
+			firstpp = pp;
+	}
+	if (firstpp == NULL)
+		return -ECANCELED;	/* didn't find any conflicting entries? */
+	res->parent = parent;
+	res->child = *firstpp;
+	res->sibling = *pp;
+	*firstpp = res;
+	*pp = NULL;
+	for (p = res->child; p != NULL; p = p->sibling) {
+		p->parent = res;
+		pr_debug("PCI: Reparented %s %pR under %s\n",
+			 p->name, p, res->name);
+	}
+	return 0;
+}
+EXPORT_SYMBOL(reparent_resources);
+
 static void __init __reserve_region_with_split(struct resource *root,
 		resource_size_t start, resource_size_t end,
 		const char *name)
--
2.13.6
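For readers who want to see what the moved function does to a resource
tree, here is a userspace sketch that reuses the same pointer-to-pointer
walk on a trimmed-down struct. struct res, reparent(), demo_reparent()
and SKETCH_ENOTSUPP are hypothetical names for illustration; the last
one stands in for the kernel-internal ENOTSUPP, which userspace
<errno.h> does not provide.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Kernel-internal error code; userspace <errno.h> does not define it. */
#define SKETCH_ENOTSUPP 524

/* Hypothetical, trimmed-down resource with the singly linked layout. */
struct res {
	unsigned long start, end;
	struct res *parent, *sibling, *child;
};

/* Same pointer-to-pointer walk as reparent_resources() in the patch. */
static int reparent(struct res *parent, struct res *res)
{
	struct res *p, **pp;
	struct res **firstpp = NULL;

	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
		if (p->end < res->start)
			continue;	/* entirely below res */
		if (res->end < p->start)
			break;		/* entirely above res: stop */
		if (p->start < res->start || p->end > res->end)
			return -SKETCH_ENOTSUPP;	/* partial overlap */
		if (firstpp == NULL)
			firstpp = pp;	/* link pointing at first covered child */
	}
	if (firstpp == NULL)
		return -ECANCELED;	/* nothing to reparent */
	res->parent = parent;
	res->child = *firstpp;		/* covered run becomes res's children */
	res->sibling = *pp;		/* first child above res follows it */
	*firstpp = res;			/* splice res in where the run began */
	*pp = NULL;			/* terminate the run */
	for (p = res->child; p != NULL; p = p->sibling)
		p->parent = res;
	return 0;
}

/* Build root -> A, B, C, D, then move B and C under a new window. */
int demo_reparent(void)
{
	struct res root = { 0x000, 0xfff, NULL, NULL, NULL };
	struct res a = { 0x100, 0x1ff, &root, NULL, NULL };
	struct res b = { 0x200, 0x2ff, &root, NULL, NULL };
	struct res c = { 0x300, 0x3ff, &root, NULL, NULL };
	struct res d = { 0x500, 0x5ff, &root, NULL, NULL };
	struct res win = { 0x200, 0x3ff, NULL, NULL, NULL };
	struct res bad = { 0x150, 0x250, NULL, NULL, NULL };
	struct res hole = { 0x400, 0x4ff, NULL, NULL, NULL };

	root.child = &a;
	a.sibling = &b;
	b.sibling = &c;
	c.sibling = &d;

	/* Partially overlaps A: rejected, tree untouched. */
	assert(reparent(&root, &bad) == -SKETCH_ENOTSUPP);
	/* Falls in the gap between C and D: nothing to reparent. */
	assert(reparent(&root, &hole) == -ECANCELED);
	/* Covers exactly B and C. */
	assert(reparent(&root, &win) == 0);

	/* Resulting shape: root -> A, win, D and win -> B, C. */
	assert(root.child == &a && a.sibling == &win && win.sibling == &d);
	assert(win.child == &b && b.sibling == &c && c.sibling == NULL);
	assert(b.parent == &win && c.parent == &win && win.parent == &root);
	return 0;
}
```

The sketch shows the three outcomes documented in the kernel-doc
comment: a partial overlap is refused, a range that covers no child is
reported separately, and a range that fully covers a run of siblings
takes their place in the parent's list.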
* [PATCH v7 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public @ 2018-07-18 2:49 ` Baoquan He 0 siblings, 0 replies; 83+ messages in thread From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw) To: linux-kernel, akpm, robh+dt, dan.j.williams, nicolas.pitre, josh, fengguang.wu, bp, andy.shevchenko Cc: patrik.r.jakobsson, airlied, kys, haiyangz, sthemmin, dmitry.torokhov, frowand.list, keith.busch, jonathan.derrick, lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh, jglisse, thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel, linux-input, linux-nvdimm, devicetree, linux-pci, ebiederm, vgoyal, dyoung, yinghai, monstr, davem, chris, jcmvbkbc, gustavo, maarten.lankhorst, seanpaul, linux-parisc, linuxppc-dev, Baoquan He, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c so that it's shared. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org --- arch/microblaze/pci/pci-common.c | 37 ----------------------------------- arch/powerpc/kernel/pci-common.c | 35 --------------------------------- include/linux/ioport.h | 1 + kernel/resource.c | 42 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 43 insertions(+), 72 deletions(-) diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index f34346d56095..7899bafab064 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -619,43 +619,6 @@ int pcibios_add_device(struct pci_dev *dev) EXPORT_SYMBOL(pcibios_add_device); /* - * Reparent resource children of pr that conflict with res - * under res, and make res replace those children. 
- */ -static int __init reparent_resources(struct resource *parent, - struct resource *res) -{ - struct resource *p, **pp; - struct resource **firstpp = NULL; - - for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) { - if (p->end < res->start) - continue; - if (res->end < p->start) - break; - if (p->start < res->start || p->end > res->end) - return -1; /* not completely contained */ - if (firstpp == NULL) - firstpp = pp; - } - if (firstpp == NULL) - return -1; /* didn't find any conflicting entries? */ - res->parent = parent; - res->child = *firstpp; - res->sibling = *pp; - *firstpp = res; - *pp = NULL; - for (p = res->child; p != NULL; p = p->sibling) { - p->parent = res; - pr_debug("PCI: Reparented %s [%llx..%llx] under %s\n", - p->name, - (unsigned long long)p->start, - (unsigned long long)p->end, res->name); - } - return 0; -} - -/* * Handle resources of PCI devices. If the world were perfect, we could * just allocate all the resource regions and do nothing more. It isn't. * On the other hand, we cannot just re-allocate all devices, as it would diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index fe9733ffffaa..926035bb378d 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -1088,41 +1088,6 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res, EXPORT_SYMBOL(pcibios_align_resource); /* - * Reparent resource children of pr that conflict with res - * under res, and make res replace those children. 
- */ -static int reparent_resources(struct resource *parent, - struct resource *res) -{ - struct resource *p, **pp; - struct resource **firstpp = NULL; - - for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) { - if (p->end < res->start) - continue; - if (res->end < p->start) - break; - if (p->start < res->start || p->end > res->end) - return -1; /* not completely contained */ - if (firstpp == NULL) - firstpp = pp; - } - if (firstpp == NULL) - return -1; /* didn't find any conflicting entries? */ - res->parent = parent; - res->child = *firstpp; - res->sibling = *pp; - *firstpp = res; - *pp = NULL; - for (p = res->child; p != NULL; p = p->sibling) { - p->parent = res; - pr_debug("PCI: Reparented %s %pR under %s\n", - p->name, p, res->name); - } - return 0; -} - -/* * Handle resources of PCI devices. If the world were perfect, we could * just allocate all the resource regions and do nothing more. It isn't. * On the other hand, we cannot just re-allocate all devices, as it would diff --git a/include/linux/ioport.h b/include/linux/ioport.h index da0ebaec25f0..dfdcd0bfe54e 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -192,6 +192,7 @@ extern int allocate_resource(struct resource *root, struct resource *new, struct resource *lookup_resource(struct resource *root, resource_size_t start); int adjust_resource(struct resource *res, resource_size_t start, resource_size_t size); +int reparent_resources(struct resource *parent, struct resource *res); resource_size_t resource_alignment(struct resource *res); static inline resource_size_t resource_size(const struct resource *res) { diff --git a/kernel/resource.c b/kernel/resource.c index 30e1bc68503b..81ccd19c1d9f 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -983,6 +983,48 @@ int adjust_resource(struct resource *res, resource_size_t start, } EXPORT_SYMBOL(adjust_resource); +/** + * reparent_resources - reparent resource children of parent that res covers + * @parent: parent resource 
descriptor + * @res: resource descriptor desired by caller + * + * Returns 0 on success, -ENOTSUPP if child resource is not completely + * contained by 'res', -ECANCELED if no any conflicting entry found. + * + * Reparent resource children of 'parent' that conflict with 'res' + * under 'res', and make 'res' replace those children. + */ +int reparent_resources(struct resource *parent, struct resource *res) +{ + struct resource *p, **pp; + struct resource **firstpp = NULL; + + for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) { + if (p->end < res->start) + continue; + if (res->end < p->start) + break; + if (p->start < res->start || p->end > res->end) + return -ENOTSUPP; /* not completely contained */ + if (firstpp == NULL) + firstpp = pp; + } + if (firstpp == NULL) + return -ECANCELED; /* didn't find any conflicting entries? */ + res->parent = parent; + res->child = *firstpp; + res->sibling = *pp; + *firstpp = res; + *pp = NULL; + for (p = res->child; p != NULL; p = p->sibling) { + p->parent = res; + pr_debug("PCI: Reparented %s %pR under %s\n", + p->name, p, res->name); + } + return 0; +} +EXPORT_SYMBOL(reparent_resources); + static void __init __reserve_region_with_split(struct resource *root, resource_size_t start, resource_size_t end, const char *name) -- 2.13.6 ^ permalink raw reply related [flat|nested] 83+ messages in thread
* Re: [PATCH v7 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
2018-07-18 2:49 ` Baoquan He
(?)
@ 2018-07-18 16:36 ` Andy Shevchenko
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Shevchenko @ 2018-07-18 16:36 UTC (permalink / raw)
To: Baoquan He
Cc: Nicolas Pitre, brijesh.singh, devicetree, David Airlie, linux-pci,
richard.weiyang, Keith Busch, Max Filippov, Paul Mackerras,
baiyaowei, Frank Rowand, Dan Williams, Lorenzo Pieralisi,
Stephen Hemminger, linux-nvdimm, Michael Ellerman,
Patrik Jakobsson, linux-input, Gustavo Padovan, Borislav Petkov,
Dave Young, Vivek Goyal, Tom Lendacky, Haiyang Zhang, Maarten

On Wed, Jul 18, 2018 at 5:49 AM, Baoquan He <bhe@redhat.com> wrote:
> reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
> and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
> so that it's shared.

Some minor stuff.

> +/**
> + * reparent_resources - reparent resource children of parent that res covers
> + * @parent: parent resource descriptor
> + * @res: resource descriptor desired by caller
> + *
> + * Returns 0 on success, -ENOTSUPP if child resource is not completely
> + * contained by 'res', -ECANCELED if no any conflicting entry found.

'res' -> @res

> + *
> + * Reparent resource children of 'parent' that conflict with 'res'

Ditto + 'parent' -> @parent

> + * under 'res', and make 'res' replace those children.

Ditto.

> + */
> +int reparent_resources(struct resource *parent, struct resource *res)
> +{
> +	struct resource *p, **pp;
> +	struct resource **firstpp = NULL;
> +
> +	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
> +		if (p->end < res->start)
> +			continue;
> +		if (res->end < p->start)
> +			break;
> +		if (p->start < res->start || p->end > res->end)
> +			return -ENOTSUPP; /* not completely contained */
> +		if (firstpp == NULL)
> +			firstpp = pp;
> +	}
> +	if (firstpp == NULL)
> +		return -ECANCELED; /* didn't find any conflicting entries? */
> +	res->parent = parent;
> +	res->child = *firstpp;
> +	res->sibling = *pp;
> +	*firstpp = res;
> +	*pp = NULL;
> +	for (p = res->child; p != NULL; p = p->sibling) {
> +		p->parent = res;
> +		pr_debug("PCI: Reparented %s %pR under %s\n",
> +			 p->name, p, res->name);

Now, PCI is a bit confusing here.

> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(reparent_resources);
> +
>  static void __init __reserve_region_with_split(struct resource *root,
>  		resource_size_t start, resource_size_t end,
>  		const char *name)
> --
> 2.13.6
>

--
With Best Regards,
Andy Shevchenko

^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
2018-07-18 16:36 ` Andy Shevchenko
(?)
(?)
@ 2018-07-18 16:37 ` Andy Shevchenko
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Shevchenko @ 2018-07-18 16:37 UTC (permalink / raw)
To: Baoquan He
Cc: Nicolas Pitre, brijesh.singh, devicetree, David Airlie, linux-pci,
richard.weiyang, Max Filippov, Paul Mackerras, baiyaowei,
KY Srinivasan, Frank Rowand, Lorenzo Pieralisi, Stephen Hemminger,
linux-nvdimm, Michael Ellerman, Patrik Jakobsson, linux-input,
Gustavo Padovan, Borislav Petkov, Dave Young, Tom Lendacky,
Haiyang Zhang, Maarten Lankhorst, Josh Triplett

On Wed, Jul 18, 2018 at 7:36 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Wed, Jul 18, 2018 at 5:49 AM, Baoquan He <bhe@redhat.com> wrote:
>> reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
>> and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
>> so that it's shared.

>> + * Returns 0 on success, -ENOTSUPP if child resource is not completely
>> + * contained by 'res', -ECANCELED if no any conflicting entry found.

You also can refer to constants by prefixing them with %, e.g. %-ENOTSUPP.
But this is up to you completely.

--
With Best Regards,
Andy Shevchenko

^ permalink raw reply [flat|nested] 83+ messages in thread
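As an illustration of the notation suggested in the two review mails, the kernel-doc comment could be written roughly as follows, with `@`-prefixed parameter references and `%`-prefixed constants. This is only a sketch of how the feedback might be applied; the wording actually used in a later revision of the patch is not shown in this part of the thread and may differ. The forward declaration of `struct resource` is added here solely so the fragment stands on its own.

```c
struct resource;	/* forward declaration, only to keep the fragment standalone */

/**
 * reparent_resources - reparent resource children of parent that res covers
 * @parent: parent resource descriptor
 * @res: resource descriptor desired by caller
 *
 * Returns 0 on success, %-ENOTSUPP if a child resource is not completely
 * contained by @res, %-ECANCELED if no conflicting entry is found.
 *
 * Reparent resource children of @parent that conflict with @res
 * under @res, and make @res replace those children.
 */
int reparent_resources(struct resource *parent, struct resource *res);
```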
* Re: [PATCH v7 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
2018-07-18 16:37 ` Andy Shevchenko
(?)
(?)
@ 2018-07-19 15:18 ` Baoquan He
-1 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-19 15:18 UTC (permalink / raw)
To: Andy Shevchenko
Cc: Nicolas Pitre, brijesh.singh, devicetree, David Airlie, linux-pci,
richard.weiyang, Max Filippov, Paul Mackerras, baiyaowei,
KY Srinivasan, Frank Rowand, Lorenzo Pieralisi, Stephen Hemminger,
linux-nvdimm, Michael Ellerman, Patrik Jakobsson, linux-input,
Gustavo Padovan, Borislav Petkov, Dave Young, Tom Lendacky,
Haiyang Zhang, Maarten Lankhorst, Josh Triplett

On 07/18/18 at 07:37pm, Andy Shevchenko wrote:
> On Wed, Jul 18, 2018 at 7:36 PM, Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > On Wed, Jul 18, 2018 at 5:49 AM, Baoquan He <bhe@redhat.com> wrote:
> >> reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
> >> and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
> >> so that it's shared.
>
> >> + * Returns 0 on success, -ENOTSUPP if child resource is not completely
> >> + * contained by 'res', -ECANCELED if no any conflicting entry found.
>
> You also can refer to constants by prefixing them with %, e.g. %-ENOTSUPP.
> But this is up to you completely.

Thanks, will fix when repost.

^ permalink raw reply [flat|nested] 83+ messages in thread
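The patch that follows replaces the `sibling` and `child` pointers of struct resource with `struct list_head`. As a rough userspace illustration of what that buys (constant-time access to the last child and backward traversal) and what it costs (two extra pointers per resource), consider the sketch below. Everything in it is an assumption made for illustration: `struct list_node` is a hand-rolled stand-in for the kernel's `list_head`, `struct res_old` and `struct res_new` are reduced versions of `struct resource`, and `first_child()` and `next_sibling()` only approximate what helpers such as `resource_first_child()` and `resource_sibling()` in the diff presumably do, since their definitions are not shown in this part of the thread.

```c
#include <stddef.h>  /* offsetof, NULL */

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n at the tail of the circular list headed by head. */
static void list_node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Old layout: siblings chained through a single forward pointer. */
struct res_old {
	unsigned long start, end;
	struct res_old *parent, *sibling, *child;
};

/* New layout: 'child' heads a list, 'sibling' links into the parent's list. */
struct res_new {
	unsigned long start, end;
	struct res_new *parent;
	struct list_node sibling, child;
};

/* Recover the enclosing res_new from a pointer to its 'sibling' member. */
#define res_entry(ptr) \
	((struct res_new *)((char *)(ptr) - offsetof(struct res_new, sibling)))

static void res_init(struct res_new *r, unsigned long start, unsigned long end)
{
	r->start = start;
	r->end = end;
	r->parent = NULL;
	list_node_init(&r->sibling);
	list_node_init(&r->child);
}

static void res_add_child(struct res_new *parent, struct res_new *r)
{
	r->parent = parent;
	list_node_add_tail(&r->sibling, &parent->child);
}

/* First child or NULL (hypothetical analogue of resource_first_child()). */
static struct res_new *first_child(struct res_new *r)
{
	return r->child.next == &r->child ? NULL : res_entry(r->child.next);
}

/* Last child or NULL: one pointer hop here, a full walk with the old layout. */
static struct res_new *last_child(struct res_new *r)
{
	return r->child.prev == &r->child ? NULL : res_entry(r->child.prev);
}

/* Next sibling or NULL at the end (hypothetical analogue of resource_sibling()). */
static struct res_new *next_sibling(struct res_new *r)
{
	if (!r->parent || r->sibling.next == &r->parent->child)
		return NULL;
	return res_entry(r->sibling.next);
}

/* Previous sibling or NULL at the front: the step reverse iteration needs. */
static struct res_new *prev_sibling(struct res_new *r)
{
	if (!r->parent || r->sibling.prev == &r->parent->child)
		return NULL;
	return res_entry(r->sibling.prev);
}
```

In this sketch `sizeof(struct res_new)` exceeds `sizeof(struct res_old)` by exactly two pointers, which matches the size increase the commit message mentions. Walking from `last_child()` via `prev_sibling()` visits the children in descending order without first traversing the whole list.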
* [PATCH v7 2/4] resource: Use list_head to link sibling resource 2018-07-18 2:49 ` Baoquan He (?) (?) @ 2018-07-18 2:49 ` Baoquan He -1 siblings, 0 replies; 83+ messages in thread From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw) To: linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, dan.j.williams-ral2JQCrhuEAvxtiuMwx3w, nicolas.pitre-QSEj5FYQhm4dnm+yROfE0A, josh-iaAMLnmF4UmaiuxdJuQwMA, fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, bp-l3A5Bk7waGM, andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w Cc: linux-mips-6z/3iImG2C8G8FEW9MqTrA, brijesh.singh-5C7GfCeVMHo, devicetree-u79uwXL29TY76Z2rM5mHXA, airlied-cv59FeDIM0c, linux-pci-u79uwXL29TY76Z2rM5mHXA, richard.weiyang-Re5JQEeQqe8AvxtiuMwx3w, jcmvbkbc-Re5JQEeQqe8AvxtiuMwx3w, Paul Mackerras, baiyaowei-0p4V/sDNsUmm0O/7XYngnFaTQe2KTcn/, kys-0li6OtcxBFHby3iVrkZq2A, frowand.list-Re5JQEeQqe8AvxtiuMwx3w, lorenzo.pieralisi-5wv7dgnIgG8, sthemmin-0li6OtcxBFHby3iVrkZq2A, Baoquan He, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Michael Ellerman, patrik.r.jakobsson-Re5JQEeQqe8AvxtiuMwx3w, linux-input-u79uwXL29TY76Z2rM5mHXA, gustavo-THi1TnShQwVAfugRpC6u6w, dyoung-H+wXaHxf7aLQT0dZR+AlfA, thomas.lendacky-5C7GfCeVMHo, haiyangz-0li6OtcxBFHby3iVrkZq2A, maarten.lankhorst-VuQAYsv1563Yd54FQh9/CA, jglisse-H+wXaHxf7aLQT0dZR+AlfA, seanpaul-F7+t8E8rja9g9hUCZPvPmw, bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, tglx-hfZtesqFncYOwBW4kG4KsQ, yinghai-DgEjT+Ai2ygdnm+yROfE0A, jonathan.derrick-ral2JQCrhuEAvxtiuMwx3w, chris-YvXeqwSYzG2sTnJN9+BGXg, monstr-pSz03upnqPeHXe+LvDLADg, linux-parisc-u79uwXL29TY76Z2rM5mHXA, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, Benjamin The struct resource uses singly linked list to link siblings, implemented by pointer operation. Replace it with list_head for better code readability. Based on this list_head replacement, it will be very easy to do reverse iteration on iomem_resource's sibling list in later patch. 
Besides, type of member variables of struct resource, sibling and child, are changed from 'struct resource *' to 'struct list_head'. This brings two pointers of size increase. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Cc: David Airlie <airlied@linux.ie> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Jonathan Derrick <jonathan.derrick@intel.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: devel@linuxdriverproject.org Cc: linux-input@vger.kernel.org Cc: linux-nvdimm@lists.01.org Cc: devicetree@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Michal Simek <monstr@monstr.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linux-mips@linux-mips.org --- arch/arm/plat-samsung/pm-check.c | 6 +- arch/ia64/sn/kernel/io_init.c | 2 +- arch/microblaze/pci/pci-common.c | 4 +- arch/mips/pci/pci-rc32434.c | 12 +- arch/powerpc/kernel/pci-common.c | 4 +- arch/sparc/kernel/ioport.c | 2 +- arch/xtensa/include/asm/pci-bridge.h | 4 +- drivers/eisa/eisa-bus.c | 2 + drivers/gpu/drm/drm_memory.c | 3 +- drivers/gpu/drm/gma500/gtt.c | 5 +- drivers/hv/vmbus_drv.c | 52 +++---- 
 drivers/input/joystick/iforce/iforce-main.c |   4 +-
 drivers/nvdimm/namespace_devs.c             |   6 +-
 drivers/nvdimm/nd.h                         |   5 +-
 drivers/of/address.c                        |   4 +-
 drivers/parisc/lba_pci.c                    |   4 +-
 drivers/pci/controller/vmd.c                |   8 +-
 drivers/pci/probe.c                         |   2 +
 drivers/pci/setup-bus.c                     |   2 +-
 include/linux/ioport.h                      |  17 ++-
 kernel/resource.c                           | 206 ++++++++++++++--------------
 21 files changed, 183 insertions(+), 171 deletions(-)

diff --git a/arch/arm/plat-samsung/pm-check.c b/arch/arm/plat-samsung/pm-check.c
index cd2c02c68bc3..5494355b1c49 100644
--- a/arch/arm/plat-samsung/pm-check.c
+++ b/arch/arm/plat-samsung/pm-check.c
@@ -46,8 +46,8 @@ typedef u32 *(run_fn_t)(struct resource *ptr, u32 *arg);
 static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg)
 {
 	while (ptr != NULL) {
-		if (ptr->child != NULL)
-			s3c_pm_run_res(ptr->child, fn, arg);
+		if (!list_empty(&ptr->child))
+			s3c_pm_run_res(resource_first_child(&ptr->child), fn, arg);
 
 		if ((ptr->flags & IORESOURCE_SYSTEM_RAM)
 				== IORESOURCE_SYSTEM_RAM) {
@@ -57,7 +57,7 @@ static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg)
 			arg = (fn)(ptr, arg);
 		}
 
-		ptr = ptr->sibling;
+		ptr = resource_sibling(ptr);
 	}
 }
 
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index d63809a6adfa..338a7b7f194d 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -192,7 +192,7 @@ sn_io_slot_fixup(struct pci_dev *dev)
 		 * if it's already in the device structure, remove it before
 		 * inserting
 		 */
-		if (res->parent && res->parent->child)
+		if (res->parent && !list_empty(&res->parent->child))
 			release_resource(res);
 
 		if (res->flags & IORESOURCE_IO)
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 7899bafab064..2bf73e27e231 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -533,7 +533,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
 			res->flags = range.flags;
 			res->start = range.cpu_addr;
 			res->end = range.cpu_addr + range.size - 1;
-			res->parent = res->child = res->sibling = NULL;
+			res->parent = NULL;
+			INIT_LIST_HEAD(&res->child);
+			INIT_LIST_HEAD(&res->sibling);
 		}
 	}
 
diff --git a/arch/mips/pci/pci-rc32434.c b/arch/mips/pci/pci-rc32434.c
index 7f6ce6d734c0..e80283df7925 100644
--- a/arch/mips/pci/pci-rc32434.c
+++ b/arch/mips/pci/pci-rc32434.c
@@ -53,8 +53,8 @@ static struct resource rc32434_res_pci_mem1 = {
 	.start = 0x50000000,
 	.end = 0x5FFFFFFF,
 	.flags = IORESOURCE_MEM,
-	.sibling = NULL,
-	.child = &rc32434_res_pci_mem2
+	.sibling = LIST_HEAD_INIT(rc32434_res_pci_mem1.sibling),
+	.child = LIST_HEAD_INIT(rc32434_res_pci_mem1.child),
 };
 
 static struct resource rc32434_res_pci_mem2 = {
@@ -63,8 +63,8 @@ static struct resource rc32434_res_pci_mem2 = {
 	.end = 0x6FFFFFFF,
 	.flags = IORESOURCE_MEM,
 	.parent = &rc32434_res_pci_mem1,
-	.sibling = NULL,
-	.child = NULL
+	.sibling = LIST_HEAD_INIT(rc32434_res_pci_mem2.sibling),
+	.child = LIST_HEAD_INIT(rc32434_res_pci_mem2.child),
 };
 
 static struct resource rc32434_res_pci_io1 = {
@@ -72,6 +72,8 @@ static struct resource rc32434_res_pci_io1 = {
 	.start = 0x18800000,
 	.end = 0x188FFFFF,
 	.flags = IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(rc32434_res_pci_io1.sibling),
+	.child = LIST_HEAD_INIT(rc32434_res_pci_io1.child),
 };
 
 extern struct pci_ops rc32434_pci_ops;
@@ -208,6 +210,8 @@ static int __init rc32434_pci_init(void)
 
 	pr_info("PCI: Initializing PCI\n");
 
+	list_add(&rc32434_res_pci_mem2.sibling, &rc32434_res_pci_mem1.child);
+
 	ioport_resource.start = rc32434_res_pci_io1.start;
 	ioport_resource.end = rc32434_res_pci_io1.end;
 
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 926035bb378d..28fbe83c9daf 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -761,7 +761,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
 			res->flags = range.flags;
 			res->start = range.cpu_addr;
 			res->end = range.cpu_addr + range.size - 1;
-			res->parent = res->child = res->sibling = NULL;
+			res->parent = NULL;
+			INIT_LIST_HEAD(&res->child);
+			INIT_LIST_HEAD(&res->sibling);
 		}
 	}
 }
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index cca9134cfa7d..99efe4e98b16 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -669,7 +669,7 @@ static int sparc_io_proc_show(struct seq_file *m, void *v)
 	struct resource *root = m->private, *r;
 	const char *nm;
 
-	for (r = root->child; r != NULL; r = r->sibling) {
+	list_for_each_entry(r, &root->child, sibling) {
 		if ((nm = r->name) == NULL) nm = "???";
 		seq_printf(m, "%016llx-%016llx: %s\n",
 				(unsigned long long)r->start,
diff --git a/arch/xtensa/include/asm/pci-bridge.h b/arch/xtensa/include/asm/pci-bridge.h
index 0b68c76ec1e6..f487b06817df 100644
--- a/arch/xtensa/include/asm/pci-bridge.h
+++ b/arch/xtensa/include/asm/pci-bridge.h
@@ -71,8 +71,8 @@ static inline void pcibios_init_resource(struct resource *res,
 	res->flags = flags;
 	res->name = name;
 	res->parent = NULL;
-	res->sibling = NULL;
-	res->child = NULL;
+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 }
 
 
diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c
index 1e8062f6dbfc..dba78f75fd06 100644
--- a/drivers/eisa/eisa-bus.c
+++ b/drivers/eisa/eisa-bus.c
@@ -408,6 +408,8 @@ static struct resource eisa_root_res = {
 	.start = 0,
 	.end = 0xffffffff,
 	.flags = IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(eisa_root_res.sibling),
+	.child = LIST_HEAD_INIT(eisa_root_res.child),
 };
 
 static int eisa_bus_count;
diff --git a/drivers/gpu/drm/drm_memory.c b/drivers/gpu/drm/drm_memory.c
index d69e4fc1ee77..33baa7fa5e41 100644
--- a/drivers/gpu/drm/drm_memory.c
+++ b/drivers/gpu/drm/drm_memory.c
@@ -155,9 +155,8 @@ u64 drm_get_max_iomem(void)
 	struct resource *tmp;
 	resource_size_t max_iomem = 0;
 
-	for (tmp = iomem_resource.child; tmp; tmp = tmp->sibling) {
+	list_for_each_entry(tmp, &iomem_resource.child, sibling)
 		max_iomem = max(max_iomem, tmp->end);
-	}
 
 	return max_iomem;
 }
diff --git a/drivers/gpu/drm/gma500/gtt.c b/drivers/gpu/drm/gma500/gtt.c
index 3949b0990916..addd3bc009af 100644
--- a/drivers/gpu/drm/gma500/gtt.c
+++ b/drivers/gpu/drm/gma500/gtt.c
@@ -565,7 +565,7 @@ int psb_gtt_init(struct drm_device *dev, int resume)
 int psb_gtt_restore(struct drm_device *dev)
 {
 	struct drm_psb_private *dev_priv = dev->dev_private;
-	struct resource *r = dev_priv->gtt_mem->child;
+	struct resource *r;
 	struct gtt_range *range;
 	unsigned int restored = 0, total = 0, size = 0;
 
@@ -573,14 +573,13 @@ int psb_gtt_restore(struct drm_device *dev)
 	mutex_lock(&dev_priv->gtt_mutex);
 	psb_gtt_init(dev, 1);
 
-	while (r != NULL) {
+	list_for_each_entry(r, &dev_priv->gtt_mem->child, sibling) {
 		range = container_of(r, struct gtt_range, resource);
 		if (range->pages) {
 			psb_gtt_insert(dev, range, 1);
 			size += range->resource.end - range->resource.start;
 			restored++;
 		}
-		r = r->sibling;
 		total++;
 	}
 	mutex_unlock(&dev_priv->gtt_mutex);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b10fe26c4891..d87ec5a1bc4c 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1412,9 +1412,8 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx)
 {
 	resource_size_t start = 0;
 	resource_size_t end = 0;
-	struct resource *new_res;
+	struct resource *new_res, *tmp;
 	struct resource **old_res = &hyperv_mmio;
-	struct resource **prev_res = NULL;
 
 	switch (res->type) {
 
@@ -1461,44 +1460,36 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx)
 	/*
 	 * If two ranges are adjacent, merge them.
 	 */
-	do {
-		if (!*old_res) {
-			*old_res = new_res;
-			break;
-		}
-
-		if (((*old_res)->end + 1) == new_res->start) {
-			(*old_res)->end = new_res->end;
+	if (!*old_res) {
+		*old_res = new_res;
+		return AE_OK;
+	}
+	tmp = *old_res;
+	list_for_each_entry_from(tmp, &tmp->parent->child, sibling) {
+		if ((tmp->end + 1) == new_res->start) {
+			tmp->end = new_res->end;
 			kfree(new_res);
 			break;
 		}
 
-		if ((*old_res)->start == new_res->end + 1) {
-			(*old_res)->start = new_res->start;
+		if (tmp->start == new_res->end + 1) {
+			tmp->start = new_res->start;
 			kfree(new_res);
 			break;
 		}
 
-		if ((*old_res)->start > new_res->end) {
-			new_res->sibling = *old_res;
-			if (prev_res)
-				(*prev_res)->sibling = new_res;
-			*old_res = new_res;
+		if (tmp->start > new_res->end) {
+			list_add(&new_res->sibling, tmp->sibling.prev);
 			break;
 		}
-
-		prev_res = old_res;
-		old_res = &(*old_res)->sibling;
-
-	} while (1);
+	}
 
 	return AE_OK;
 }
 
 static int vmbus_acpi_remove(struct acpi_device *device)
 {
-	struct resource *cur_res;
-	struct resource *next_res;
+	struct resource *res;
 
 	if (hyperv_mmio) {
 		if (fb_mmio) {
@@ -1507,10 +1498,9 @@ static int vmbus_acpi_remove(struct acpi_device *device)
 			fb_mmio = NULL;
 		}
 
-		for (cur_res = hyperv_mmio; cur_res; cur_res = next_res) {
-			next_res = cur_res->sibling;
-			kfree(cur_res);
-		}
+		res = hyperv_mmio;
+		list_for_each_entry_from(res, &res->parent->child, sibling)
+			kfree(res);
 	}
 
 	return 0;
@@ -1596,7 +1586,8 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
 		}
 	}
 
-	for (iter = hyperv_mmio; iter; iter = iter->sibling) {
+	iter = hyperv_mmio;
+	list_for_each_entry_from(iter, &iter->parent->child, sibling) {
 		if ((iter->start >= max) || (iter->end <= min))
 			continue;
 
@@ -1639,7 +1630,8 @@ void vmbus_free_mmio(resource_size_t start, resource_size_t size)
 	struct resource *iter;
 
 	down(&hyperv_mmio_lock);
-	for (iter = hyperv_mmio; iter; iter = iter->sibling) {
+	iter = hyperv_mmio;
+	list_for_each_entry_from(iter, &iter->parent->child, sibling) {
 		if ((iter->start >= start + size) || (iter->end <= start))
 			continue;
 
diff --git a/drivers/input/joystick/iforce/iforce-main.c b/drivers/input/joystick/iforce/iforce-main.c
index daeeb4c7e3b0..5c0be27b33ff 100644
--- a/drivers/input/joystick/iforce/iforce-main.c
+++ b/drivers/input/joystick/iforce/iforce-main.c
@@ -305,8 +305,8 @@ int iforce_init_device(struct iforce *iforce)
 	iforce->device_memory.end = 200;
 	iforce->device_memory.flags = IORESOURCE_MEM;
 	iforce->device_memory.parent = NULL;
-	iforce->device_memory.child = NULL;
-	iforce->device_memory.sibling = NULL;
+	INIT_LIST_HEAD(&iforce->device_memory.child);
+	INIT_LIST_HEAD(&iforce->device_memory.sibling);
 
 /*
  * Wait until device ready - until it sends its first response.
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 28afdd668905..f53d410d9981 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -637,7 +637,7 @@ static resource_size_t scan_allocate(struct nd_region *nd_region,
  retry:
 	first = 0;
 	for_each_dpa_resource(ndd, res) {
-		struct resource *next = res->sibling, *new_res = NULL;
+		struct resource *next = resource_sibling(res), *new_res = NULL;
 		resource_size_t allocate, available = 0;
 		enum alloc_loc loc = ALLOC_ERR;
 		const char *action;
@@ -763,7 +763,7 @@ static resource_size_t scan_allocate(struct nd_region *nd_region,
 	 * an initial "pmem-reserve pass". Only do an initial BLK allocation
 	 * when none of the DPA space is reserved.
 	 */
-	if ((is_pmem || !ndd->dpa.child) && n == to_allocate)
+	if ((is_pmem || list_empty(&ndd->dpa.child)) && n == to_allocate)
 		return init_dpa_allocation(label_id, nd_region, nd_mapping, n);
 	return n;
 }
@@ -779,7 +779,7 @@ static int merge_dpa(struct nd_region *nd_region,
  retry:
 	for_each_dpa_resource(ndd, res) {
 		int rc;
-		struct resource *next = res->sibling;
+		struct resource *next = resource_sibling(res);
 		resource_size_t end = res->start + resource_size(res);
 
 		if (!next || strcmp(res->name, label_id->id) != 0
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 32e0364b48b9..da7da15e03e7 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -102,11 +102,10 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
 		(unsigned long long) (res ? res->start : 0), ##arg)
 
 #define for_each_dpa_resource(ndd, res) \
-	for (res = (ndd)->dpa.child; res; res = res->sibling)
+	list_for_each_entry(res, &(ndd)->dpa.child, sibling)
 
 #define for_each_dpa_resource_safe(ndd, res, next) \
-	for (res = (ndd)->dpa.child, next = res ? res->sibling : NULL; \
-			res; res = next, next = next ? next->sibling : NULL)
+	list_for_each_entry_safe(res, next, &(ndd)->dpa.child, sibling)
 
 struct nd_percpu_lane {
 	int count;
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 53349912ac75..e2e25719ab52 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -330,7 +330,9 @@ int of_pci_range_to_resource(struct of_pci_range *range,
 {
 	int err;
 	res->flags = range->flags;
-	res->parent = res->child = res->sibling = NULL;
+	res->parent = NULL;
+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 	res->name = np->full_name;
 
 	if (res->flags & IORESOURCE_IO) {
diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c
index 69bd98421eb1..7482bdfd1959 100644
--- a/drivers/parisc/lba_pci.c
+++ b/drivers/parisc/lba_pci.c
@@ -170,8 +170,8 @@ lba_dump_res(struct resource *r, int d)
 	for (i = d; i ; --i) printk(" ");
 	printk(KERN_DEBUG "%p [%lx,%lx]/%lx\n", r,
 		(long)r->start, (long)r->end, r->flags);
-	lba_dump_res(r->child, d+2);
-	lba_dump_res(r->sibling, d);
+	lba_dump_res(resource_first_child(&r->child), d+2);
+	lba_dump_res(resource_sibling(r), d);
 }
 
 
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 942b64fc7f1f..e3ace20345c7 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -542,14 +542,14 @@ static struct pci_ops vmd_ops = {
 
 static void vmd_attach_resources(struct vmd_dev *vmd)
 {
-	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
-	vmd->dev->resource[VMD_MEMBAR2].child = &vmd->resources[2];
+	list_add(&vmd->resources[1].sibling, &vmd->dev->resource[VMD_MEMBAR1].child);
+	list_add(&vmd->resources[2].sibling, &vmd->dev->resource[VMD_MEMBAR2].child);
 }
 
 static void vmd_detach_resources(struct vmd_dev *vmd)
 {
-	vmd->dev->resource[VMD_MEMBAR1].child = NULL;
-	vmd->dev->resource[VMD_MEMBAR2].child = NULL;
+	INIT_LIST_HEAD(&vmd->dev->resource[VMD_MEMBAR1].child);
+	INIT_LIST_HEAD(&vmd->dev->resource[VMD_MEMBAR2].child);
 }
 
 /*
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac876e32de4b..9624dd1dfd49 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -59,6 +59,8 @@ static struct resource *get_pci_domain_busn_res(int domain_nr)
 	r->res.start = 0;
 	r->res.end = 0xff;
 	r->res.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED;
+	INIT_LIST_HEAD(&r->res.child);
+	INIT_LIST_HEAD(&r->res.sibling);
 
 	list_add_tail(&r->list, &pci_domain_busn_res_list);
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 79b1824e83b4..8e685af8938d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2107,7 +2107,7 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
 				continue;
 
 			/* Ignore BARs which are still in use */
-			if (res->child)
+			if (!list_empty(&res->child))
 				continue;
 
 			ret = add_to_list(&saved, bridge, res, 0, 0);
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index dfdcd0bfe54e..b7456ae889dd 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -12,6 +12,7 @@
 #ifndef __ASSEMBLY__
 #include <linux/compiler.h>
 #include <linux/types.h>
+#include <linux/list.h>
 /*
  * Resources are tree-like, allowing
  * nesting etc..
@@ -22,7 +23,8 @@ struct resource {
 	const char *name;
 	unsigned long flags;
 	unsigned long desc;
-	struct resource *parent, *sibling, *child;
+	struct list_head child, sibling;
+	struct resource *parent;
 };
 
 /*
@@ -216,7 +218,6 @@ static inline bool resource_contains(struct resource *r1, struct resource *r2)
 	return r1->start <= r2->start && r1->end >= r2->end;
 }
 
-
 /* Convenience shorthand with allocation */
 #define request_region(start,n,name)	__request_region(&ioport_resource, (start), (n), (name), 0)
 #define request_muxed_region(start,n,name)	__request_region(&ioport_resource, (start), (n), (name), IORESOURCE_MUXED)
@@ -287,6 +288,18 @@ static inline bool resource_overlaps(struct resource *r1, struct resource *r2)
 	return (r1->start <= r2->end && r1->end >= r2->start);
 }
 
+static inline struct resource *resource_sibling(struct resource *res)
+{
+	if (res->parent && !list_is_last(&res->sibling, &res->parent->child))
+		return list_next_entry(res, sibling);
+	return NULL;
+}
+
+static inline struct resource *resource_first_child(struct list_head *head)
+{
+	return list_first_entry_or_null(head, struct resource, sibling);
+}
+
 
 #endif /* __ASSEMBLY__ */
 #endif /* _LINUX_IOPORT_H */
diff --git a/kernel/resource.c b/kernel/resource.c
index 81ccd19c1d9f..c96e58d3d2f8 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -31,6 +31,8 @@ struct resource ioport_resource = {
 	.start = 0,
 	.end = IO_SPACE_LIMIT,
 	.flags = IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(ioport_resource.sibling),
+	.child = LIST_HEAD_INIT(ioport_resource.child),
 };
 EXPORT_SYMBOL(ioport_resource);
 
@@ -39,6 +41,8 @@ struct resource iomem_resource = {
 	.start = 0,
 	.end = -1,
 	.flags = IORESOURCE_MEM,
+	.sibling = LIST_HEAD_INIT(iomem_resource.sibling),
+	.child = LIST_HEAD_INIT(iomem_resource.child),
 };
 EXPORT_SYMBOL(iomem_resource);
 
@@ -57,20 +61,20 @@ static DEFINE_RWLOCK(resource_lock);
  * by boot mem after the system is up. So for reusing the resource entry
  * we need to remember the resource.
  */
-static struct resource *bootmem_resource_free;
+static struct list_head bootmem_resource_free = LIST_HEAD_INIT(bootmem_resource_free);
 static DEFINE_SPINLOCK(bootmem_resource_lock);
 
 static struct resource *next_resource(struct resource *p, bool sibling_only)
 {
 	/* Caller wants to traverse through siblings only */
 	if (sibling_only)
-		return p->sibling;
+		return resource_sibling(p);
 
-	if (p->child)
-		return p->child;
-	while (!p->sibling && p->parent)
+	if (!list_empty(&p->child))
+		return resource_first_child(&p->child);
+	while (!resource_sibling(p) && p->parent)
 		p = p->parent;
-	return p->sibling;
+	return resource_sibling(p);
 }
 
 static void *r_next(struct seq_file *m, void *v, loff_t *pos)
@@ -90,7 +94,7 @@ static void *r_start(struct seq_file *m, loff_t *pos)
 	struct resource *p = PDE_DATA(file_inode(m->file));
 	loff_t l = 0;
 	read_lock(&resource_lock);
-	for (p = p->child; p && l < *pos; p = r_next(m, p, &l))
+	for (p = resource_first_child(&p->child); p && l < *pos; p = r_next(m, p, &l))
 		;
 	return p;
 }
@@ -153,8 +157,7 @@ static void free_resource(struct resource *res)
 
 	if (!PageSlab(virt_to_head_page(res))) {
 		spin_lock(&bootmem_resource_lock);
-		res->sibling = bootmem_resource_free;
-		bootmem_resource_free = res;
+		list_add(&res->sibling, &bootmem_resource_free);
 		spin_unlock(&bootmem_resource_lock);
 	} else {
 		kfree(res);
@@ -166,10 +169,9 @@ static struct resource *alloc_resource(gfp_t flags)
 	struct resource *res = NULL;
 
 	spin_lock(&bootmem_resource_lock);
-	if (bootmem_resource_free) {
-		res = bootmem_resource_free;
-		bootmem_resource_free = res->sibling;
-	}
+	res = resource_first_child(&bootmem_resource_free);
+	if (res)
+		list_del(&res->sibling);
 	spin_unlock(&bootmem_resource_lock);
 
 	if (res)
@@ -177,6 +179,8 @@ static struct resource *alloc_resource(gfp_t flags)
 	else
 		res = kzalloc(sizeof(struct resource), flags);
 
+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 	return res;
 }
 
@@ -185,7 +189,7 @@ static struct resource * __request_resource(struct resource *root, struct resour
 {
 	resource_size_t start = new->start;
 	resource_size_t end = new->end;
-	struct resource *tmp, **p;
+	struct resource *tmp;
 
 	if (end < start)
 		return root;
@@ -193,64 +197,62 @@ static struct resource * __request_resource(struct resource *root, struct resour
 		return root;
 	if (end > root->end)
 		return root;
-	p = &root->child;
-	for (;;) {
-		tmp = *p;
-		if (!tmp || tmp->start > end) {
-			new->sibling = tmp;
-			*p = new;
+
+	if (list_empty(&root->child)) {
+		list_add(&new->sibling, &root->child);
+		new->parent = root;
+		INIT_LIST_HEAD(&new->child);
+		return NULL;
+	}
+
+	list_for_each_entry(tmp, &root->child, sibling) {
+		if (tmp->start > end) {
+			list_add(&new->sibling, tmp->sibling.prev);
 			new->parent = root;
+			INIT_LIST_HEAD(&new->child);
 			return NULL;
 		}
-		p = &tmp->sibling;
 		if (tmp->end < start)
 			continue;
 		return tmp;
 	}
+
+	list_add_tail(&new->sibling, &root->child);
+	new->parent = root;
+	INIT_LIST_HEAD(&new->child);
+	return NULL;
 }
 
 static int __release_resource(struct resource *old, bool release_child)
 {
-	struct resource *tmp, **p, *chd;
+	struct resource *tmp, *next, *chd;
 
-	p = &old->parent->child;
-	for (;;) {
-		tmp = *p;
-		if (!tmp)
-			break;
+	list_for_each_entry_safe(tmp, next, &old->parent->child, sibling) {
 		if (tmp == old) {
-			if (release_child || !(tmp->child)) {
-				*p = tmp->sibling;
+			if (release_child || list_empty(&tmp->child)) {
+				list_del(&tmp->sibling);
 			} else {
-				for (chd = tmp->child;; chd = chd->sibling) {
+				list_for_each_entry(chd, &tmp->child, sibling)
 					chd->parent = tmp->parent;
-					if (!(chd->sibling))
-						break;
-				}
-				*p = tmp->child;
-				chd->sibling = tmp->sibling;
+				list_splice(&tmp->child, tmp->sibling.prev);
+				list_del(&tmp->sibling);
 			}
+
 			old->parent = NULL;
 			return 0;
 		}
-		p = &tmp->sibling;
 	}
 	return -EINVAL;
 }
 
 static void __release_child_resources(struct resource *r)
 {
-	struct resource *tmp, *p;
+	struct resource *tmp, *next;
 	resource_size_t size;
 
-	p = r->child;
-	r->child = NULL;
-	while (p) {
-		tmp = p;
-		p = p->sibling;
-
+	list_for_each_entry_safe(tmp, next, &r->child, sibling) {
 		tmp->parent = NULL;
-		tmp->sibling = NULL;
+		list_del_init(&tmp->sibling);
 		__release_child_resources(tmp);
 
 		printk(KERN_DEBUG "release child resource %pR\n", tmp);
@@ -259,6 +261,8 @@ static void __release_child_resources(struct resource *r)
 		tmp->start = 0;
 		tmp->end = size - 1;
 	}
+
+	INIT_LIST_HEAD(&tmp->child);
 }
 
 void release_child_resources(struct resource *r)
@@ -343,7 +347,8 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 
 	read_lock(&resource_lock);
 
-	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
+	for (p = resource_first_child(&iomem_resource.child); p;
+			p = next_resource(p, sibling_only)) {
 		if ((p->flags & res->flags) != res->flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
@@ -532,7 +537,7 @@ int region_intersects(resource_size_t start, size_t size, unsigned long flags,
 	struct resource *p;
 
 	read_lock(&resource_lock);
-	for (p = iomem_resource.child; p ; p = p->sibling) {
+	list_for_each_entry(p, &iomem_resource.child, sibling) {
 		bool is_type = (((p->flags & flags) == flags) &&
 				((desc == IORES_DESC_NONE) ||
 				 (desc == p->desc)));
@@ -586,7 +591,7 @@ static int __find_resource(struct resource *root, struct resource *old,
 			 resource_size_t size,
 			 struct resource_constraint *constraint)
 {
-	struct resource *this = root->child;
+	struct resource *this = resource_first_child(&root->child);
 	struct resource tmp = *new, avail, alloc;
 
 	tmp.start = root->start;
@@ -596,7 +601,7 @@ static int __find_resource(struct resource *root, struct resource *old,
 	 */
 	if (this && this->start == root->start) {
 		tmp.start = (this == old) ? old->start : this->end + 1;
-		this = this->sibling;
+		this = resource_sibling(this);
 	}
 	for(;;) {
 		if (this)
@@ -632,7 +637,7 @@ next: if (!this || this->end == root->end)
 
 		if (this != old)
 			tmp.start = this->end + 1;
-		this = this->sibling;
+		this = resource_sibling(this);
 	}
 	return -EBUSY;
 }
@@ -676,7 +681,7 @@ static int reallocate_resource(struct resource *root, struct resource *old,
 		goto out;
 	}
 
-	if (old->child) {
+	if (!list_empty(&old->child)) {
 		err = -EBUSY;
 		goto out;
 	}
@@ -757,7 +762,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start)
 	struct resource *res;
 
 	read_lock(&resource_lock);
-	for (res = root->child; res; res = res->sibling) {
+	list_for_each_entry(res, &root->child, sibling) {
 		if (res->start == start)
 			break;
 	}
@@ -790,32 +795,27 @@ static struct resource * __insert_resource(struct resource *parent, struct resou
 			break;
 	}
 
-	for (next = first; ; next = next->sibling) {
+	for (next = first; ; next = resource_sibling(next)) {
 		/* Partial overlap? Bad, and unfixable */
 		if (next->start < new->start || next->end > new->end)
 			return next;
-		if (!next->sibling)
+		if (!resource_sibling(next))
 			break;
-		if (next->sibling->start > new->end)
+		if (resource_sibling(next)->start > new->end)
 			break;
 	}
-
 	new->parent = parent;
-	new->sibling = next->sibling;
-	new->child = first;
+	list_add(&new->sibling, &next->sibling);
+	INIT_LIST_HEAD(&new->child);
 
-	next->sibling = NULL;
-	for (next = first; next; next = next->sibling)
+	/*
+	 * From first to next, they all fall into new's region, so change them
+	 * as new's children.
+	 */
+	list_cut_position(&new->child, first->sibling.prev, &next->sibling);
+	list_for_each_entry(next, &new->child, sibling)
 		next->parent = new;
 
-	if (parent->child == first) {
-		parent->child = new;
-	} else {
-		next = parent->child;
-		while (next->sibling != first)
-			next = next->sibling;
-		next->sibling = new;
-	}
 	return NULL;
 }
 
@@ -937,19 +937,17 @@ static int __adjust_resource(struct resource *res, resource_size_t start,
 	if ((start < parent->start) || (end > parent->end))
 		goto out;
 
-	if (res->sibling && (res->sibling->start <= end))
+	if (resource_sibling(res) && (resource_sibling(res)->start <= end))
 		goto out;
 
-	tmp = parent->child;
-	if (tmp != res) {
-		while (tmp->sibling != res)
-			tmp = tmp->sibling;
+	if (res->sibling.prev != &parent->child) {
+		tmp = list_prev_entry(res, sibling);
 		if (start <= tmp->end)
 			goto out;
 	}
 
 skip:
-	for (tmp = res->child; tmp; tmp = tmp->sibling)
+	list_for_each_entry(tmp, &res->child, sibling)
 		if ((tmp->start < start) || (tmp->end > end))
 			goto out;
 
@@ -996,27 +994,30 @@ EXPORT_SYMBOL(adjust_resource);
  */
 int reparent_resources(struct resource *parent, struct resource *res)
 {
-	struct resource *p, **pp;
-	struct resource **firstpp = NULL;
+	struct resource *p, *first = NULL;
 
-	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
+	list_for_each_entry(p, &parent->child, sibling) {
 		if (p->end < res->start)
 			continue;
 		if (res->end < p->start)
 			break;
 		if (p->start < res->start || p->end > res->end)
 			return -ENOTSUPP;	/* not completely contained */
-		if (firstpp == NULL)
-			firstpp = pp;
+		if (first == NULL)
+			first = p;
 	}
-	if (firstpp == NULL)
+	if (first == NULL)
 		return -ECANCELED; /* didn't find any conflicting entries? */
 	res->parent = parent;
-	res->child = *firstpp;
-	res->sibling = *pp;
-	*firstpp = res;
-	*pp = NULL;
-	for (p = res->child; p != NULL; p = p->sibling) {
+	list_add(&res->sibling, p->sibling.prev);
+	INIT_LIST_HEAD(&res->child);
+
+	/*
+	 * From first to p's previous sibling, they all fall into
+	 * res's region, change them as res's children.
+	 */
+	list_cut_position(&res->child, first->sibling.prev, res->sibling.prev);
+	list_for_each_entry(p, &res->child, sibling) {
 		p->parent = res;
 		pr_debug("PCI: Reparented %s %pR under %s\n",
 			 p->name, p, res->name);
@@ -1216,34 +1217,32 @@ EXPORT_SYMBOL(__request_region);
 void __release_region(struct resource *parent, resource_size_t start,
 			resource_size_t n)
 {
-	struct resource **p;
+	struct resource *res;
 	resource_size_t end;
 
-	p = &parent->child;
+	res = resource_first_child(&parent->child);
 	end = start + n - 1;
 
 	write_lock(&resource_lock);
 
 	for (;;) {
-		struct resource *res = *p;
-
 		if (!res)
 			break;
 		if (res->start <= start && res->end >= end) {
 			if (!(res->flags & IORESOURCE_BUSY)) {
-				p = &res->child;
+				res = resource_first_child(&res->child);
 				continue;
 			}
 			if (res->start != start || res->end != end)
 				break;
-			*p = res->sibling;
+			list_del(&res->sibling);
 			write_unlock(&resource_lock);
 			if (res->flags & IORESOURCE_MUXED)
 				wake_up(&muxed_resource_wait);
 			free_resource(res);
 			return;
 		}
-		p = &res->sibling;
+		res = resource_sibling(res);
 	}
 
 	write_unlock(&resource_lock);
@@ -1278,9 +1277,7 @@ EXPORT_SYMBOL(__release_region);
 int release_mem_region_adjustable(struct resource *parent,
 			resource_size_t start, resource_size_t size)
 {
-	struct resource **p;
-	struct resource *res;
-	struct resource *new_res;
+	struct resource *res, *new_res;
 	resource_size_t end;
 	int ret = -EINVAL;
 
@@ -1291,16 +1288,16 @@ int release_mem_region_adjustable(struct resource *parent,
 	/* The alloc_resource() result gets checked later */
 	new_res = alloc_resource(GFP_KERNEL);
 
-	p = &parent->child;
+	res = resource_first_child(&parent->child);
 	write_lock(&resource_lock);
 
-	while ((res = *p)) {
+	while ((res)) {
 		if (res->start >= end)
 			break;
 
 		/* look for the next resource if it does not fit into */
 		if (res->start > start || res->end < end) {
-			p = &res->sibling;
+			res = resource_sibling(res);
 			continue;
 		}
 
@@ -1308,14 +1305,14 @@ int release_mem_region_adjustable(struct resource *parent,
 			break;
 
 		if (!(res->flags & IORESOURCE_BUSY)) {
-			p = &res->child;
+			res = resource_first_child(&res->child);
 			continue;
 		}
 
 		/* found the target resource; let's adjust accordingly */
 		if (res->start == start && res->end == end) {
 			/* free the whole entry */
-			*p = res->sibling;
+			list_del(&res->sibling);
 			free_resource(res);
 			ret = 0;
 		} else if (res->start == start && res->end != end) {
@@ -1338,14 +1335,13 @@ int release_mem_region_adjustable(struct resource *parent,
 			new_res->flags = res->flags;
 			new_res->desc = res->desc;
 			new_res->parent = res->parent;
-			new_res->sibling = res->sibling;
-			new_res->child = NULL;
+			INIT_LIST_HEAD(&new_res->child);
 
 			ret = __adjust_resource(res, res->start,
 						start - res->start);
 			if (ret)
 				break;
-			res->sibling = new_res;
+			list_add(&new_res->sibling, &res->sibling);
 			new_res = NULL;
 		}
 
@@ -1526,7 +1522,7 @@ static int __init reserve_setup(char *str)
 			res->end = io_start + io_num - 1;
 			res->flags |= IORESOURCE_BUSY;
 			res->desc = IORES_DESC_NONE;
-			res->child = NULL;
+			INIT_LIST_HEAD(&res->child);
 			if (request_resource(parent, res) == 0)
 				reserved = x+1;
 		}
@@ -1546,7 +1542,7 @@ int iomem_map_sanity_check(resource_size_t addr, unsigned long size)
 	loff_t l;
 
 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
@@ -1602,7 +1598,7 @@ bool iomem_is_exclusive(u64 addr)
 	addr = addr & PAGE_MASK;
 
 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
-- 
2.13.6

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related [flat|nested] 83+ messages in thread
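
The two helpers this patch adds to include/linux/ioport.h keep the old NULL-terminated traversal idiom working on top of a circular list. Below is a minimal userspace sketch of that idea, not kernel code: the list primitives are simplified stand-ins for <linux/list.h>, and the names res, res_of, res_init, res_add_child, res_first_child, res_sibling, res_count_children, old_links and new_links are hypothetical, chosen only for illustration. It also shows the "two pointers" size increase mentioned in the commit message, assuming all pointers have the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Linkage fields only: old layout (3 pointers) vs. new layout (5 pointers). */
struct old_links {
	struct old_links *parent, *sibling, *child;
};

struct new_links {
	struct list_head child, sibling;
	struct new_links *parent;
};

/* Hypothetical, cut-down model of struct resource after the patch. */
struct res {
	unsigned long start, end;
	struct list_head child, sibling;
	struct res *parent;
};

/* Recover the struct res that embeds a given sibling link. */
#define res_of(ptr) \
	((struct res *)((char *)(ptr) - offsetof(struct res, sibling)))

static void res_init(struct res *r, unsigned long start, unsigned long end)
{
	r->start = start;
	r->end = end;
	r->parent = NULL;
	init_list_head(&r->child);
	init_list_head(&r->sibling);
}

static void res_add_child(struct res *parent, struct res *r)
{
	r->parent = parent;
	list_add_tail(&r->sibling, &parent->child);
}

/* Mirrors resource_first_child(): NULL when the child list is empty. */
static struct res *res_first_child(struct list_head *head)
{
	return head->next == head ? NULL : res_of(head->next);
}

/* Mirrors resource_sibling(): NULL for a parentless or last entry. */
static struct res *res_sibling(struct res *r)
{
	if (r->parent && r->sibling.next != &r->parent->child)
		return res_of(r->sibling.next);
	return NULL;
}

/* Walk the children with the old-style NULL-terminated loop shape. */
static int res_count_children(struct res *parent)
{
	int n = 0;
	struct res *r;

	for (r = res_first_child(&parent->child); r; r = res_sibling(r))
		n++;
	return n;
}
```

A caller would initialise every node with res_init() before linking it, which corresponds to the INIT_LIST_HEAD() calls the patch adds at each place a resource is set up by hand. The parent check in res_sibling() matters because a node that was never linked has no list head to compare its next pointer against.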
* [PATCH v7 2/4] resource: Use list_head to link sibling resource
@ 2018-07-18 2:49 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw)
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
dan.j.williams-ral2JQCrhuEAvxtiuMwx3w,
nicolas.pitre-QSEj5FYQhm4dnm+yROfE0A, josh-iaAMLnmF4UmaiuxdJuQwMA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, bp-l3A5Bk7waGM,
andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w
Cc: linux-mips-6z/3iImG2C8G8FEW9MqTrA, brijesh.singh-5C7GfCeVMHo,
devicetree-u79uwXL29TY76Z2rM5mHXA, airlied-cv59FeDIM0c,
linux-pci-u79uwXL29TY76Z2rM5mHXA,
richard.weiyang-Re5JQEeQqe8AvxtiuMwx3w,
jcmvbkbc-Re5JQEeQqe8AvxtiuMwx3w, Paul Mackerras,
baiyaowei-0p4V/sDNsUmm0O/7XYngnFaTQe2KTcn/,
kys-0li6OtcxBFHby3iVrkZq2A, frowand.list-Re5JQEeQqe8AvxtiuMwx3w,
lorenzo.pieralisi-5wv7dgnIgG8, sthemmin-0li6OtcxBFHby3iVrkZq2A,
Baoquan He, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Michael Ellerman,
patrik.r.jakobsson-Re5JQEeQqe8AvxtiuMwx3w,
linux-input-u79uwXL29TY76Z2rM5mHXA,
gustavo-THi1TnShQwVAfugRpC6u6w, dyoung-H+wXaHxf7aLQT0dZR+AlfA,
thomas.lendacky-5C7GfCeVMHo, haiyangz-0li6OtcxBFHby3iVrkZq2A,
maarten.lankhorst-VuQAYsv1563Yd54FQh9/CA,
jglisse-H+wXaHxf7aLQT0dZR+AlfA, seanpaul-F7+t8E8rja9g9hUCZPvPmw,
bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, tglx-hfZtesqFncYOwBW4kG4KsQ,
yinghai-DgEjT+Ai2ygdnm+yROfE0A,
jonathan.derrick-ral2JQCrhuEAvxtiuMwx3w,
chris-YvXeqwSYzG2sTnJN9+BGXg, monstr-pSz03upnqPeHXe+LvDLADg,
linux-parisc-u79uwXL29TY76Z2rM5mHXA,
gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w, Benjamin
cm9ibGF6ZS9wY2kvcGNpLWNvbW1vbi5jCmluZGV4IDc4OTliYWZhYjA2NC4uMmJmNzNlMjdlMjMx IDEwMDY0NAotLS0gYS9hcmNoL21pY3JvYmxhemUvcGNpL3BjaS1jb21tb24uYworKysgYi9hcmNo L21pY3JvYmxhemUvcGNpL3BjaS1jb21tb24uYwpAQCAtNTMzLDcgKzUzMyw5IEBAIHZvaWQgcGNp X3Byb2Nlc3NfYnJpZGdlX09GX3JhbmdlcyhzdHJ1Y3QgcGNpX2NvbnRyb2xsZXIgKmhvc2UsCiAJ CQlyZXMtPmZsYWdzID0gcmFuZ2UuZmxhZ3M7CiAJCQlyZXMtPnN0YXJ0ID0gcmFuZ2UuY3B1X2Fk ZHI7CiAJCQlyZXMtPmVuZCA9IHJhbmdlLmNwdV9hZGRyICsgcmFuZ2Uuc2l6ZSAtIDE7Ci0JCQly ZXMtPnBhcmVudCA9IHJlcy0+Y2hpbGQgPSByZXMtPnNpYmxpbmcgPSBOVUxMOworCQkJcmVzLT5w YXJlbnQgPSBOVUxMOworCQkJSU5JVF9MSVNUX0hFQUQoJnJlcy0+Y2hpbGQpOworCQkJSU5JVF9M SVNUX0hFQUQoJnJlcy0+c2libGluZyk7CiAJCX0KIAl9CiAKZGlmZiAtLWdpdCBhL2FyY2gvbWlw cy9wY2kvcGNpLXJjMzI0MzQuYyBiL2FyY2gvbWlwcy9wY2kvcGNpLXJjMzI0MzQuYwppbmRleCA3 ZjZjZTZkNzM0YzAuLmU4MDI4M2RmNzkyNSAxMDA2NDQKLS0tIGEvYXJjaC9taXBzL3BjaS9wY2kt cmMzMjQzNC5jCisrKyBiL2FyY2gvbWlwcy9wY2kvcGNpLXJjMzI0MzQuYwpAQCAtNTMsOCArNTMs OCBAQCBzdGF0aWMgc3RydWN0IHJlc291cmNlIHJjMzI0MzRfcmVzX3BjaV9tZW0xID0gewogCS5z dGFydCA9IDB4NTAwMDAwMDAsCiAJLmVuZCA9IDB4NUZGRkZGRkYsCiAJLmZsYWdzID0gSU9SRVNP VVJDRV9NRU0sCi0JLnNpYmxpbmcgPSBOVUxMLAotCS5jaGlsZCA9ICZyYzMyNDM0X3Jlc19wY2lf bWVtMgorCS5zaWJsaW5nID0gTElTVF9IRUFEX0lOSVQocmMzMjQzNF9yZXNfcGNpX21lbTEuc2li bGluZyksCisJLmNoaWxkID0gTElTVF9IRUFEX0lOSVQocmMzMjQzNF9yZXNfcGNpX21lbTEuY2hp bGQpLAogfTsKIAogc3RhdGljIHN0cnVjdCByZXNvdXJjZSByYzMyNDM0X3Jlc19wY2lfbWVtMiA9 IHsKQEAgLTYzLDggKzYzLDggQEAgc3RhdGljIHN0cnVjdCByZXNvdXJjZSByYzMyNDM0X3Jlc19w Y2lfbWVtMiA9IHsKIAkuZW5kID0gMHg2RkZGRkZGRiwKIAkuZmxhZ3MgPSBJT1JFU09VUkNFX01F TSwKIAkucGFyZW50ID0gJnJjMzI0MzRfcmVzX3BjaV9tZW0xLAotCS5zaWJsaW5nID0gTlVMTCwK LQkuY2hpbGQgPSBOVUxMCisJLnNpYmxpbmcgPSBMSVNUX0hFQURfSU5JVChyYzMyNDM0X3Jlc19w Y2lfbWVtMi5zaWJsaW5nKSwKKwkuY2hpbGQgPSBMSVNUX0hFQURfSU5JVChyYzMyNDM0X3Jlc19w Y2lfbWVtMi5jaGlsZCksCiB9OwogCiBzdGF0aWMgc3RydWN0IHJlc291cmNlIHJjMzI0MzRfcmVz X3BjaV9pbzEgPSB7CkBAIC03Miw2ICs3Miw4IEBAIHN0YXRpYyBzdHJ1Y3QgcmVzb3VyY2UgcmMz 
MjQzNF9yZXNfcGNpX2lvMSA9IHsKIAkuc3RhcnQgPSAweDE4ODAwMDAwLAogCS5lbmQgPSAweDE4 OEZGRkZGLAogCS5mbGFncyA9IElPUkVTT1VSQ0VfSU8sCisJLnNpYmxpbmcgPSBMSVNUX0hFQURf SU5JVChyYzMyNDM0X3Jlc19wY2lfaW8xLnNpYmxpbmcpLAorCS5jaGlsZCA9IExJU1RfSEVBRF9J TklUKHJjMzI0MzRfcmVzX3BjaV9pbzEuY2hpbGQpLAogfTsKIAogZXh0ZXJuIHN0cnVjdCBwY2lf b3BzIHJjMzI0MzRfcGNpX29wczsKQEAgLTIwOCw2ICsyMTAsOCBAQCBzdGF0aWMgaW50IF9faW5p dCByYzMyNDM0X3BjaV9pbml0KHZvaWQpCiAKIAlwcl9pbmZvKCJQQ0k6IEluaXRpYWxpemluZyBQ Q0lcbiIpOwogCisJbGlzdF9hZGQoJnJjMzI0MzRfcmVzX3BjaV9tZW0yLnNpYmxpbmcsICZyYzMy NDM0X3Jlc19wY2lfbWVtMS5jaGlsZCk7CisKIAlpb3BvcnRfcmVzb3VyY2Uuc3RhcnQgPSByYzMy NDM0X3Jlc19wY2lfaW8xLnN0YXJ0OwogCWlvcG9ydF9yZXNvdXJjZS5lbmQgPSByYzMyNDM0X3Jl c19wY2lfaW8xLmVuZDsKIApkaWZmIC0tZ2l0IGEvYXJjaC9wb3dlcnBjL2tlcm5lbC9wY2ktY29t bW9uLmMgYi9hcmNoL3Bvd2VycGMva2VybmVsL3BjaS1jb21tb24uYwppbmRleCA5MjYwMzViYjM3 OGQuLjI4ZmJlODNjOWRhZiAxMDA2NDQKLS0tIGEvYXJjaC9wb3dlcnBjL2tlcm5lbC9wY2ktY29t bW9uLmMKKysrIGIvYXJjaC9wb3dlcnBjL2tlcm5lbC9wY2ktY29tbW9uLmMKQEAgLTc2MSw3ICs3 NjEsOSBAQCB2b2lkIHBjaV9wcm9jZXNzX2JyaWRnZV9PRl9yYW5nZXMoc3RydWN0IHBjaV9jb250 cm9sbGVyICpob3NlLAogCQkJcmVzLT5mbGFncyA9IHJhbmdlLmZsYWdzOwogCQkJcmVzLT5zdGFy dCA9IHJhbmdlLmNwdV9hZGRyOwogCQkJcmVzLT5lbmQgPSByYW5nZS5jcHVfYWRkciArIHJhbmdl LnNpemUgLSAxOwotCQkJcmVzLT5wYXJlbnQgPSByZXMtPmNoaWxkID0gcmVzLT5zaWJsaW5nID0g TlVMTDsKKwkJCXJlcy0+cGFyZW50ID0gTlVMTDsKKwkJCUlOSVRfTElTVF9IRUFEKCZyZXMtPmNo aWxkKTsKKwkJCUlOSVRfTElTVF9IRUFEKCZyZXMtPnNpYmxpbmcpOwogCQl9CiAJfQogfQpkaWZm IC0tZ2l0IGEvYXJjaC9zcGFyYy9rZXJuZWwvaW9wb3J0LmMgYi9hcmNoL3NwYXJjL2tlcm5lbC9p b3BvcnQuYwppbmRleCBjY2E5MTM0Y2ZhN2QuLjk5ZWZlNGU5OGIxNiAxMDA2NDQKLS0tIGEvYXJj aC9zcGFyYy9rZXJuZWwvaW9wb3J0LmMKKysrIGIvYXJjaC9zcGFyYy9rZXJuZWwvaW9wb3J0LmMK QEAgLTY2OSw3ICs2NjksNyBAQCBzdGF0aWMgaW50IHNwYXJjX2lvX3Byb2Nfc2hvdyhzdHJ1Y3Qg c2VxX2ZpbGUgKm0sIHZvaWQgKnYpCiAJc3RydWN0IHJlc291cmNlICpyb290ID0gbS0+cHJpdmF0 ZSwgKnI7CiAJY29uc3QgY2hhciAqbm07CiAKLQlmb3IgKHIgPSByb290LT5jaGlsZDsgciAhPSBO 
VUxMOyByID0gci0+c2libGluZykgeworCWxpc3RfZm9yX2VhY2hfZW50cnkociwgJnJvb3QtPmNo aWxkLCBzaWJsaW5nKSB7CiAJCWlmICgobm0gPSByLT5uYW1lKSA9PSBOVUxMKSBubSA9ICI/Pz8i OwogCQlzZXFfcHJpbnRmKG0sICIlMDE2bGx4LSUwMTZsbHg6ICVzXG4iLAogCQkJCSh1bnNpZ25l ZCBsb25nIGxvbmcpci0+c3RhcnQsCmRpZmYgLS1naXQgYS9hcmNoL3h0ZW5zYS9pbmNsdWRlL2Fz bS9wY2ktYnJpZGdlLmggYi9hcmNoL3h0ZW5zYS9pbmNsdWRlL2FzbS9wY2ktYnJpZGdlLmgKaW5k ZXggMGI2OGM3NmVjMWU2Li5mNDg3YjA2ODE3ZGYgMTAwNjQ0Ci0tLSBhL2FyY2gveHRlbnNhL2lu Y2x1ZGUvYXNtL3BjaS1icmlkZ2UuaAorKysgYi9hcmNoL3h0ZW5zYS9pbmNsdWRlL2FzbS9wY2kt YnJpZGdlLmgKQEAgLTcxLDggKzcxLDggQEAgc3RhdGljIGlubGluZSB2b2lkIHBjaWJpb3NfaW5p dF9yZXNvdXJjZShzdHJ1Y3QgcmVzb3VyY2UgKnJlcywKIAlyZXMtPmZsYWdzID0gZmxhZ3M7CiAJ cmVzLT5uYW1lID0gbmFtZTsKIAlyZXMtPnBhcmVudCA9IE5VTEw7Ci0JcmVzLT5zaWJsaW5nID0g TlVMTDsKLQlyZXMtPmNoaWxkID0gTlVMTDsKKwlJTklUX0xJU1RfSEVBRCgmcmVzLT5jaGlsZCk7 CisJSU5JVF9MSVNUX0hFQUQoJnJlcy0+c2libGluZyk7CiB9CiAKIApkaWZmIC0tZ2l0IGEvZHJp dmVycy9laXNhL2Vpc2EtYnVzLmMgYi9kcml2ZXJzL2Vpc2EvZWlzYS1idXMuYwppbmRleCAxZTgw NjJmNmRiZmMuLmRiYTc4Zjc1ZmQwNiAxMDA2NDQKLS0tIGEvZHJpdmVycy9laXNhL2Vpc2EtYnVz LmMKKysrIGIvZHJpdmVycy9laXNhL2Vpc2EtYnVzLmMKQEAgLTQwOCw2ICs0MDgsOCBAQCBzdGF0 aWMgc3RydWN0IHJlc291cmNlIGVpc2Ffcm9vdF9yZXMgPSB7CiAJLnN0YXJ0ID0gMCwKIAkuZW5k ICAgPSAweGZmZmZmZmZmLAogCS5mbGFncyA9IElPUkVTT1VSQ0VfSU8sCisJLnNpYmxpbmcgPSBM SVNUX0hFQURfSU5JVChlaXNhX3Jvb3RfcmVzLnNpYmxpbmcpLAorCS5jaGlsZCAgPSBMSVNUX0hF QURfSU5JVChlaXNhX3Jvb3RfcmVzLmNoaWxkKSwKIH07CiAKIHN0YXRpYyBpbnQgZWlzYV9idXNf Y291bnQ7CmRpZmYgLS1naXQgYS9kcml2ZXJzL2dwdS9kcm0vZHJtX21lbW9yeS5jIGIvZHJpdmVy cy9ncHUvZHJtL2RybV9tZW1vcnkuYwppbmRleCBkNjllNGZjMWVlNzcuLjMzYmFhN2ZhNWU0MSAx MDA2NDQKLS0tIGEvZHJpdmVycy9ncHUvZHJtL2RybV9tZW1vcnkuYworKysgYi9kcml2ZXJzL2dw dS9kcm0vZHJtX21lbW9yeS5jCkBAIC0xNTUsOSArMTU1LDggQEAgdTY0IGRybV9nZXRfbWF4X2lv bWVtKHZvaWQpCiAJc3RydWN0IHJlc291cmNlICp0bXA7CiAJcmVzb3VyY2Vfc2l6ZV90IG1heF9p b21lbSA9IDA7CiAKLQlmb3IgKHRtcCA9IGlvbWVtX3Jlc291cmNlLmNoaWxkOyB0bXA7IHRtcCA9 
IHRtcC0+c2libGluZykgeworCWxpc3RfZm9yX2VhY2hfZW50cnkodG1wLCAmaW9tZW1fcmVzb3Vy Y2UuY2hpbGQsIHNpYmxpbmcpCiAJCW1heF9pb21lbSA9IG1heChtYXhfaW9tZW0sICB0bXAtPmVu ZCk7Ci0JfQogCiAJcmV0dXJuIG1heF9pb21lbTsKIH0KZGlmZiAtLWdpdCBhL2RyaXZlcnMvZ3B1 L2RybS9nbWE1MDAvZ3R0LmMgYi9kcml2ZXJzL2dwdS9kcm0vZ21hNTAwL2d0dC5jCmluZGV4IDM5 NDliMDk5MDkxNi4uYWRkZDNiYzAwOWFmIDEwMDY0NAotLS0gYS9kcml2ZXJzL2dwdS9kcm0vZ21h NTAwL2d0dC5jCisrKyBiL2RyaXZlcnMvZ3B1L2RybS9nbWE1MDAvZ3R0LmMKQEAgLTU2NSw3ICs1 NjUsNyBAQCBpbnQgcHNiX2d0dF9pbml0KHN0cnVjdCBkcm1fZGV2aWNlICpkZXYsIGludCByZXN1 bWUpCiBpbnQgcHNiX2d0dF9yZXN0b3JlKHN0cnVjdCBkcm1fZGV2aWNlICpkZXYpCiB7CiAJc3Ry dWN0IGRybV9wc2JfcHJpdmF0ZSAqZGV2X3ByaXYgPSBkZXYtPmRldl9wcml2YXRlOwotCXN0cnVj dCByZXNvdXJjZSAqciA9IGRldl9wcml2LT5ndHRfbWVtLT5jaGlsZDsKKwlzdHJ1Y3QgcmVzb3Vy Y2UgKnI7CiAJc3RydWN0IGd0dF9yYW5nZSAqcmFuZ2U7CiAJdW5zaWduZWQgaW50IHJlc3RvcmVk ID0gMCwgdG90YWwgPSAwLCBzaXplID0gMDsKIApAQCAtNTczLDE0ICs1NzMsMTMgQEAgaW50IHBz Yl9ndHRfcmVzdG9yZShzdHJ1Y3QgZHJtX2RldmljZSAqZGV2KQogCW11dGV4X2xvY2soJmRldl9w cml2LT5ndHRfbXV0ZXgpOwogCXBzYl9ndHRfaW5pdChkZXYsIDEpOwogCi0Jd2hpbGUgKHIgIT0g TlVMTCkgeworCWxpc3RfZm9yX2VhY2hfZW50cnkociwgJmRldl9wcml2LT5ndHRfbWVtLT5jaGls ZCwgc2libGluZykgewogCQlyYW5nZSA9IGNvbnRhaW5lcl9vZihyLCBzdHJ1Y3QgZ3R0X3Jhbmdl LCByZXNvdXJjZSk7CiAJCWlmIChyYW5nZS0+cGFnZXMpIHsKIAkJCXBzYl9ndHRfaW5zZXJ0KGRl diwgcmFuZ2UsIDEpOwogCQkJc2l6ZSArPSByYW5nZS0+cmVzb3VyY2UuZW5kIC0gcmFuZ2UtPnJl c291cmNlLnN0YXJ0OwogCQkJcmVzdG9yZWQrKzsKIAkJfQotCQlyID0gci0+c2libGluZzsKIAkJ dG90YWwrKzsKIAl9CiAJbXV0ZXhfdW5sb2NrKCZkZXZfcHJpdi0+Z3R0X211dGV4KTsKZGlmZiAt LWdpdCBhL2RyaXZlcnMvaHYvdm1idXNfZHJ2LmMgYi9kcml2ZXJzL2h2L3ZtYnVzX2Rydi5jCmlu ZGV4IGIxMGZlMjZjNDg5MS4uZDg3ZWM1YTFiYzRjIDEwMDY0NAotLS0gYS9kcml2ZXJzL2h2L3Zt YnVzX2Rydi5jCisrKyBiL2RyaXZlcnMvaHYvdm1idXNfZHJ2LmMKQEAgLTE0MTIsOSArMTQxMiw4 IEBAIHN0YXRpYyBhY3BpX3N0YXR1cyB2bWJ1c193YWxrX3Jlc291cmNlcyhzdHJ1Y3QgYWNwaV9y ZXNvdXJjZSAqcmVzLCB2b2lkICpjdHgpCiB7CiAJcmVzb3VyY2Vfc2l6ZV90IHN0YXJ0ID0gMDsK 
IAlyZXNvdXJjZV9zaXplX3QgZW5kID0gMDsKLQlzdHJ1Y3QgcmVzb3VyY2UgKm5ld19yZXM7CisJ c3RydWN0IHJlc291cmNlICpuZXdfcmVzLCAqdG1wOwogCXN0cnVjdCByZXNvdXJjZSAqKm9sZF9y ZXMgPSAmaHlwZXJ2X21taW87Ci0Jc3RydWN0IHJlc291cmNlICoqcHJldl9yZXMgPSBOVUxMOwog CiAJc3dpdGNoIChyZXMtPnR5cGUpIHsKIApAQCAtMTQ2MSw0NCArMTQ2MCwzNiBAQCBzdGF0aWMg YWNwaV9zdGF0dXMgdm1idXNfd2Fsa19yZXNvdXJjZXMoc3RydWN0IGFjcGlfcmVzb3VyY2UgKnJl cywgdm9pZCAqY3R4KQogCS8qCiAJICogSWYgdHdvIHJhbmdlcyBhcmUgYWRqYWNlbnQsIG1lcmdl IHRoZW0uCiAJICovCi0JZG8gewotCQlpZiAoISpvbGRfcmVzKSB7Ci0JCQkqb2xkX3JlcyA9IG5l d19yZXM7Ci0JCQlicmVhazsKLQkJfQotCi0JCWlmICgoKCpvbGRfcmVzKS0+ZW5kICsgMSkgPT0g bmV3X3Jlcy0+c3RhcnQpIHsKLQkJCSgqb2xkX3JlcyktPmVuZCA9IG5ld19yZXMtPmVuZDsKKwlp ZiAoISpvbGRfcmVzKSB7CisJCSpvbGRfcmVzID0gbmV3X3JlczsKKwkJcmV0dXJuIEFFX09LOwor CX0KKwl0bXAgPSAqb2xkX3JlczsKKwlsaXN0X2Zvcl9lYWNoX2VudHJ5X2Zyb20odG1wLCAmdG1w LT5wYXJlbnQtPmNoaWxkLCBzaWJsaW5nKSB7CisJCWlmICgodG1wLT5lbmQgKyAxKSA9PSBuZXdf cmVzLT5zdGFydCkgeworCQkJdG1wLT5lbmQgPSBuZXdfcmVzLT5lbmQ7CiAJCQlrZnJlZShuZXdf cmVzKTsKIAkJCWJyZWFrOwogCQl9CiAKLQkJaWYgKCgqb2xkX3JlcyktPnN0YXJ0ID09IG5ld19y ZXMtPmVuZCArIDEpIHsKLQkJCSgqb2xkX3JlcyktPnN0YXJ0ID0gbmV3X3Jlcy0+c3RhcnQ7CisJ CWlmICh0bXAtPnN0YXJ0ID09IG5ld19yZXMtPmVuZCArIDEpIHsKKwkJCXRtcC0+c3RhcnQgPSBu ZXdfcmVzLT5zdGFydDsKIAkJCWtmcmVlKG5ld19yZXMpOwogCQkJYnJlYWs7CiAJCX0KIAotCQlp ZiAoKCpvbGRfcmVzKS0+c3RhcnQgPiBuZXdfcmVzLT5lbmQpIHsKLQkJCW5ld19yZXMtPnNpYmxp bmcgPSAqb2xkX3JlczsKLQkJCWlmIChwcmV2X3JlcykKLQkJCQkoKnByZXZfcmVzKS0+c2libGlu ZyA9IG5ld19yZXM7Ci0JCQkqb2xkX3JlcyA9IG5ld19yZXM7CisJCWlmICh0bXAtPnN0YXJ0ID4g bmV3X3Jlcy0+ZW5kKSB7CisJCQlsaXN0X2FkZCgmbmV3X3Jlcy0+c2libGluZywgdG1wLT5zaWJs aW5nLnByZXYpOwogCQkJYnJlYWs7CiAJCX0KLQotCQlwcmV2X3JlcyA9IG9sZF9yZXM7Ci0JCW9s ZF9yZXMgPSAmKCpvbGRfcmVzKS0+c2libGluZzsKLQotCX0gd2hpbGUgKDEpOworCX0KIAogCXJl dHVybiBBRV9PSzsKIH0KIAogc3RhdGljIGludCB2bWJ1c19hY3BpX3JlbW92ZShzdHJ1Y3QgYWNw aV9kZXZpY2UgKmRldmljZSkKIHsKLQlzdHJ1Y3QgcmVzb3VyY2UgKmN1cl9yZXM7Ci0Jc3RydWN0 
IHJlc291cmNlICpuZXh0X3JlczsKKwlzdHJ1Y3QgcmVzb3VyY2UgKnJlczsKIAogCWlmIChoeXBl cnZfbW1pbykgewogCQlpZiAoZmJfbW1pbykgewpAQCAtMTUwNywxMCArMTQ5OCw5IEBAIHN0YXRp YyBpbnQgdm1idXNfYWNwaV9yZW1vdmUoc3RydWN0IGFjcGlfZGV2aWNlICpkZXZpY2UpCiAJCQlm Yl9tbWlvID0gTlVMTDsKIAkJfQogCi0JCWZvciAoY3VyX3JlcyA9IGh5cGVydl9tbWlvOyBjdXJf cmVzOyBjdXJfcmVzID0gbmV4dF9yZXMpIHsKLQkJCW5leHRfcmVzID0gY3VyX3Jlcy0+c2libGlu ZzsKLQkJCWtmcmVlKGN1cl9yZXMpOwotCQl9CisJCXJlcyA9IGh5cGVydl9tbWlvOworCQlsaXN0 X2Zvcl9lYWNoX2VudHJ5X2Zyb20ocmVzLCAmcmVzLT5wYXJlbnQtPmNoaWxkLCBzaWJsaW5nKQor CQkJa2ZyZWUocmVzKTsKIAl9CiAKIAlyZXR1cm4gMDsKQEAgLTE1OTYsNyArMTU4Niw4IEBAIGlu dCB2bWJ1c19hbGxvY2F0ZV9tbWlvKHN0cnVjdCByZXNvdXJjZSAqKm5ldywgc3RydWN0IGh2X2Rl dmljZSAqZGV2aWNlX29iaiwKIAkJfQogCX0KIAotCWZvciAoaXRlciA9IGh5cGVydl9tbWlvOyBp dGVyOyBpdGVyID0gaXRlci0+c2libGluZykgeworCWl0ZXIgPSBoeXBlcnZfbW1pbzsKKwlsaXN0 X2Zvcl9lYWNoX2VudHJ5X2Zyb20oaXRlciwgJml0ZXItPnBhcmVudC0+Y2hpbGQsIHNpYmxpbmcp IHsKIAkJaWYgKChpdGVyLT5zdGFydCA+PSBtYXgpIHx8IChpdGVyLT5lbmQgPD0gbWluKSkKIAkJ CWNvbnRpbnVlOwogCkBAIC0xNjM5LDcgKzE2MzAsOCBAQCB2b2lkIHZtYnVzX2ZyZWVfbW1pbyhy ZXNvdXJjZV9zaXplX3Qgc3RhcnQsIHJlc291cmNlX3NpemVfdCBzaXplKQogCXN0cnVjdCByZXNv dXJjZSAqaXRlcjsKIAogCWRvd24oJmh5cGVydl9tbWlvX2xvY2spOwotCWZvciAoaXRlciA9IGh5 cGVydl9tbWlvOyBpdGVyOyBpdGVyID0gaXRlci0+c2libGluZykgeworCWl0ZXIgPSBoeXBlcnZf bW1pbzsKKwlsaXN0X2Zvcl9lYWNoX2VudHJ5X2Zyb20oaXRlciwgJml0ZXItPnBhcmVudC0+Y2hp bGQsIHNpYmxpbmcpIHsKIAkJaWYgKChpdGVyLT5zdGFydCA+PSBzdGFydCArIHNpemUpIHx8IChp dGVyLT5lbmQgPD0gc3RhcnQpKQogCQkJY29udGludWU7CiAKZGlmZiAtLWdpdCBhL2RyaXZlcnMv aW5wdXQvam95c3RpY2svaWZvcmNlL2lmb3JjZS1tYWluLmMgYi9kcml2ZXJzL2lucHV0L2pveXN0 aWNrL2lmb3JjZS9pZm9yY2UtbWFpbi5jCmluZGV4IGRhZWViNGM3ZTNiMC4uNWMwYmUyN2IzM2Zm IDEwMDY0NAotLS0gYS9kcml2ZXJzL2lucHV0L2pveXN0aWNrL2lmb3JjZS9pZm9yY2UtbWFpbi5j CisrKyBiL2RyaXZlcnMvaW5wdXQvam95c3RpY2svaWZvcmNlL2lmb3JjZS1tYWluLmMKQEAgLTMw NSw4ICszMDUsOCBAQCBpbnQgaWZvcmNlX2luaXRfZGV2aWNlKHN0cnVjdCBpZm9yY2UgKmlmb3Jj 
ZSkKIAlpZm9yY2UtPmRldmljZV9tZW1vcnkuZW5kID0gMjAwOwogCWlmb3JjZS0+ZGV2aWNlX21l bW9yeS5mbGFncyA9IElPUkVTT1VSQ0VfTUVNOwogCWlmb3JjZS0+ZGV2aWNlX21lbW9yeS5wYXJl bnQgPSBOVUxMOwotCWlmb3JjZS0+ZGV2aWNlX21lbW9yeS5jaGlsZCA9IE5VTEw7Ci0JaWZvcmNl LT5kZXZpY2VfbWVtb3J5LnNpYmxpbmcgPSBOVUxMOworCUlOSVRfTElTVF9IRUFEKCZpZm9yY2Ut PmRldmljZV9tZW1vcnkuY2hpbGQpOworCUlOSVRfTElTVF9IRUFEKCZpZm9yY2UtPmRldmljZV9t ZW1vcnkuc2libGluZyk7CiAKIC8qCiAgKiBXYWl0IHVudGlsIGRldmljZSByZWFkeSAtIHVudGls IGl0IHNlbmRzIGl0cyBmaXJzdCByZXNwb25zZS4KZGlmZiAtLWdpdCBhL2RyaXZlcnMvbnZkaW1t L25hbWVzcGFjZV9kZXZzLmMgYi9kcml2ZXJzL252ZGltbS9uYW1lc3BhY2VfZGV2cy5jCmluZGV4 IDI4YWZkZDY2ODkwNS4uZjUzZDQxMGQ5OTgxIDEwMDY0NAotLS0gYS9kcml2ZXJzL252ZGltbS9u YW1lc3BhY2VfZGV2cy5jCisrKyBiL2RyaXZlcnMvbnZkaW1tL25hbWVzcGFjZV9kZXZzLmMKQEAg LTYzNyw3ICs2MzcsNyBAQCBzdGF0aWMgcmVzb3VyY2Vfc2l6ZV90IHNjYW5fYWxsb2NhdGUoc3Ry dWN0IG5kX3JlZ2lvbiAqbmRfcmVnaW9uLAogIHJldHJ5OgogCWZpcnN0ID0gMDsKIAlmb3JfZWFj aF9kcGFfcmVzb3VyY2UobmRkLCByZXMpIHsKLQkJc3RydWN0IHJlc291cmNlICpuZXh0ID0gcmVz LT5zaWJsaW5nLCAqbmV3X3JlcyA9IE5VTEw7CisJCXN0cnVjdCByZXNvdXJjZSAqbmV4dCA9IHJl c291cmNlX3NpYmxpbmcocmVzKSwgKm5ld19yZXMgPSBOVUxMOwogCQlyZXNvdXJjZV9zaXplX3Qg YWxsb2NhdGUsIGF2YWlsYWJsZSA9IDA7CiAJCWVudW0gYWxsb2NfbG9jIGxvYyA9IEFMTE9DX0VS UjsKIAkJY29uc3QgY2hhciAqYWN0aW9uOwpAQCAtNzYzLDcgKzc2Myw3IEBAIHN0YXRpYyByZXNv dXJjZV9zaXplX3Qgc2Nhbl9hbGxvY2F0ZShzdHJ1Y3QgbmRfcmVnaW9uICpuZF9yZWdpb24sCiAJ ICogYW4gaW5pdGlhbCAicG1lbS1yZXNlcnZlIHBhc3MiLiAgT25seSBkbyBhbiBpbml0aWFsIEJM SyBhbGxvY2F0aW9uCiAJICogd2hlbiBub25lIG9mIHRoZSBEUEEgc3BhY2UgaXMgcmVzZXJ2ZWQu CiAJICovCi0JaWYgKChpc19wbWVtIHx8ICFuZGQtPmRwYS5jaGlsZCkgJiYgbiA9PSB0b19hbGxv Y2F0ZSkKKwlpZiAoKGlzX3BtZW0gfHwgbGlzdF9lbXB0eSgmbmRkLT5kcGEuY2hpbGQpKSAmJiBu ID09IHRvX2FsbG9jYXRlKQogCQlyZXR1cm4gaW5pdF9kcGFfYWxsb2NhdGlvbihsYWJlbF9pZCwg bmRfcmVnaW9uLCBuZF9tYXBwaW5nLCBuKTsKIAlyZXR1cm4gbjsKIH0KQEAgLTc3OSw3ICs3Nzks NyBAQCBzdGF0aWMgaW50IG1lcmdlX2RwYShzdHJ1Y3QgbmRfcmVnaW9uICpuZF9yZWdpb24sCiAg 
cmV0cnk6CiAJZm9yX2VhY2hfZHBhX3Jlc291cmNlKG5kZCwgcmVzKSB7CiAJCWludCByYzsKLQkJ c3RydWN0IHJlc291cmNlICpuZXh0ID0gcmVzLT5zaWJsaW5nOworCQlzdHJ1Y3QgcmVzb3VyY2Ug Km5leHQgPSByZXNvdXJjZV9zaWJsaW5nKHJlcyk7CiAJCXJlc291cmNlX3NpemVfdCBlbmQgPSBy ZXMtPnN0YXJ0ICsgcmVzb3VyY2Vfc2l6ZShyZXMpOwogCiAJCWlmICghbmV4dCB8fCBzdHJjbXAo cmVzLT5uYW1lLCBsYWJlbF9pZC0+aWQpICE9IDAKZGlmZiAtLWdpdCBhL2RyaXZlcnMvbnZkaW1t L25kLmggYi9kcml2ZXJzL252ZGltbS9uZC5oCmluZGV4IDMyZTAzNjRiNDhiOS4uZGE3ZGExNWUw M2U3IDEwMDY0NAotLS0gYS9kcml2ZXJzL252ZGltbS9uZC5oCisrKyBiL2RyaXZlcnMvbnZkaW1t L25kLmgKQEAgLTEwMiwxMSArMTAyLDEwIEBAIHVuc2lnbmVkIHNpemVvZl9uYW1lc3BhY2VfbGFi ZWwoc3RydWN0IG52ZGltbV9kcnZkYXRhICpuZGQpOwogCQkodW5zaWduZWQgbG9uZyBsb25nKSAo cmVzID8gcmVzLT5zdGFydCA6IDApLCAjI2FyZykKIAogI2RlZmluZSBmb3JfZWFjaF9kcGFfcmVz b3VyY2UobmRkLCByZXMpIFwKLQlmb3IgKHJlcyA9IChuZGQpLT5kcGEuY2hpbGQ7IHJlczsgcmVz ID0gcmVzLT5zaWJsaW5nKQorCWxpc3RfZm9yX2VhY2hfZW50cnkocmVzLCAmKG5kZCktPmRwYS5j aGlsZCwgc2libGluZykKIAogI2RlZmluZSBmb3JfZWFjaF9kcGFfcmVzb3VyY2Vfc2FmZShuZGQs IHJlcywgbmV4dCkgXAotCWZvciAocmVzID0gKG5kZCktPmRwYS5jaGlsZCwgbmV4dCA9IHJlcyA/ IHJlcy0+c2libGluZyA6IE5VTEw7IFwKLQkJCXJlczsgcmVzID0gbmV4dCwgbmV4dCA9IG5leHQg PyBuZXh0LT5zaWJsaW5nIDogTlVMTCkKKwlsaXN0X2Zvcl9lYWNoX2VudHJ5X3NhZmUocmVzLCBu ZXh0LCAmKG5kZCktPmRwYS5jaGlsZCwgc2libGluZykKIAogc3RydWN0IG5kX3BlcmNwdV9sYW5l IHsKIAlpbnQgY291bnQ7CmRpZmYgLS1naXQgYS9kcml2ZXJzL29mL2FkZHJlc3MuYyBiL2RyaXZl cnMvb2YvYWRkcmVzcy5jCmluZGV4IDUzMzQ5OTEyYWM3NS4uZTJlMjU3MTlhYjUyIDEwMDY0NAot LS0gYS9kcml2ZXJzL29mL2FkZHJlc3MuYworKysgYi9kcml2ZXJzL29mL2FkZHJlc3MuYwpAQCAt MzMwLDcgKzMzMCw5IEBAIGludCBvZl9wY2lfcmFuZ2VfdG9fcmVzb3VyY2Uoc3RydWN0IG9mX3Bj aV9yYW5nZSAqcmFuZ2UsCiB7CiAJaW50IGVycjsKIAlyZXMtPmZsYWdzID0gcmFuZ2UtPmZsYWdz OwotCXJlcy0+cGFyZW50ID0gcmVzLT5jaGlsZCA9IHJlcy0+c2libGluZyA9IE5VTEw7CisJcmVz LT5wYXJlbnQgPSBOVUxMOworCUlOSVRfTElTVF9IRUFEKCZyZXMtPmNoaWxkKTsKKwlJTklUX0xJ U1RfSEVBRCgmcmVzLT5zaWJsaW5nKTsKIAlyZXMtPm5hbWUgPSBucC0+ZnVsbF9uYW1lOwogCiAJ 
aWYgKHJlcy0+ZmxhZ3MgJiBJT1JFU09VUkNFX0lPKSB7CmRpZmYgLS1naXQgYS9kcml2ZXJzL3Bh cmlzYy9sYmFfcGNpLmMgYi9kcml2ZXJzL3BhcmlzYy9sYmFfcGNpLmMKaW5kZXggNjliZDk4NDIx ZWIxLi43NDgyYmRmZDE5NTkgMTAwNjQ0Ci0tLSBhL2RyaXZlcnMvcGFyaXNjL2xiYV9wY2kuYwor KysgYi9kcml2ZXJzL3BhcmlzYy9sYmFfcGNpLmMKQEAgLTE3MCw4ICsxNzAsOCBAQCBsYmFfZHVt cF9yZXMoc3RydWN0IHJlc291cmNlICpyLCBpbnQgZCkKIAlmb3IgKGkgPSBkOyBpIDsgLS1pKSBw cmludGsoIiAiKTsKIAlwcmludGsoS0VSTl9ERUJVRyAiJXAgWyVseCwlbHhdLyVseFxuIiwgciwK IAkJKGxvbmcpci0+c3RhcnQsIChsb25nKXItPmVuZCwgci0+ZmxhZ3MpOwotCWxiYV9kdW1wX3Jl cyhyLT5jaGlsZCwgZCsyKTsKLQlsYmFfZHVtcF9yZXMoci0+c2libGluZywgZCk7CisJbGJhX2R1 bXBfcmVzKHJlc291cmNlX2ZpcnN0X2NoaWxkKCZyLT5jaGlsZCksIGQrMik7CisJbGJhX2R1bXBf cmVzKHJlc291cmNlX3NpYmxpbmcociksIGQpOwogfQogCiAKZGlmZiAtLWdpdCBhL2RyaXZlcnMv cGNpL2NvbnRyb2xsZXIvdm1kLmMgYi9kcml2ZXJzL3BjaS9jb250cm9sbGVyL3ZtZC5jCmluZGV4 IDk0MmI2NGZjN2YxZi4uZTNhY2UyMDM0NWM3IDEwMDY0NAotLS0gYS9kcml2ZXJzL3BjaS9jb250 cm9sbGVyL3ZtZC5jCisrKyBiL2RyaXZlcnMvcGNpL2NvbnRyb2xsZXIvdm1kLmMKQEAgLTU0Miwx NCArNTQyLDE0IEBAIHN0YXRpYyBzdHJ1Y3QgcGNpX29wcyB2bWRfb3BzID0gewogCiBzdGF0aWMg dm9pZCB2bWRfYXR0YWNoX3Jlc291cmNlcyhzdHJ1Y3Qgdm1kX2RldiAqdm1kKQogewotCXZtZC0+ ZGV2LT5yZXNvdXJjZVtWTURfTUVNQkFSMV0uY2hpbGQgPSAmdm1kLT5yZXNvdXJjZXNbMV07Ci0J dm1kLT5kZXYtPnJlc291cmNlW1ZNRF9NRU1CQVIyXS5jaGlsZCA9ICZ2bWQtPnJlc291cmNlc1sy XTsKKwlsaXN0X2FkZCgmdm1kLT5yZXNvdXJjZXNbMV0uc2libGluZywgJnZtZC0+ZGV2LT5yZXNv dXJjZVtWTURfTUVNQkFSMV0uY2hpbGQpOworCWxpc3RfYWRkKCZ2bWQtPnJlc291cmNlc1syXS5z aWJsaW5nLCAmdm1kLT5kZXYtPnJlc291cmNlW1ZNRF9NRU1CQVIyXS5jaGlsZCk7CiB9CiAKIHN0 YXRpYyB2b2lkIHZtZF9kZXRhY2hfcmVzb3VyY2VzKHN0cnVjdCB2bWRfZGV2ICp2bWQpCiB7Ci0J dm1kLT5kZXYtPnJlc291cmNlW1ZNRF9NRU1CQVIxXS5jaGlsZCA9IE5VTEw7Ci0Jdm1kLT5kZXYt PnJlc291cmNlW1ZNRF9NRU1CQVIyXS5jaGlsZCA9IE5VTEw7CisJSU5JVF9MSVNUX0hFQUQoJnZt ZC0+ZGV2LT5yZXNvdXJjZVtWTURfTUVNQkFSMV0uY2hpbGQpOworCUlOSVRfTElTVF9IRUFEKCZ2 bWQtPmRldi0+cmVzb3VyY2VbVk1EX01FTUJBUjJdLmNoaWxkKTsKIH0KIAogLyoKZGlmZiAtLWdp 
dCBhL2RyaXZlcnMvcGNpL3Byb2JlLmMgYi9kcml2ZXJzL3BjaS9wcm9iZS5jCmluZGV4IGFjODc2 ZTMyZGU0Yi4uOTYyNGRkMWRmZDQ5IDEwMDY0NAotLS0gYS9kcml2ZXJzL3BjaS9wcm9iZS5jCisr KyBiL2RyaXZlcnMvcGNpL3Byb2JlLmMKQEAgLTU5LDYgKzU5LDggQEAgc3RhdGljIHN0cnVjdCBy ZXNvdXJjZSAqZ2V0X3BjaV9kb21haW5fYnVzbl9yZXMoaW50IGRvbWFpbl9ucikKIAlyLT5yZXMu c3RhcnQgPSAwOwogCXItPnJlcy5lbmQgPSAweGZmOwogCXItPnJlcy5mbGFncyA9IElPUkVTT1VS Q0VfQlVTIHwgSU9SRVNPVVJDRV9QQ0lfRklYRUQ7CisJSU5JVF9MSVNUX0hFQUQoJnItPnJlcy5j aGlsZCk7CisJSU5JVF9MSVNUX0hFQUQoJnItPnJlcy5zaWJsaW5nKTsKIAogCWxpc3RfYWRkX3Rh aWwoJnItPmxpc3QsICZwY2lfZG9tYWluX2J1c25fcmVzX2xpc3QpOwogCmRpZmYgLS1naXQgYS9k cml2ZXJzL3BjaS9zZXR1cC1idXMuYyBiL2RyaXZlcnMvcGNpL3NldHVwLWJ1cy5jCmluZGV4IDc5 YjE4MjRlODNiNC4uOGU2ODVhZjg5MzhkIDEwMDY0NAotLS0gYS9kcml2ZXJzL3BjaS9zZXR1cC1i dXMuYworKysgYi9kcml2ZXJzL3BjaS9zZXR1cC1idXMuYwpAQCAtMjEwNyw3ICsyMTA3LDcgQEAg aW50IHBjaV9yZWFzc2lnbl9icmlkZ2VfcmVzb3VyY2VzKHN0cnVjdCBwY2lfZGV2ICpicmlkZ2Us IHVuc2lnbmVkIGxvbmcgdHlwZSkKIAkJCQljb250aW51ZTsKIAogCQkJLyogSWdub3JlIEJBUnMg d2hpY2ggYXJlIHN0aWxsIGluIHVzZSAqLwotCQkJaWYgKHJlcy0+Y2hpbGQpCisJCQlpZiAoIWxp c3RfZW1wdHkoJnJlcy0+Y2hpbGQpKQogCQkJCWNvbnRpbnVlOwogCiAJCQlyZXQgPSBhZGRfdG9f bGlzdCgmc2F2ZWQsIGJyaWRnZSwgcmVzLCAwLCAwKTsKZGlmZiAtLWdpdCBhL2luY2x1ZGUvbGlu dXgvaW9wb3J0LmggYi9pbmNsdWRlL2xpbnV4L2lvcG9ydC5oCmluZGV4IGRmZGNkMGJmZTU0ZS4u Yjc0NTZhZTg4OWRkIDEwMDY0NAotLS0gYS9pbmNsdWRlL2xpbnV4L2lvcG9ydC5oCisrKyBiL2lu Y2x1ZGUvbGludXgvaW9wb3J0LmgKQEAgLTEyLDYgKzEyLDcgQEAKICNpZm5kZWYgX19BU1NFTUJM WV9fCiAjaW5jbHVkZSA8bGludXgvY29tcGlsZXIuaD4KICNpbmNsdWRlIDxsaW51eC90eXBlcy5o PgorI2luY2x1ZGUgPGxpbnV4L2xpc3QuaD4KIC8qCiAgKiBSZXNvdXJjZXMgYXJlIHRyZWUtbGlr ZSwgYWxsb3dpbmcKICAqIG5lc3RpbmcgZXRjLi4KQEAgLTIyLDcgKzIzLDggQEAgc3RydWN0IHJl c291cmNlIHsKIAljb25zdCBjaGFyICpuYW1lOwogCXVuc2lnbmVkIGxvbmcgZmxhZ3M7CiAJdW5z aWduZWQgbG9uZyBkZXNjOwotCXN0cnVjdCByZXNvdXJjZSAqcGFyZW50LCAqc2libGluZywgKmNo aWxkOworCXN0cnVjdCBsaXN0X2hlYWQgY2hpbGQsIHNpYmxpbmc7CisJc3RydWN0IHJlc291cmNl 
ICpwYXJlbnQ7CiB9OwogCiAvKgpAQCAtMjE2LDcgKzIxOCw2IEBAIHN0YXRpYyBpbmxpbmUgYm9v bCByZXNvdXJjZV9jb250YWlucyhzdHJ1Y3QgcmVzb3VyY2UgKnIxLCBzdHJ1Y3QgcmVzb3VyY2Ug KnIyKQogCXJldHVybiByMS0+c3RhcnQgPD0gcjItPnN0YXJ0ICYmIHIxLT5lbmQgPj0gcjItPmVu ZDsKIH0KIAotCiAvKiBDb252ZW5pZW5jZSBzaG9ydGhhbmQgd2l0aCBhbGxvY2F0aW9uICovCiAj ZGVmaW5lIHJlcXVlc3RfcmVnaW9uKHN0YXJ0LG4sbmFtZSkJCV9fcmVxdWVzdF9yZWdpb24oJmlv cG9ydF9yZXNvdXJjZSwgKHN0YXJ0KSwgKG4pLCAobmFtZSksIDApCiAjZGVmaW5lIHJlcXVlc3Rf bXV4ZWRfcmVnaW9uKHN0YXJ0LG4sbmFtZSkJX19yZXF1ZXN0X3JlZ2lvbigmaW9wb3J0X3Jlc291 cmNlLCAoc3RhcnQpLCAobiksIChuYW1lKSwgSU9SRVNPVVJDRV9NVVhFRCkKQEAgLTI4Nyw2ICsy ODgsMTggQEAgc3RhdGljIGlubGluZSBib29sIHJlc291cmNlX292ZXJsYXBzKHN0cnVjdCByZXNv dXJjZSAqcjEsIHN0cnVjdCByZXNvdXJjZSAqcjIpCiAgICAgICAgcmV0dXJuIChyMS0+c3RhcnQg PD0gcjItPmVuZCAmJiByMS0+ZW5kID49IHIyLT5zdGFydCk7CiB9CiAKK3N0YXRpYyBpbmxpbmUg c3RydWN0IHJlc291cmNlICpyZXNvdXJjZV9zaWJsaW5nKHN0cnVjdCByZXNvdXJjZSAqcmVzKQor eworCWlmIChyZXMtPnBhcmVudCAmJiAhbGlzdF9pc19sYXN0KCZyZXMtPnNpYmxpbmcsICZyZXMt PnBhcmVudC0+Y2hpbGQpKQorCQlyZXR1cm4gbGlzdF9uZXh0X2VudHJ5KHJlcywgc2libGluZyk7 CisJcmV0dXJuIE5VTEw7Cit9CisKK3N0YXRpYyBpbmxpbmUgc3RydWN0IHJlc291cmNlICpyZXNv dXJjZV9maXJzdF9jaGlsZChzdHJ1Y3QgbGlzdF9oZWFkICpoZWFkKQoreworCXJldHVybiBsaXN0 X2ZpcnN0X2VudHJ5X29yX251bGwoaGVhZCwgc3RydWN0IHJlc291cmNlLCBzaWJsaW5nKTsKK30K KwogCiAjZW5kaWYgLyogX19BU1NFTUJMWV9fICovCiAjZW5kaWYJLyogX0xJTlVYX0lPUE9SVF9I ICovCmRpZmYgLS1naXQgYS9rZXJuZWwvcmVzb3VyY2UuYyBiL2tlcm5lbC9yZXNvdXJjZS5jCmlu ZGV4IDgxY2NkMTljMWQ5Zi4uYzk2ZTU4ZDNkMmY4IDEwMDY0NAotLS0gYS9rZXJuZWwvcmVzb3Vy Y2UuYworKysgYi9rZXJuZWwvcmVzb3VyY2UuYwpAQCAtMzEsNiArMzEsOCBAQCBzdHJ1Y3QgcmVz b3VyY2UgaW9wb3J0X3Jlc291cmNlID0gewogCS5zdGFydAk9IDAsCiAJLmVuZAk9IElPX1NQQUNF X0xJTUlULAogCS5mbGFncwk9IElPUkVTT1VSQ0VfSU8sCisJLnNpYmxpbmcgPSBMSVNUX0hFQURf SU5JVChpb3BvcnRfcmVzb3VyY2Uuc2libGluZyksCisJLmNoaWxkICA9IExJU1RfSEVBRF9JTklU KGlvcG9ydF9yZXNvdXJjZS5jaGlsZCksCiB9OwogRVhQT1JUX1NZTUJPTChpb3BvcnRfcmVzb3Vy 
Y2UpOwogCkBAIC0zOSw2ICs0MSw4IEBAIHN0cnVjdCByZXNvdXJjZSBpb21lbV9yZXNvdXJjZSA9 IHsKIAkuc3RhcnQJPSAwLAogCS5lbmQJPSAtMSwKIAkuZmxhZ3MJPSBJT1JFU09VUkNFX01FTSwK Kwkuc2libGluZyA9IExJU1RfSEVBRF9JTklUKGlvbWVtX3Jlc291cmNlLnNpYmxpbmcpLAorCS5j aGlsZCAgPSBMSVNUX0hFQURfSU5JVChpb21lbV9yZXNvdXJjZS5jaGlsZCksCiB9OwogRVhQT1JU X1NZTUJPTChpb21lbV9yZXNvdXJjZSk7CiAKQEAgLTU3LDIwICs2MSwyMCBAQCBzdGF0aWMgREVG SU5FX1JXTE9DSyhyZXNvdXJjZV9sb2NrKTsKICAqIGJ5IGJvb3QgbWVtIGFmdGVyIHRoZSBzeXN0 ZW0gaXMgdXAuIFNvIGZvciByZXVzaW5nIHRoZSByZXNvdXJjZSBlbnRyeQogICogd2UgbmVlZCB0 byByZW1lbWJlciB0aGUgcmVzb3VyY2UuCiAgKi8KLXN0YXRpYyBzdHJ1Y3QgcmVzb3VyY2UgKmJv b3RtZW1fcmVzb3VyY2VfZnJlZTsKK3N0YXRpYyBzdHJ1Y3QgbGlzdF9oZWFkIGJvb3RtZW1fcmVz b3VyY2VfZnJlZSA9IExJU1RfSEVBRF9JTklUKGJvb3RtZW1fcmVzb3VyY2VfZnJlZSk7CiBzdGF0 aWMgREVGSU5FX1NQSU5MT0NLKGJvb3RtZW1fcmVzb3VyY2VfbG9jayk7CiAKIHN0YXRpYyBzdHJ1 Y3QgcmVzb3VyY2UgKm5leHRfcmVzb3VyY2Uoc3RydWN0IHJlc291cmNlICpwLCBib29sIHNpYmxp bmdfb25seSkKIHsKIAkvKiBDYWxsZXIgd2FudHMgdG8gdHJhdmVyc2UgdGhyb3VnaCBzaWJsaW5n cyBvbmx5ICovCiAJaWYgKHNpYmxpbmdfb25seSkKLQkJcmV0dXJuIHAtPnNpYmxpbmc7CisJCXJl dHVybiByZXNvdXJjZV9zaWJsaW5nKHApOwogCi0JaWYgKHAtPmNoaWxkKQotCQlyZXR1cm4gcC0+ Y2hpbGQ7Ci0Jd2hpbGUgKCFwLT5zaWJsaW5nICYmIHAtPnBhcmVudCkKKwlpZiAoIWxpc3RfZW1w dHkoJnAtPmNoaWxkKSkKKwkJcmV0dXJuIHJlc291cmNlX2ZpcnN0X2NoaWxkKCZwLT5jaGlsZCk7 CisJd2hpbGUgKCFyZXNvdXJjZV9zaWJsaW5nKHApICYmIHAtPnBhcmVudCkKIAkJcCA9IHAtPnBh cmVudDsKLQlyZXR1cm4gcC0+c2libGluZzsKKwlyZXR1cm4gcmVzb3VyY2Vfc2libGluZyhwKTsK IH0KIAogc3RhdGljIHZvaWQgKnJfbmV4dChzdHJ1Y3Qgc2VxX2ZpbGUgKm0sIHZvaWQgKnYsIGxv ZmZfdCAqcG9zKQpAQCAtOTAsNyArOTQsNyBAQCBzdGF0aWMgdm9pZCAqcl9zdGFydChzdHJ1Y3Qg c2VxX2ZpbGUgKm0sIGxvZmZfdCAqcG9zKQogCXN0cnVjdCByZXNvdXJjZSAqcCA9IFBERV9EQVRB KGZpbGVfaW5vZGUobS0+ZmlsZSkpOwogCWxvZmZfdCBsID0gMDsKIAlyZWFkX2xvY2soJnJlc291 cmNlX2xvY2spOwotCWZvciAocCA9IHAtPmNoaWxkOyBwICYmIGwgPCAqcG9zOyBwID0gcl9uZXh0 KG0sIHAsICZsKSkKKwlmb3IgKHAgPSByZXNvdXJjZV9maXJzdF9jaGlsZCgmcC0+Y2hpbGQpOyBw 
ICYmIGwgPCAqcG9zOyBwID0gcl9uZXh0KG0sIHAsICZsKSkKIAkJOwogCXJldHVybiBwOwogfQpA QCAtMTUzLDggKzE1Nyw3IEBAIHN0YXRpYyB2b2lkIGZyZWVfcmVzb3VyY2Uoc3RydWN0IHJlc291 cmNlICpyZXMpCiAKIAlpZiAoIVBhZ2VTbGFiKHZpcnRfdG9faGVhZF9wYWdlKHJlcykpKSB7CiAJ CXNwaW5fbG9jaygmYm9vdG1lbV9yZXNvdXJjZV9sb2NrKTsKLQkJcmVzLT5zaWJsaW5nID0gYm9v dG1lbV9yZXNvdXJjZV9mcmVlOwotCQlib290bWVtX3Jlc291cmNlX2ZyZWUgPSByZXM7CisJCWxp c3RfYWRkKCZyZXMtPnNpYmxpbmcsICZib290bWVtX3Jlc291cmNlX2ZyZWUpOwogCQlzcGluX3Vu bG9jaygmYm9vdG1lbV9yZXNvdXJjZV9sb2NrKTsKIAl9IGVsc2UgewogCQlrZnJlZShyZXMpOwpA QCAtMTY2LDEwICsxNjksOSBAQCBzdGF0aWMgc3RydWN0IHJlc291cmNlICphbGxvY19yZXNvdXJj ZShnZnBfdCBmbGFncykKIAlzdHJ1Y3QgcmVzb3VyY2UgKnJlcyA9IE5VTEw7CiAKIAlzcGluX2xv Y2soJmJvb3RtZW1fcmVzb3VyY2VfbG9jayk7Ci0JaWYgKGJvb3RtZW1fcmVzb3VyY2VfZnJlZSkg ewotCQlyZXMgPSBib290bWVtX3Jlc291cmNlX2ZyZWU7Ci0JCWJvb3RtZW1fcmVzb3VyY2VfZnJl ZSA9IHJlcy0+c2libGluZzsKLQl9CisJcmVzID0gcmVzb3VyY2VfZmlyc3RfY2hpbGQoJmJvb3Rt ZW1fcmVzb3VyY2VfZnJlZSk7CisJaWYgKHJlcykKKwkJbGlzdF9kZWwoJnJlcy0+c2libGluZyk7 CiAJc3Bpbl91bmxvY2soJmJvb3RtZW1fcmVzb3VyY2VfbG9jayk7CiAKIAlpZiAocmVzKQpAQCAt MTc3LDYgKzE3OSw4IEBAIHN0YXRpYyBzdHJ1Y3QgcmVzb3VyY2UgKmFsbG9jX3Jlc291cmNlKGdm cF90IGZsYWdzKQogCWVsc2UKIAkJcmVzID0ga3phbGxvYyhzaXplb2Yoc3RydWN0IHJlc291cmNl KSwgZmxhZ3MpOwogCisJSU5JVF9MSVNUX0hFQUQoJnJlcy0+Y2hpbGQpOworCUlOSVRfTElTVF9I RUFEKCZyZXMtPnNpYmxpbmcpOwogCXJldHVybiByZXM7CiB9CiAKQEAgLTE4NSw3ICsxODksNyBA QCBzdGF0aWMgc3RydWN0IHJlc291cmNlICogX19yZXF1ZXN0X3Jlc291cmNlKHN0cnVjdCByZXNv dXJjZSAqcm9vdCwgc3RydWN0IHJlc291cgogewogCXJlc291cmNlX3NpemVfdCBzdGFydCA9IG5l dy0+c3RhcnQ7CiAJcmVzb3VyY2Vfc2l6ZV90IGVuZCA9IG5ldy0+ZW5kOwotCXN0cnVjdCByZXNv dXJjZSAqdG1wLCAqKnA7CisJc3RydWN0IHJlc291cmNlICp0bXA7CiAKIAlpZiAoZW5kIDwgc3Rh cnQpCiAJCXJldHVybiByb290OwpAQCAtMTkzLDY0ICsxOTcsNjIgQEAgc3RhdGljIHN0cnVjdCBy ZXNvdXJjZSAqIF9fcmVxdWVzdF9yZXNvdXJjZShzdHJ1Y3QgcmVzb3VyY2UgKnJvb3QsIHN0cnVj dCByZXNvdXIKIAkJcmV0dXJuIHJvb3Q7CiAJaWYgKGVuZCA+IHJvb3QtPmVuZCkKIAkJcmV0dXJu 
 root;
-	p = &root->child;
-	for (;;) {
-		tmp = *p;
-		if (!tmp || tmp->start > end) {
-			new->sibling = tmp;
-			*p = new;
+
+	if (list_empty(&root->child)) {
+		list_add(&new->sibling, &root->child);
+		new->parent = root;
+		INIT_LIST_HEAD(&new->child);
+		return NULL;
+	}
+
+	list_for_each_entry(tmp, &root->child, sibling) {
+		if (tmp->start > end) {
+			list_add(&new->sibling, tmp->sibling.prev);
 			new->parent = root;
+			INIT_LIST_HEAD(&new->child);
 			return NULL;
 		}
-		p = &tmp->sibling;
 		if (tmp->end < start)
 			continue;
 		return tmp;
 	}
+
+	list_add_tail(&new->sibling, &root->child);
+	new->parent = root;
+	INIT_LIST_HEAD(&new->child);
+	return NULL;
 }
 
 static int __release_resource(struct resource *old, bool release_child)
 {
-	struct resource *tmp, **p, *chd;
+	struct resource *tmp, *next, *chd;
 
-	p = &old->parent->child;
-	for (;;) {
-		tmp = *p;
-		if (!tmp)
-			break;
+	list_for_each_entry_safe(tmp, next, &old->parent->child, sibling) {
 		if (tmp == old) {
-			if (release_child || !(tmp->child)) {
-				*p = tmp->sibling;
+			if (release_child || list_empty(&tmp->child)) {
+				list_del(&tmp->sibling);
 			} else {
-				for (chd = tmp->child;; chd = chd->sibling) {
+				list_for_each_entry(chd, &tmp->child, sibling)
 					chd->parent = tmp->parent;
-					if (!(chd->sibling))
-						break;
-				}
-				*p = tmp->child;
-				chd->sibling = tmp->sibling;
+				list_splice(&tmp->child, tmp->sibling.prev);
+				list_del(&tmp->sibling);
 			}
+
 			old->parent = NULL;
 			return 0;
 		}
-		p = &tmp->sibling;
 	}
 	return -EINVAL;
 }
 
 static void __release_child_resources(struct resource *r)
 {
-	struct resource *tmp, *p;
+	struct resource *tmp, *next;
 	resource_size_t size;
 
-	p = r->child;
-	r->child = NULL;
-	while (p) {
-		tmp = p;
-		p = p->sibling;
-
+	list_for_each_entry_safe(tmp, next, &r->child, sibling) {
 		tmp->parent = NULL;
-		tmp->sibling = NULL;
+		list_del_init(&tmp->sibling);
 		__release_child_resources(tmp);
 
 		printk(KERN_DEBUG "release child resource %pR\n", tmp);
@@ -259,6 +261,8 @@ static void __release_child_resources(struct resource *r)
 		tmp->start = 0;
 		tmp->end = size - 1;
 	}
+
+	INIT_LIST_HEAD(&tmp->child);
 }
 
 void release_child_resources(struct resource *r)
@@ -343,7 +347,8 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
 
 	read_lock(&resource_lock);
 
-	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
+	for (p = resource_first_child(&iomem_resource.child); p;
+			p = next_resource(p, sibling_only)) {
 		if ((p->flags & res->flags) != res->flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
@@ -532,7 +537,7 @@ int region_intersects(resource_size_t start, size_t size, unsigned long flags,
 	struct resource *p;
 
 	read_lock(&resource_lock);
-	for (p = iomem_resource.child; p ; p = p->sibling) {
+	list_for_each_entry(p, &iomem_resource.child, sibling) {
 		bool is_type = (((p->flags & flags) == flags) &&
 				((desc == IORES_DESC_NONE) ||
 				 (desc == p->desc)));
@@ -586,7 +591,7 @@ static int __find_resource(struct resource *root, struct resource *old,
 			 resource_size_t  size,
 			 struct resource_constraint *constraint)
 {
-	struct resource *this = root->child;
+	struct resource *this = resource_first_child(&root->child);
 	struct resource tmp = *new, avail, alloc;
 
 	tmp.start = root->start;
@@ -596,7 +601,7 @@ static int __find_resource(struct resource *root, struct resource *old,
 	 */
 	if (this && this->start == root->start) {
 		tmp.start = (this == old) ? old->start : this->end + 1;
-		this = this->sibling;
+		this = resource_sibling(this);
 	}
 	for(;;) {
 		if (this)
@@ -632,7 +637,7 @@ next:		if (!this || this->end == root->end)
 
 		if (this != old)
 			tmp.start = this->end + 1;
-		this = this->sibling;
+		this = resource_sibling(this);
 	}
 	return -EBUSY;
 }
@@ -676,7 +681,7 @@ static int reallocate_resource(struct resource *root, struct resource *old,
 		goto out;
 	}
 
-	if (old->child) {
+	if (!list_empty(&old->child)) {
 		err = -EBUSY;
 		goto out;
 	}
@@ -757,7 +762,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start)
 	struct resource *res;
 
 	read_lock(&resource_lock);
-	for (res = root->child; res; res = res->sibling) {
+	list_for_each_entry(res, &root->child, sibling) {
 		if (res->start == start)
 			break;
 	}
@@ -790,32 +795,27 @@ static struct resource * __insert_resource(struct resource *parent, struct resou
 			break;
 	}
 
-	for (next = first; ; next = next->sibling) {
+	for (next = first; ; next = resource_sibling(next)) {
 		/* Partial overlap? Bad, and unfixable */
 		if (next->start < new->start || next->end > new->end)
 			return next;
-		if (!next->sibling)
+		if (!resource_sibling(next))
 			break;
-		if (next->sibling->start > new->end)
+		if (resource_sibling(next)->start > new->end)
 			break;
 	}
-
 	new->parent = parent;
-	new->sibling = next->sibling;
-	new->child = first;
+	list_add(&new->sibling, &next->sibling);
+	INIT_LIST_HEAD(&new->child);
 
-	next->sibling = NULL;
-	for (next = first; next; next = next->sibling)
+	/*
+	 * From first to next, they all fall into new's region, so change them
+	 * as new's children.
+	 */
+	list_cut_position(&new->child, first->sibling.prev, &next->sibling);
+	list_for_each_entry(next, &new->child, sibling)
 		next->parent = new;
 
-	if (parent->child == first) {
-		parent->child = new;
-	} else {
-		next = parent->child;
-		while (next->sibling != first)
-			next = next->sibling;
-		next->sibling = new;
-	}
 	return NULL;
 }
 
@@ -937,19 +937,17 @@ static int __adjust_resource(struct resource *res, resource_size_t start,
 	if ((start < parent->start) || (end > parent->end))
 		goto out;
 
-	if (res->sibling && (res->sibling->start <= end))
+	if (resource_sibling(res) && (resource_sibling(res)->start <= end))
 		goto out;
 
-	tmp = parent->child;
-	if (tmp != res) {
-		while (tmp->sibling != res)
-			tmp = tmp->sibling;
+	if (res->sibling.prev != &parent->child) {
+		tmp = list_prev_entry(res, sibling);
 		if (start <= tmp->end)
 			goto out;
 	}
 
 skip:
-	for (tmp = res->child; tmp; tmp = tmp->sibling)
+	list_for_each_entry(tmp, &res->child, sibling)
 		if ((tmp->start < start) || (tmp->end > end))
 			goto out;
 
@@ -996,27 +994,30 @@ EXPORT_SYMBOL(adjust_resource);
  */
 int reparent_resources(struct resource *parent, struct resource *res)
 {
-	struct resource *p, **pp;
-	struct resource **firstpp = NULL;
+	struct resource *p, *first = NULL;
 
-	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
+	list_for_each_entry(p, &parent->child, sibling) {
 		if (p->end < res->start)
 			continue;
 		if (res->end < p->start)
 			break;
 		if (p->start < res->start || p->end > res->end)
 			return -ENOTSUPP;	/* not completely contained */
-		if (firstpp == NULL)
-			firstpp = pp;
+		if (first == NULL)
+			first = p;
 	}
-	if (firstpp == NULL)
+	if (first == NULL)
 		return -ECANCELED; /* didn't find any conflicting entries? */
 	res->parent = parent;
-	res->child = *firstpp;
-	res->sibling = *pp;
-	*firstpp = res;
-	*pp = NULL;
-	for (p = res->child; p != NULL; p = p->sibling) {
+	list_add(&res->sibling, p->sibling.prev);
+	INIT_LIST_HEAD(&res->child);
+
+	/*
+	 * From first to p's previous sibling, they all fall into
+	 * res's region, change them as res's children.
+	 */
+	list_cut_position(&res->child, first->sibling.prev, res->sibling.prev);
+	list_for_each_entry(p, &res->child, sibling) {
 		p->parent = res;
 		pr_debug("PCI: Reparented %s %pR under %s\n",
 			 p->name, p, res->name);
@@ -1216,34 +1217,32 @@ EXPORT_SYMBOL(__request_region);
 void __release_region(struct resource *parent, resource_size_t start,
 			resource_size_t n)
 {
-	struct resource **p;
+	struct resource *res;
 	resource_size_t end;
 
-	p = &parent->child;
+	res = resource_first_child(&parent->child);
 	end = start + n - 1;
 
 	write_lock(&resource_lock);
 
 	for (;;) {
-		struct resource *res = *p;
-
 		if (!res)
 			break;
 		if (res->start <= start && res->end >= end) {
 			if (!(res->flags & IORESOURCE_BUSY)) {
-				p = &res->child;
+				res = resource_first_child(&res->child);
 				continue;
 			}
 			if (res->start != start || res->end != end)
 				break;
-			*p = res->sibling;
+			list_del(&res->sibling);
 			write_unlock(&resource_lock);
 			if (res->flags & IORESOURCE_MUXED)
 				wake_up(&muxed_resource_wait);
 			free_resource(res);
 			return;
 		}
-		p = &res->sibling;
+		res = resource_sibling(res);
 	}
 
 	write_unlock(&resource_lock);
@@ -1278,9 +1277,7 @@ EXPORT_SYMBOL(__release_region);
 int release_mem_region_adjustable(struct resource *parent,
 			resource_size_t start, resource_size_t size)
 {
-	struct resource **p;
-	struct resource *res;
-	struct resource *new_res;
+	struct resource *res, *new_res;
 	resource_size_t end;
 	int ret = -EINVAL;
 
@@ -1291,16 +1288,16 @@ int release_mem_region_adjustable(struct resource *parent,
 	/* The alloc_resource() result gets checked later */
 	new_res = alloc_resource(GFP_KERNEL);
 
-	p = &parent->child;
+	res = resource_first_child(&parent->child);
 	write_lock(&resource_lock);
 
-	while ((res = *p)) {
+	while ((res)) {
 		if (res->start >= end)
 			break;
 
 		/* look for the next resource if it does not fit into */
 		if (res->start > start || res->end < end) {
-			p = &res->sibling;
+			res = resource_sibling(res);
 			continue;
 		}
 
@@ -1308,14 +1305,14 @@ int release_mem_region_adjustable(struct resource *parent,
 			break;
 
 		if (!(res->flags & IORESOURCE_BUSY)) {
-			p = &res->child;
+			res = resource_first_child(&res->child);
 			continue;
 		}
 
 		/* found the target resource; let's adjust accordingly */
 		if (res->start == start && res->end == end) {
 			/* free the whole entry */
-			*p = res->sibling;
+			list_del(&res->sibling);
 			free_resource(res);
 			ret = 0;
 		} else if (res->start == start && res->end != end) {
@@ -1338,14 +1335,13 @@ int release_mem_region_adjustable(struct resource *parent,
 			new_res->flags = res->flags;
 			new_res->desc = res->desc;
 			new_res->parent = res->parent;
-			new_res->sibling = res->sibling;
-			new_res->child = NULL;
+			INIT_LIST_HEAD(&new_res->child);
 
 			ret = __adjust_resource(res, res->start,
 						start - res->start);
 			if (ret)
 				break;
-			res->sibling = new_res;
+			list_add(&new_res->sibling, &res->sibling);
 			new_res = NULL;
 		}
 
@@ -1526,7 +1522,7 @@ static int __init reserve_setup(char *str)
 			res->end = io_start + io_num - 1;
 			res->flags |= IORESOURCE_BUSY;
 			res->desc = IORES_DESC_NONE;
-			res->child = NULL;
+			INIT_LIST_HEAD(&res->child);
 			if (request_resource(parent, res) == 0)
 				reserved = x+1;
 		}
@@ -1546,7 +1542,7 @@ int iomem_map_sanity_check(resource_size_t addr, unsigned long size)
 	loff_t l;
 
 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
@@ -1602,7 +1598,7 @@ bool iomem_is_exclusive(u64 addr)
 	addr = addr & PAGE_MASK;
 
 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
-- 
2.13.6
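A minimal userspace sketch of the traversal contract this patch relies on; it is not kernel code. resource_sibling() and resource_first_child() mirror the helpers the patch adds to include/linux/ioport.h, which let callers that used to compare a raw pointer against NULL keep a NULL-terminated loop on top of a circular list_head. The cut-down list_head implementation, the trimmed struct resource, and the names resource_init(), resource_append_child(), count_children() and sibling_walk_demo() are stand-ins assumed for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert 'entry' just before 'head', i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed struct resource using the post-patch member layout. */
struct resource {
	unsigned long start, end;
	struct list_head child, sibling;
	struct resource *parent;
};

/* Hypothetical helper: put a resource into a detached, childless state. */
static void resource_init(struct resource *r, unsigned long start,
			  unsigned long end)
{
	r->start = start;
	r->end = end;
	r->parent = NULL;
	INIT_LIST_HEAD(&r->child);
	INIT_LIST_HEAD(&r->sibling);
}

/* Hypothetical helper: append 'r' as the last child of 'parent'. */
static void resource_append_child(struct resource *parent, struct resource *r)
{
	list_add_tail(&r->sibling, &parent->child);
	r->parent = parent;
}

/* Same logic as the patch's helper: NULL once the last sibling is reached. */
static struct resource *resource_sibling(struct resource *res)
{
	if (res->parent && res->sibling.next != &res->parent->child)
		return container_of(res->sibling.next, struct resource, sibling);
	return NULL;
}

/* Same contract as the patch's helper: first child, or NULL if none. */
static struct resource *resource_first_child(struct list_head *head)
{
	if (head->next == head)
		return NULL;
	return container_of(head->next, struct resource, sibling);
}

/* An old-style NULL-terminated walk, written against the new layout. */
static int count_children(struct resource *parent)
{
	struct resource *p;
	int n = 0;

	for (p = resource_first_child(&parent->child); p; p = resource_sibling(p))
		n++;
	return n;
}

/* Build root -> {a, b} and check that the walk terminates with NULL. */
static void sibling_walk_demo(void)
{
	struct resource root, a, b;

	resource_init(&root, 0x0, 0xffff);
	resource_init(&a, 0x10, 0x1f);
	resource_init(&b, 0x20, 0x2f);
	assert(resource_first_child(&root.child) == NULL);

	resource_append_child(&root, &a);
	resource_append_child(&root, &b);
	assert(resource_first_child(&root.child) == &a);
	assert(resource_sibling(&a) == &b);
	assert(resource_sibling(&b) == NULL);
	assert(count_children(&root) == 2);
}
```

With root holding the children a and b, resource_sibling(&a) yields &b and resource_sibling(&b) yields NULL. That is the property that lets loops of the form "this = this->sibling" be converted to "this = resource_sibling(this)" without changing their exit condition.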
* [PATCH v7 2/4] resource: Use list_head to link sibling resource
@ 2018-07-18 2:49 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw)
To: linux-kernel, akpm, robh+dt, dan.j.williams, nicolas.pitre, josh,
fengguang.wu, bp, andy.shevchenko
Cc: linux-mips, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, jcmvbkbc, Paul Mackerras, baiyaowei, kys,
frowand.list, lorenzo.pieralisi, sthemmin, Baoquan He, linux-nvdimm,
Michael Ellerman, patrik.r.jakobsson, linux-input, gustavo, dyoung,
thomas.lendacky, haiyangz, maarten.lankhorst, jglisse, seanpaul,
bhelgaas, tglx, yinghai, jonathan.derrick, chris, monstr,
linux-parisc, gregkh, dmitry.torokhov, Benjamin Herrenschmidt,
ebiederm, devel, linuxppc-dev, davem

struct resource uses a singly linked list to link siblings, implemented
with open-coded pointer operations. Replace it with list_head for better
code readability.

With list_head in place, it becomes easy to iterate over
iomem_resource's sibling list in reverse in a later patch.

In addition, the types of struct resource's sibling and child members
change from 'struct resource *' to 'struct list_head'. This grows
struct resource by two pointers.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jonathan Derrick <jonathan.derrick@intel.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: devel@linuxdriverproject.org
Cc: linux-input@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Cc: devicetree@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-mips@linux-mips.org
---
 arch/arm/plat-samsung/pm-check.c            |   6 +-
 arch/ia64/sn/kernel/io_init.c               |   2 +-
 arch/microblaze/pci/pci-common.c            |   4 +-
 arch/mips/pci/pci-rc32434.c                 |  12 +-
 arch/powerpc/kernel/pci-common.c            |   4 +-
 arch/sparc/kernel/ioport.c                  |   2 +-
 arch/xtensa/include/asm/pci-bridge.h        |   4 +-
 drivers/eisa/eisa-bus.c                     |   2 +
 drivers/gpu/drm/drm_memory.c                |   3 +-
 drivers/gpu/drm/gma500/gtt.c                |   5 +-
 drivers/hv/vmbus_drv.c                      |  52 +++----
 drivers/input/joystick/iforce/iforce-main.c |   4 +-
 drivers/nvdimm/namespace_devs.c             |   6 +-
 drivers/nvdimm/nd.h                         |   5 +-
 drivers/of/address.c                        |   4 +-
 drivers/parisc/lba_pci.c                    |   4 +-
 drivers/pci/controller/vmd.c                |   8 +-
 drivers/pci/probe.c                         |   2 +
 drivers/pci/setup-bus.c                     |   2 +-
 include/linux/ioport.h                      |  17 ++-
 kernel/resource.c                           | 206 ++++++++++++++--------------
 21 files changed, 183 insertions(+), 171 deletions(-)

diff --git a/arch/arm/plat-samsung/pm-check.c b/arch/arm/plat-samsung/pm-check.c
index cd2c02c68bc3..5494355b1c49 100644
--- a/arch/arm/plat-samsung/pm-check.c
+++ b/arch/arm/plat-samsung/pm-check.c
@@ -46,8 +46,8 @@ typedef u32 *(run_fn_t)(struct resource *ptr, u32 *arg);
 static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg)
 {
 	while (ptr != NULL) {
-		if (ptr->child != NULL)
-			s3c_pm_run_res(ptr->child, fn, arg);
+		if (!list_empty(&ptr->child))
+			s3c_pm_run_res(resource_first_child(&ptr->child), fn, arg);
 
 		if ((ptr->flags & IORESOURCE_SYSTEM_RAM)
 				== IORESOURCE_SYSTEM_RAM) {
@@ -57,7 +57,7 @@ static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg)
 			arg = (fn)(ptr, arg);
 		}
 
-		ptr = ptr->sibling;
+		ptr = resource_sibling(ptr);
 	}
 }
 
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index d63809a6adfa..338a7b7f194d 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -192,7 +192,7 @@ sn_io_slot_fixup(struct pci_dev *dev)
 		 * if it's already in the device structure, remove it before
 		 * inserting
 		 */
-		if (res->parent && res->parent->child)
+		if (res->parent && !list_empty(&res->parent->child))
 			release_resource(res);
 
 		if (res->flags & IORESOURCE_IO)
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 7899bafab064..2bf73e27e231 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -533,7 +533,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
 			res->flags = range.flags;
 			res->start = range.cpu_addr;
 			res->end = range.cpu_addr + range.size - 1;
-			res->parent = res->child = res->sibling = NULL;
+			res->parent = NULL;
+			INIT_LIST_HEAD(&res->child);
+			INIT_LIST_HEAD(&res->sibling);
 		}
 	}
 
diff --git a/arch/mips/pci/pci-rc32434.c b/arch/mips/pci/pci-rc32434.c
index 7f6ce6d734c0..e80283df7925 100644
--- a/arch/mips/pci/pci-rc32434.c
+++ b/arch/mips/pci/pci-rc32434.c
@@ -53,8 +53,8 @@ static struct resource rc32434_res_pci_mem1 = {
 	.start = 0x50000000,
 	.end = 0x5FFFFFFF,
 	.flags = IORESOURCE_MEM,
-	.sibling = NULL,
-	.child = &rc32434_res_pci_mem2
+	.sibling = LIST_HEAD_INIT(rc32434_res_pci_mem1.sibling),
+	.child = LIST_HEAD_INIT(rc32434_res_pci_mem1.child),
 };
 
 static struct resource rc32434_res_pci_mem2 = {
@@ -63,8 +63,8 @@ static struct resource rc32434_res_pci_mem2 = {
 	.end = 0x6FFFFFFF,
 	.flags = IORESOURCE_MEM,
 	.parent = &rc32434_res_pci_mem1,
-	.sibling = NULL,
-	.child = NULL
+	.sibling = LIST_HEAD_INIT(rc32434_res_pci_mem2.sibling),
+	.child = LIST_HEAD_INIT(rc32434_res_pci_mem2.child),
 };
 
 static struct resource rc32434_res_pci_io1 = {
@@ -72,6 +72,8 @@ static struct resource rc32434_res_pci_io1 = {
 	.start = 0x18800000,
 	.end = 0x188FFFFF,
 	.flags = IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(rc32434_res_pci_io1.sibling),
+	.child = LIST_HEAD_INIT(rc32434_res_pci_io1.child),
 };
 
 extern struct pci_ops rc32434_pci_ops;
@@ -208,6 +210,8 @@ static int __init rc32434_pci_init(void)
 
 	pr_info("PCI: Initializing PCI\n");
 
+	list_add(&rc32434_res_pci_mem2.sibling, &rc32434_res_pci_mem1.child);
+
 	ioport_resource.start = rc32434_res_pci_io1.start;
 	ioport_resource.end = rc32434_res_pci_io1.end;
 
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 926035bb378d..28fbe83c9daf 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -761,7 +761,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
 			res->flags = range.flags;
 			res->start = range.cpu_addr;
 			res->end = range.cpu_addr + range.size - 1;
-			res->parent = res->child = res->sibling = NULL;
+			res->parent = NULL;
+			INIT_LIST_HEAD(&res->child);
+			INIT_LIST_HEAD(&res->sibling);
 		}
 	}
 }
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index cca9134cfa7d..99efe4e98b16 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -669,7 +669,7 @@ static int sparc_io_proc_show(struct seq_file *m, void *v)
 	struct resource *root = m->private, *r;
 	const char *nm;
 
-	for (r = root->child; r != NULL; r = r->sibling) {
+	list_for_each_entry(r, &root->child, sibling) {
 		if ((nm = r->name) == NULL) nm = "???";
 		seq_printf(m, "%016llx-%016llx: %s\n",
 				(unsigned long long)r->start,
diff --git a/arch/xtensa/include/asm/pci-bridge.h b/arch/xtensa/include/asm/pci-bridge.h
index 0b68c76ec1e6..f487b06817df 100644
--- a/arch/xtensa/include/asm/pci-bridge.h
+++ b/arch/xtensa/include/asm/pci-bridge.h
@@ -71,8 +71,8 @@ static inline void pcibios_init_resource(struct resource *res,
 	res->flags = flags;
 	res->name = name;
 	res->parent = NULL;
-	res->sibling = NULL;
-	res->child = NULL;
+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 }
 
 
diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c
index 1e8062f6dbfc..dba78f75fd06 100644
--- a/drivers/eisa/eisa-bus.c
+++ b/drivers/eisa/eisa-bus.c
@@ -408,6 +408,8 @@ static struct resource eisa_root_res = {
 	.start = 0,
 	.end = 0xffffffff,
 	.flags = IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(eisa_root_res.sibling),
+	.child = LIST_HEAD_INIT(eisa_root_res.child),
 };
 
 static int eisa_bus_count;
diff --git a/drivers/gpu/drm/drm_memory.c b/drivers/gpu/drm/drm_memory.c
index d69e4fc1ee77..33baa7fa5e41 100644
--- a/drivers/gpu/drm/drm_memory.c
+++ b/drivers/gpu/drm/drm_memory.c
@@ -155,9 +155,8 @@ u64 drm_get_max_iomem(void)
 	struct resource *tmp;
 	resource_size_t max_iomem = 0;
 
-	for (tmp = iomem_resource.child; tmp; tmp = tmp->sibling) {
+	list_for_each_entry(tmp, &iomem_resource.child, sibling)
 		max_iomem = max(max_iomem, tmp->end);
-	}
 
 	return max_iomem;
 }
diff --git a/drivers/gpu/drm/gma500/gtt.c b/drivers/gpu/drm/gma500/gtt.c
index 3949b0990916..addd3bc009af 100644
--- a/drivers/gpu/drm/gma500/gtt.c
+++ b/drivers/gpu/drm/gma500/gtt.c
@@ -565,7 +565,7 @@ int psb_gtt_init(struct drm_device *dev, int resume)
 int psb_gtt_restore(struct drm_device *dev)
 {
 	struct drm_psb_private *dev_priv = dev->dev_private;
-	struct resource *r = dev_priv->gtt_mem->child;
+	struct resource *r;
 	struct gtt_range *range;
 	unsigned int restored = 0, total = 0, size = 0;
 
@@ -573,14 +573,13 @@ int psb_gtt_restore(struct drm_device *dev)
 	mutex_lock(&dev_priv->gtt_mutex);
 	psb_gtt_init(dev, 1);
 
-	while (r != NULL) {
+	list_for_each_entry(r, &dev_priv->gtt_mem->child, sibling) {
 		range = container_of(r, struct gtt_range, resource);
 		if (range->pages) {
 			psb_gtt_insert(dev, range, 1);
 			size += range->resource.end - range->resource.start;
 			restored++;
 		}
-		r = r->sibling;
 		total++;
 	}
 	mutex_unlock(&dev_priv->gtt_mutex);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b10fe26c4891..d87ec5a1bc4c 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1412,9 +1412,8 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx)
 {
 	resource_size_t start = 0;
 	resource_size_t end = 0;
-	struct resource *new_res;
+	struct resource *new_res, *tmp;
 	struct resource **old_res = &hyperv_mmio;
-	struct resource **prev_res = NULL;
 
 	switch (res->type) {
 
@@ -1461,44 +1460,36 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx)
 	/*
 	 * If two ranges are adjacent, merge them.
 	 */
-	do {
-		if (!*old_res) {
-			*old_res = new_res;
-			break;
-		}
-
-		if (((*old_res)->end + 1) == new_res->start) {
-			(*old_res)->end = new_res->end;
+	if (!*old_res) {
+		*old_res = new_res;
+		return AE_OK;
+	}
+	tmp = *old_res;
+	list_for_each_entry_from(tmp, &tmp->parent->child, sibling) {
+		if ((tmp->end + 1) == new_res->start) {
+			tmp->end = new_res->end;
 			kfree(new_res);
 			break;
 		}
 
-		if ((*old_res)->start == new_res->end + 1) {
-			(*old_res)->start = new_res->start;
+		if (tmp->start == new_res->end + 1) {
+			tmp->start = new_res->start;
 			kfree(new_res);
 			break;
 		}
 
-		if ((*old_res)->start > new_res->end) {
-			new_res->sibling = *old_res;
-			if (prev_res)
-				(*prev_res)->sibling = new_res;
-			*old_res = new_res;
+		if (tmp->start > new_res->end) {
+			list_add(&new_res->sibling, tmp->sibling.prev);
 			break;
 		}
-
-		prev_res = old_res;
-		old_res = &(*old_res)->sibling;
-
-	} while (1);
+	}
 
 	return AE_OK;
 }
 
 static int vmbus_acpi_remove(struct acpi_device *device)
 {
-	struct resource *cur_res;
-	struct resource *next_res;
+	struct resource *res;
 
 	if (hyperv_mmio) {
 		if (fb_mmio) {
@@ -1507,10 +1498,9 @@ static int vmbus_acpi_remove(struct acpi_device *device)
 			fb_mmio = NULL;
 		}
 
-		for (cur_res = hyperv_mmio; cur_res; cur_res = next_res) {
-			next_res = cur_res->sibling;
-			kfree(cur_res);
-		}
+		res = hyperv_mmio;
+		list_for_each_entry_from(res, &res->parent->child, sibling)
+			kfree(res);
 	}
 
 	return 0;
@@ -1596,7 +1586,8 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
 		}
 	}
 
-	for (iter = hyperv_mmio; iter; iter = iter->sibling) {
+	iter = hyperv_mmio;
+	list_for_each_entry_from(iter, &iter->parent->child, sibling) {
 		if ((iter->start >= max) || (iter->end <= min))
 			continue;
 
@@ -1639,7 +1630,8 @@ void vmbus_free_mmio(resource_size_t start, resource_size_t size)
 	struct resource *iter;
 
 	down(&hyperv_mmio_lock);
-	for (iter = hyperv_mmio; iter; iter = iter->sibling) {
+	iter = hyperv_mmio;
+	list_for_each_entry_from(iter, &iter->parent->child, sibling) {
 		if ((iter->start >= start + size) || (iter->end <= start))
 			continue;
 
diff --git a/drivers/input/joystick/iforce/iforce-main.c b/drivers/input/joystick/iforce/iforce-main.c
index daeeb4c7e3b0..5c0be27b33ff 100644
--- a/drivers/input/joystick/iforce/iforce-main.c
+++ b/drivers/input/joystick/iforce/iforce-main.c
@@ -305,8 +305,8 @@ int iforce_init_device(struct iforce *iforce)
 	iforce->device_memory.end = 200;
 	iforce->device_memory.flags = IORESOURCE_MEM;
 	iforce->device_memory.parent = NULL;
-	iforce->device_memory.child = NULL;
-	iforce->device_memory.sibling = NULL;
+	INIT_LIST_HEAD(&iforce->device_memory.child);
+	INIT_LIST_HEAD(&iforce->device_memory.sibling);
 
 /*
  * Wait until device ready - until it sends its first response.
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 28afdd668905..f53d410d9981 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -637,7 +637,7 @@ static resource_size_t scan_allocate(struct nd_region *nd_region,
 retry:
 	first = 0;
 	for_each_dpa_resource(ndd, res) {
-		struct resource *next = res->sibling, *new_res = NULL;
+		struct resource *next = resource_sibling(res), *new_res = NULL;
 		resource_size_t allocate, available = 0;
 		enum alloc_loc loc = ALLOC_ERR;
 		const char *action;
@@ -763,7 +763,7 @@ static resource_size_t scan_allocate(struct nd_region *nd_region,
 	 * an initial "pmem-reserve pass". Only do an initial BLK allocation
 	 * when none of the DPA space is reserved.
 	 */
-	if ((is_pmem || !ndd->dpa.child) && n == to_allocate)
+	if ((is_pmem || list_empty(&ndd->dpa.child)) && n == to_allocate)
 		return init_dpa_allocation(label_id, nd_region, nd_mapping, n);
 	return n;
 }
@@ -779,7 +779,7 @@ static int merge_dpa(struct nd_region *nd_region,
 retry:
 	for_each_dpa_resource(ndd, res) {
 		int rc;
-		struct resource *next = res->sibling;
+		struct resource *next = resource_sibling(res);
 		resource_size_t end = res->start + resource_size(res);
 
 		if (!next || strcmp(res->name, label_id->id) != 0
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 32e0364b48b9..da7da15e03e7 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -102,11 +102,10 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
 		(unsigned long long) (res ? res->start : 0), ##arg)
 
 #define for_each_dpa_resource(ndd, res) \
-	for (res = (ndd)->dpa.child; res; res = res->sibling)
+	list_for_each_entry(res, &(ndd)->dpa.child, sibling)
 
 #define for_each_dpa_resource_safe(ndd, res, next) \
-	for (res = (ndd)->dpa.child, next = res ? res->sibling : NULL; \
-			res; res = next, next = next ? next->sibling : NULL)
+	list_for_each_entry_safe(res, next, &(ndd)->dpa.child, sibling)
 
 struct nd_percpu_lane {
 	int count;
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 53349912ac75..e2e25719ab52 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -330,7 +330,9 @@ int of_pci_range_to_resource(struct of_pci_range *range,
 {
 	int err;
 	res->flags = range->flags;
-	res->parent = res->child = res->sibling = NULL;
+	res->parent = NULL;
+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 	res->name = np->full_name;
 
 	if (res->flags & IORESOURCE_IO) {
diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c
index 69bd98421eb1..7482bdfd1959 100644
--- a/drivers/parisc/lba_pci.c
+++ b/drivers/parisc/lba_pci.c
@@ -170,8 +170,8 @@ lba_dump_res(struct resource *r, int d)
 	for (i = d; i ; --i) printk(" ");
 	printk(KERN_DEBUG "%p [%lx,%lx]/%lx\n", r,
 		(long)r->start, (long)r->end, r->flags);
-	lba_dump_res(r->child, d+2);
-	lba_dump_res(r->sibling, d);
+	lba_dump_res(resource_first_child(&r->child), d+2);
+	lba_dump_res(resource_sibling(r), d);
 }
 
 
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 942b64fc7f1f..e3ace20345c7 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -542,14 +542,14 @@ static struct pci_ops vmd_ops = {
 
 static void vmd_attach_resources(struct vmd_dev *vmd)
 {
-	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
-	vmd->dev->resource[VMD_MEMBAR2].child = &vmd->resources[2];
+	list_add(&vmd->resources[1].sibling, &vmd->dev->resource[VMD_MEMBAR1].child);
+	list_add(&vmd->resources[2].sibling, &vmd->dev->resource[VMD_MEMBAR2].child);
 }
 
 static void vmd_detach_resources(struct vmd_dev *vmd)
 {
-	vmd->dev->resource[VMD_MEMBAR1].child = NULL;
-	vmd->dev->resource[VMD_MEMBAR2].child = NULL;
+	INIT_LIST_HEAD(&vmd->dev->resource[VMD_MEMBAR1].child);
+	INIT_LIST_HEAD(&vmd->dev->resource[VMD_MEMBAR2].child);
 }
 
 /*
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac876e32de4b..9624dd1dfd49 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -59,6 +59,8 @@ static struct resource *get_pci_domain_busn_res(int domain_nr)
 	r->res.start = 0;
 	r->res.end = 0xff;
 	r->res.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED;
+	INIT_LIST_HEAD(&r->res.child);
+	INIT_LIST_HEAD(&r->res.sibling);
 
 	list_add_tail(&r->list, &pci_domain_busn_res_list);
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 79b1824e83b4..8e685af8938d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2107,7 +2107,7 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
 			continue;
 
 		/* Ignore BARs which are still in use */
-		if (res->child)
+		if (!list_empty(&res->child))
 			continue;
 
 		ret = add_to_list(&saved, bridge, res, 0, 0);
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index dfdcd0bfe54e..b7456ae889dd 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -12,6 +12,7 @@
 #ifndef __ASSEMBLY__
 #include <linux/compiler.h>
 #include <linux/types.h>
+#include <linux/list.h>
 /*
  * Resources are tree-like, allowing
  * nesting etc..
@@ -22,7 +23,8 @@ struct resource {
 	const char *name;
 	unsigned long flags;
 	unsigned long desc;
-	struct resource *parent, *sibling, *child;
+	struct list_head child, sibling;
+	struct resource *parent;
 };
 
 /*
@@ -216,7 +218,6 @@ static inline bool resource_contains(struct resource *r1, struct resource *r2)
 	return r1->start <= r2->start && r1->end >= r2->end;
 }
 
-
 /* Convenience shorthand with allocation */
 #define request_region(start,n,name) __request_region(&ioport_resource, (start), (n), (name), 0)
 #define request_muxed_region(start,n,name) __request_region(&ioport_resource, (start), (n), (name), IORESOURCE_MUXED)
@@ -287,6 +288,18 @@ static inline bool resource_overlaps(struct resource *r1, struct resource *r2)
 	return (r1->start <= r2->end && r1->end >= r2->start);
 }
 
+static inline struct resource *resource_sibling(struct resource *res)
+{
+	if (res->parent && !list_is_last(&res->sibling, &res->parent->child))
+		return list_next_entry(res, sibling);
+	return NULL;
+}
+
+static inline struct resource *resource_first_child(struct list_head *head)
+{
+	return list_first_entry_or_null(head, struct resource, sibling);
+}
+
 
 #endif /* __ASSEMBLY__ */
 #endif /* _LINUX_IOPORT_H */
diff --git a/kernel/resource.c b/kernel/resource.c
index 81ccd19c1d9f..c96e58d3d2f8 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -31,6 +31,8 @@ struct resource ioport_resource = {
 	.start = 0,
 	.end = IO_SPACE_LIMIT,
 	.flags = IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(ioport_resource.sibling),
+	.child = LIST_HEAD_INIT(ioport_resource.child),
 };
 EXPORT_SYMBOL(ioport_resource);
 
@@ -39,6 +41,8 @@ struct resource iomem_resource = {
 	.start = 0,
 	.end = -1,
 	.flags = IORESOURCE_MEM,
+	.sibling = LIST_HEAD_INIT(iomem_resource.sibling),
+	.child = LIST_HEAD_INIT(iomem_resource.child),
 };
 EXPORT_SYMBOL(iomem_resource);
 
@@ -57,20 +61,20 @@ static DEFINE_RWLOCK(resource_lock);
  * by boot mem after the system is up. So for reusing the resource entry
  * we need to remember the resource.
  */
-static struct resource *bootmem_resource_free;
+static struct list_head bootmem_resource_free = LIST_HEAD_INIT(bootmem_resource_free);
 static DEFINE_SPINLOCK(bootmem_resource_lock);
 
 static struct resource *next_resource(struct resource *p, bool sibling_only)
 {
 	/* Caller wants to traverse through siblings only */
 	if (sibling_only)
-		return p->sibling;
+		return resource_sibling(p);
 
-	if (p->child)
-		return p->child;
-	while (!p->sibling && p->parent)
+	if (!list_empty(&p->child))
+		return resource_first_child(&p->child);
+	while (!resource_sibling(p) && p->parent)
 		p = p->parent;
-	return p->sibling;
+	return resource_sibling(p);
 }
 
 static void *r_next(struct seq_file *m, void *v, loff_t *pos)
@@ -90,7 +94,7 @@ static void *r_start(struct seq_file *m, loff_t *pos)
 	struct resource *p = PDE_DATA(file_inode(m->file));
 	loff_t l = 0;
 	read_lock(&resource_lock);
-	for (p = p->child; p && l < *pos; p = r_next(m, p, &l))
+	for (p = resource_first_child(&p->child); p && l < *pos; p = r_next(m, p, &l))
 		;
 	return p;
 }
@@ -153,8 +157,7 @@ static void free_resource(struct resource *res)
 
 	if (!PageSlab(virt_to_head_page(res))) {
 		spin_lock(&bootmem_resource_lock);
-		res->sibling = bootmem_resource_free;
-		bootmem_resource_free = res;
+		list_add(&res->sibling, &bootmem_resource_free);
 		spin_unlock(&bootmem_resource_lock);
 	} else {
 		kfree(res);
@@ -166,10 +169,9 @@ static struct resource *alloc_resource(gfp_t flags)
 	struct resource *res = NULL;
 
 	spin_lock(&bootmem_resource_lock);
-	if (bootmem_resource_free) {
-		res = bootmem_resource_free;
-		bootmem_resource_free = res->sibling;
-	}
+	res = resource_first_child(&bootmem_resource_free);
+	if (res)
+		list_del(&res->sibling);
 	spin_unlock(&bootmem_resource_lock);
 
 	if (res)
@@ -177,6 +179,8 @@ static struct resource *alloc_resource(gfp_t flags)
 	else
 		res = kzalloc(sizeof(struct resource), flags);
 
+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 	return res;
 }
 
@@ -185,7 +189,7 @@ static struct resource * __request_resource(struct
resource *root, struct resour { resource_size_t start = new->start; resource_size_t end = new->end; - struct resource *tmp, **p; + struct resource *tmp; if (end < start) return root; @@ -193,64 +197,62 @@ static struct resource * __request_resource(struct resource *root, struct resour return root; if (end > root->end) return root; - p = &root->child; - for (;;) { - tmp = *p; - if (!tmp || tmp->start > end) { - new->sibling = tmp; - *p = new; + + if (list_empty(&root->child)) { + list_add(&new->sibling, &root->child); + new->parent = root; + INIT_LIST_HEAD(&new->child); + return NULL; + } + + list_for_each_entry(tmp, &root->child, sibling) { + if (tmp->start > end) { + list_add(&new->sibling, tmp->sibling.prev); new->parent = root; + INIT_LIST_HEAD(&new->child); return NULL; } - p = &tmp->sibling; if (tmp->end < start) continue; return tmp; } + + list_add_tail(&new->sibling, &root->child); + new->parent = root; + INIT_LIST_HEAD(&new->child); + return NULL; } static int __release_resource(struct resource *old, bool release_child) { - struct resource *tmp, **p, *chd; + struct resource *tmp, *next, *chd; - p = &old->parent->child; - for (;;) { - tmp = *p; - if (!tmp) - break; + list_for_each_entry_safe(tmp, next, &old->parent->child, sibling) { if (tmp == old) { - if (release_child || !(tmp->child)) { - *p = tmp->sibling; + if (release_child || list_empty(&tmp->child)) { + list_del(&tmp->sibling); } else { - for (chd = tmp->child;; chd = chd->sibling) { + list_for_each_entry(chd, &tmp->child, sibling) chd->parent = tmp->parent; - if (!(chd->sibling)) - break; - } - *p = tmp->child; - chd->sibling = tmp->sibling; + list_splice(&tmp->child, tmp->sibling.prev); + list_del(&tmp->sibling); } + old->parent = NULL; return 0; } - p = &tmp->sibling; } return -EINVAL; } static void __release_child_resources(struct resource *r) { - struct resource *tmp, *p; + struct resource *tmp, *next; resource_size_t size; - p = r->child; - r->child = NULL; - while (p) { - tmp = p; - p = 
p->sibling; - + list_for_each_entry_safe(tmp, next, &r->child, sibling) { tmp->parent = NULL; - tmp->sibling = NULL; + list_del_init(&tmp->sibling); __release_child_resources(tmp); printk(KERN_DEBUG "release child resource %pR\n", tmp); @@ -259,6 +261,8 @@ static void __release_child_resources(struct resource *r) tmp->start = 0; tmp->end = size - 1; } + + INIT_LIST_HEAD(&tmp->child); } void release_child_resources(struct resource *r) @@ -343,7 +347,8 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc, read_lock(&resource_lock); - for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) { + for (p = resource_first_child(&iomem_resource.child); p; + p = next_resource(p, sibling_only)) { if ((p->flags & res->flags) != res->flags) continue; if ((desc != IORES_DESC_NONE) && (desc != p->desc)) @@ -532,7 +537,7 @@ int region_intersects(resource_size_t start, size_t size, unsigned long flags, struct resource *p; read_lock(&resource_lock); - for (p = iomem_resource.child; p ; p = p->sibling) { + list_for_each_entry(p, &iomem_resource.child, sibling) { bool is_type = (((p->flags & flags) == flags) && ((desc == IORES_DESC_NONE) || (desc == p->desc))); @@ -586,7 +591,7 @@ static int __find_resource(struct resource *root, struct resource *old, resource_size_t size, struct resource_constraint *constraint) { - struct resource *this = root->child; + struct resource *this = resource_first_child(&root->child); struct resource tmp = *new, avail, alloc; tmp.start = root->start; @@ -596,7 +601,7 @@ static int __find_resource(struct resource *root, struct resource *old, */ if (this && this->start == root->start) { tmp.start = (this == old) ? 
old->start : this->end + 1; - this = this->sibling; + this = resource_sibling(this); } for(;;) { if (this) @@ -632,7 +637,7 @@ next: if (!this || this->end == root->end) if (this != old) tmp.start = this->end + 1; - this = this->sibling; + this = resource_sibling(this); } return -EBUSY; } @@ -676,7 +681,7 @@ static int reallocate_resource(struct resource *root, struct resource *old, goto out; } - if (old->child) { + if (!list_empty(&old->child)) { err = -EBUSY; goto out; } @@ -757,7 +762,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start) struct resource *res; read_lock(&resource_lock); - for (res = root->child; res; res = res->sibling) { + list_for_each_entry(res, &root->child, sibling) { if (res->start == start) break; } @@ -790,32 +795,27 @@ static struct resource * __insert_resource(struct resource *parent, struct resou break; } - for (next = first; ; next = next->sibling) { + for (next = first; ; next = resource_sibling(next)) { /* Partial overlap? Bad, and unfixable */ if (next->start < new->start || next->end > new->end) return next; - if (!next->sibling) + if (!resource_sibling(next)) break; - if (next->sibling->start > new->end) + if (resource_sibling(next)->start > new->end) break; } - new->parent = parent; - new->sibling = next->sibling; - new->child = first; + list_add(&new->sibling, &next->sibling); + INIT_LIST_HEAD(&new->child); - next->sibling = NULL; - for (next = first; next; next = next->sibling) + /* + * From first to next, they all fall into new's region, so change them + * as new's children. 
+ */ + list_cut_position(&new->child, first->sibling.prev, &next->sibling); + list_for_each_entry(next, &new->child, sibling) next->parent = new; - if (parent->child == first) { - parent->child = new; - } else { - next = parent->child; - while (next->sibling != first) - next = next->sibling; - next->sibling = new; - } return NULL; } @@ -937,19 +937,17 @@ static int __adjust_resource(struct resource *res, resource_size_t start, if ((start < parent->start) || (end > parent->end)) goto out; - if (res->sibling && (res->sibling->start <= end)) + if (resource_sibling(res) && (resource_sibling(res)->start <= end)) goto out; - tmp = parent->child; - if (tmp != res) { - while (tmp->sibling != res) - tmp = tmp->sibling; + if (res->sibling.prev != &parent->child) { + tmp = list_prev_entry(res, sibling); if (start <= tmp->end) goto out; } skip: - for (tmp = res->child; tmp; tmp = tmp->sibling) + list_for_each_entry(tmp, &res->child, sibling) if ((tmp->start < start) || (tmp->end > end)) goto out; @@ -996,27 +994,30 @@ EXPORT_SYMBOL(adjust_resource); */ int reparent_resources(struct resource *parent, struct resource *res) { - struct resource *p, **pp; - struct resource **firstpp = NULL; + struct resource *p, *first = NULL; - for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) { + list_for_each_entry(p, &parent->child, sibling) { if (p->end < res->start) continue; if (res->end < p->start) break; if (p->start < res->start || p->end > res->end) return -ENOTSUPP; /* not completely contained */ - if (firstpp == NULL) - firstpp = pp; + if (first == NULL) + first = p; } - if (firstpp == NULL) + if (first == NULL) return -ECANCELED; /* didn't find any conflicting entries? 
*/ res->parent = parent; - res->child = *firstpp; - res->sibling = *pp; - *firstpp = res; - *pp = NULL; - for (p = res->child; p != NULL; p = p->sibling) { + list_add(&res->sibling, p->sibling.prev); + INIT_LIST_HEAD(&res->child); + + /* + * From first to p's previous sibling, they all fall into + * res's region, change them as res's children. + */ + list_cut_position(&res->child, first->sibling.prev, res->sibling.prev); + list_for_each_entry(p, &res->child, sibling) { p->parent = res; pr_debug("PCI: Reparented %s %pR under %s\n", p->name, p, res->name); @@ -1216,34 +1217,32 @@ EXPORT_SYMBOL(__request_region); void __release_region(struct resource *parent, resource_size_t start, resource_size_t n) { - struct resource **p; + struct resource *res; resource_size_t end; - p = &parent->child; + res = resource_first_child(&parent->child); end = start + n - 1; write_lock(&resource_lock); for (;;) { - struct resource *res = *p; - if (!res) break; if (res->start <= start && res->end >= end) { if (!(res->flags & IORESOURCE_BUSY)) { - p = &res->child; + res = resource_first_child(&res->child); continue; } if (res->start != start || res->end != end) break; - *p = res->sibling; + list_del(&res->sibling); write_unlock(&resource_lock); if (res->flags & IORESOURCE_MUXED) wake_up(&muxed_resource_wait); free_resource(res); return; } - p = &res->sibling; + res = resource_sibling(res); } write_unlock(&resource_lock); @@ -1278,9 +1277,7 @@ EXPORT_SYMBOL(__release_region); int release_mem_region_adjustable(struct resource *parent, resource_size_t start, resource_size_t size) { - struct resource **p; - struct resource *res; - struct resource *new_res; + struct resource *res, *new_res; resource_size_t end; int ret = -EINVAL; @@ -1291,16 +1288,16 @@ int release_mem_region_adjustable(struct resource *parent, /* The alloc_resource() result gets checked later */ new_res = alloc_resource(GFP_KERNEL); - p = &parent->child; + res = resource_first_child(&parent->child); 
write_lock(&resource_lock); - while ((res = *p)) { + while ((res)) { if (res->start >= end) break; /* look for the next resource if it does not fit into */ if (res->start > start || res->end < end) { - p = &res->sibling; + res = resource_sibling(res); continue; } @@ -1308,14 +1305,14 @@ int release_mem_region_adjustable(struct resource *parent, break; if (!(res->flags & IORESOURCE_BUSY)) { - p = &res->child; + res = resource_first_child(&res->child); continue; } /* found the target resource; let's adjust accordingly */ if (res->start == start && res->end == end) { /* free the whole entry */ - *p = res->sibling; + list_del(&res->sibling); free_resource(res); ret = 0; } else if (res->start == start && res->end != end) { @@ -1338,14 +1335,13 @@ int release_mem_region_adjustable(struct resource *parent, new_res->flags = res->flags; new_res->desc = res->desc; new_res->parent = res->parent; - new_res->sibling = res->sibling; - new_res->child = NULL; + INIT_LIST_HEAD(&new_res->child); ret = __adjust_resource(res, res->start, start - res->start); if (ret) break; - res->sibling = new_res; + list_add(&new_res->sibling, &res->sibling); new_res = NULL; } @@ -1526,7 +1522,7 @@ static int __init reserve_setup(char *str) res->end = io_start + io_num - 1; res->flags |= IORESOURCE_BUSY; res->desc = IORES_DESC_NONE; - res->child = NULL; + INIT_LIST_HEAD(&res->child); if (request_resource(parent, res) == 0) reserved = x+1; } @@ -1546,7 +1542,7 @@ int iomem_map_sanity_check(resource_size_t addr, unsigned long size) loff_t l; read_lock(&resource_lock); - for (p = p->child; p ; p = r_next(NULL, p, &l)) { + for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) { /* * We can probably skip the resources without * IORESOURCE_IO attribute? 
@@ -1602,7 +1598,7 @@ bool iomem_is_exclusive(u64 addr)
 	addr = addr & PAGE_MASK;
 
 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
-- 
2.13.6
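For readers following the conversion, here is a minimal userspace sketch of the sibling/child scheme the patch above introduces. It is not the kernel code: struct list_head is re-declared locally, and init_list_head(), list_is_empty(), list_add_after(), resource_init() and request_resource_sketch() are hypothetical stand-ins (the kernel uses INIT_LIST_HEAD(), list_empty(), list_add() and __request_resource()). Only resource_sibling() and resource_first_child() are meant to mirror the helpers added to include/linux/ioport.h, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular, doubly linked list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

/* Insert n right after pos (same semantics as the kernel's list_add()). */
static void list_add_after(struct list_head *n, struct list_head *pos)
{
	n->next = pos->next;
	n->prev = pos;
	pos->next->prev = n;
	pos->next = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced struct resource: 'child' heads the list of children,
 * 'sibling' links this node into its parent's 'child' list. */
struct resource {
	unsigned long start, end;
	const char *name;
	struct list_head child, sibling;
	struct resource *parent;
};

/* Hypothetical helper: every resource needs both list heads initialised. */
static void resource_init(struct resource *r, const char *name,
			  unsigned long start, unsigned long end)
{
	r->name = name;
	r->start = start;
	r->end = end;
	r->parent = NULL;
	init_list_head(&r->child);
	init_list_head(&r->sibling);
}

/* First child of a 'child' list head, or NULL if the list is empty. */
static struct resource *resource_first_child(struct list_head *head)
{
	if (list_is_empty(head))
		return NULL;
	return container_of(head->next, struct resource, sibling);
}

/* Next sibling, or NULL for the last child and for parentless resources. */
static struct resource *resource_sibling(struct resource *res)
{
	if (res->parent && res->sibling.next != &res->parent->child)
		return container_of(res->sibling.next, struct resource, sibling);
	return NULL;
}

/* Sorted insert under root: returns NULL on success, the conflicting
 * child on overlap, or root itself if the range does not fit in root. */
static struct resource *request_resource_sketch(struct resource *root,
						struct resource *new)
{
	struct list_head *pos;

	assert(root && new);
	if (new->end < new->start || new->start < root->start ||
	    new->end > root->end)
		return root;

	for (pos = root->child.next; pos != &root->child; pos = pos->next) {
		struct resource *tmp = container_of(pos, struct resource, sibling);

		if (tmp->start > new->end)
			break;		/* new belongs right before tmp */
		if (tmp->end >= new->start)
			return tmp;	/* ranges overlap */
	}

	/* pos is the first child beyond new, or the list head itself when
	 * new goes last (or the list is empty); linking after pos->prev
	 * handles all three cases. */
	list_add_after(&new->sibling, pos->prev);
	new->parent = root;
	return NULL;
}
```

Unlike the old NULL-terminated chain, the last child's sibling.next points back at the parent's child head. That is why resource_sibling() needs the parent pointer to detect the end of the list, and why a resource with no parent always reports NULL here.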
* [PATCH v7 2/4] resource: Use list_head to link sibling resource
@ 2018-07-18  2:49 ` Baoquan He
  0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-18  2:49 UTC (permalink / raw)
To: linux-kernel, akpm, robh+dt, dan.j.williams, nicolas.pitre, josh,
	fengguang.wu, bp, andy.shevchenko
Cc: patrik.r.jakobsson, airlied, kys, haiyangz, sthemmin,
	dmitry.torokhov, frowand.list, keith.busch, jonathan.derrick,
	lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh, jglisse,
	thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel,
	linux-input, linux-nvdimm, devicetree, linux-pci, ebiederm,
	vgoyal, dyoung, yinghai, monstr, davem, chris, jcmvbkbc, gustavo,
	maarten.lankhorst, seanpaul, linux-parisc, linuxppc-dev,
	Baoquan He, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linux-mips

struct resource links its siblings with a singly linked list, implemented
by open-coded pointer manipulation. Replace it with list_head for better
code readability.

With list_head in place, it becomes very easy to iterate over
iomem_resource's sibling list in reverse in a later patch.

The type of the struct resource members sibling and child is changed
from 'struct resource *' to 'struct list_head'. This grows the
structure by two pointers.

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jonathan Derrick <jonathan.derrick@intel.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: devel@linuxdriverproject.org
Cc: linux-input@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Cc: devicetree@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-mips@linux-mips.org
---
 arch/arm/plat-samsung/pm-check.c            |   6 +-
 arch/ia64/sn/kernel/io_init.c               |   2 +-
 arch/microblaze/pci/pci-common.c            |   4 +-
 arch/mips/pci/pci-rc32434.c                 |  12 +-
 arch/powerpc/kernel/pci-common.c            |   4 +-
 arch/sparc/kernel/ioport.c                  |   2 +-
 arch/xtensa/include/asm/pci-bridge.h        |   4 +-
 drivers/eisa/eisa-bus.c                     |   2 +
 drivers/gpu/drm/drm_memory.c                |   3 +-
 drivers/gpu/drm/gma500/gtt.c                |   5 +-
 drivers/hv/vmbus_drv.c                      |  52 +++----
 drivers/input/joystick/iforce/iforce-main.c |   4 +-
 drivers/nvdimm/namespace_devs.c             |   6 +-
 drivers/nvdimm/nd.h                         |   5 +-
 drivers/of/address.c                        |   4 +-
 drivers/parisc/lba_pci.c                    |   4 +-
 drivers/pci/controller/vmd.c                |   8 +-
 drivers/pci/probe.c                         |   2 +
 drivers/pci/setup-bus.c                     |   2 +-
 include/linux/ioport.h                      |  17 ++-
 kernel/resource.c                           | 206 ++++++++++++++--------------
 21 files changed, 183 insertions(+), 171 deletions(-)

diff
--git a/arch/arm/plat-samsung/pm-check.c b/arch/arm/plat-samsung/pm-check.c index cd2c02c68bc3..5494355b1c49 100644 --- a/arch/arm/plat-samsung/pm-check.c +++ b/arch/arm/plat-samsung/pm-check.c @@ -46,8 +46,8 @@ typedef u32 *(run_fn_t)(struct resource *ptr, u32 *arg); static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg) { while (ptr != NULL) { - if (ptr->child != NULL) - s3c_pm_run_res(ptr->child, fn, arg); + if (!list_empty(&ptr->child)) + s3c_pm_run_res(resource_first_child(&ptr->child), fn, arg); if ((ptr->flags & IORESOURCE_SYSTEM_RAM) == IORESOURCE_SYSTEM_RAM) { @@ -57,7 +57,7 @@ static void s3c_pm_run_res(struct resource *ptr, run_fn_t fn, u32 *arg) arg = (fn)(ptr, arg); } - ptr = ptr->sibling; + ptr = resource_sibling(ptr); } } diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c index d63809a6adfa..338a7b7f194d 100644 --- a/arch/ia64/sn/kernel/io_init.c +++ b/arch/ia64/sn/kernel/io_init.c @@ -192,7 +192,7 @@ sn_io_slot_fixup(struct pci_dev *dev) * if it's already in the device structure, remove it before * inserting */ - if (res->parent && res->parent->child) + if (res->parent && !list_empty(&res->parent->child)) release_resource(res); if (res->flags & IORESOURCE_IO) diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index 7899bafab064..2bf73e27e231 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -533,7 +533,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose, res->flags = range.flags; res->start = range.cpu_addr; res->end = range.cpu_addr + range.size - 1; - res->parent = res->child = res->sibling = NULL; + res->parent = NULL; + INIT_LIST_HEAD(&res->child); + INIT_LIST_HEAD(&res->sibling); } } diff --git a/arch/mips/pci/pci-rc32434.c b/arch/mips/pci/pci-rc32434.c index 7f6ce6d734c0..e80283df7925 100644 --- a/arch/mips/pci/pci-rc32434.c +++ b/arch/mips/pci/pci-rc32434.c @@ -53,8 +53,8 @@ static struct resource rc32434_res_pci_mem1 = 
{ .start = 0x50000000, .end = 0x5FFFFFFF, .flags = IORESOURCE_MEM, - .sibling = NULL, - .child = &rc32434_res_pci_mem2 + .sibling = LIST_HEAD_INIT(rc32434_res_pci_mem1.sibling), + .child = LIST_HEAD_INIT(rc32434_res_pci_mem1.child), }; static struct resource rc32434_res_pci_mem2 = { @@ -63,8 +63,8 @@ static struct resource rc32434_res_pci_mem2 = { .end = 0x6FFFFFFF, .flags = IORESOURCE_MEM, .parent = &rc32434_res_pci_mem1, - .sibling = NULL, - .child = NULL + .sibling = LIST_HEAD_INIT(rc32434_res_pci_mem2.sibling), + .child = LIST_HEAD_INIT(rc32434_res_pci_mem2.child), }; static struct resource rc32434_res_pci_io1 = { @@ -72,6 +72,8 @@ static struct resource rc32434_res_pci_io1 = { .start = 0x18800000, .end = 0x188FFFFF, .flags = IORESOURCE_IO, + .sibling = LIST_HEAD_INIT(rc32434_res_pci_io1.sibling), + .child = LIST_HEAD_INIT(rc32434_res_pci_io1.child), }; extern struct pci_ops rc32434_pci_ops; @@ -208,6 +210,8 @@ static int __init rc32434_pci_init(void) pr_info("PCI: Initializing PCI\n"); + list_add(&rc32434_res_pci_mem2.sibling, &rc32434_res_pci_mem1.child); + ioport_resource.start = rc32434_res_pci_io1.start; ioport_resource.end = rc32434_res_pci_io1.end; diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index 926035bb378d..28fbe83c9daf 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -761,7 +761,9 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose, res->flags = range.flags; res->start = range.cpu_addr; res->end = range.cpu_addr + range.size - 1; - res->parent = res->child = res->sibling = NULL; + res->parent = NULL; + INIT_LIST_HEAD(&res->child); + INIT_LIST_HEAD(&res->sibling); } } } diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c index cca9134cfa7d..99efe4e98b16 100644 --- a/arch/sparc/kernel/ioport.c +++ b/arch/sparc/kernel/ioport.c @@ -669,7 +669,7 @@ static int sparc_io_proc_show(struct seq_file *m, void *v) struct resource *root = m->private, *r; 
const char *nm; - for (r = root->child; r != NULL; r = r->sibling) { + list_for_each_entry(r, &root->child, sibling) { if ((nm = r->name) == NULL) nm = "???"; seq_printf(m, "%016llx-%016llx: %s\n", (unsigned long long)r->start, diff --git a/arch/xtensa/include/asm/pci-bridge.h b/arch/xtensa/include/asm/pci-bridge.h index 0b68c76ec1e6..f487b06817df 100644 --- a/arch/xtensa/include/asm/pci-bridge.h +++ b/arch/xtensa/include/asm/pci-bridge.h @@ -71,8 +71,8 @@ static inline void pcibios_init_resource(struct resource *res, res->flags = flags; res->name = name; res->parent = NULL; - res->sibling = NULL; - res->child = NULL; + INIT_LIST_HEAD(&res->child); + INIT_LIST_HEAD(&res->sibling); } diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c index 1e8062f6dbfc..dba78f75fd06 100644 --- a/drivers/eisa/eisa-bus.c +++ b/drivers/eisa/eisa-bus.c @@ -408,6 +408,8 @@ static struct resource eisa_root_res = { .start = 0, .end = 0xffffffff, .flags = IORESOURCE_IO, + .sibling = LIST_HEAD_INIT(eisa_root_res.sibling), + .child = LIST_HEAD_INIT(eisa_root_res.child), }; static int eisa_bus_count; diff --git a/drivers/gpu/drm/drm_memory.c b/drivers/gpu/drm/drm_memory.c index d69e4fc1ee77..33baa7fa5e41 100644 --- a/drivers/gpu/drm/drm_memory.c +++ b/drivers/gpu/drm/drm_memory.c @@ -155,9 +155,8 @@ u64 drm_get_max_iomem(void) struct resource *tmp; resource_size_t max_iomem = 0; - for (tmp = iomem_resource.child; tmp; tmp = tmp->sibling) { + list_for_each_entry(tmp, &iomem_resource.child, sibling) max_iomem = max(max_iomem, tmp->end); - } return max_iomem; } diff --git a/drivers/gpu/drm/gma500/gtt.c b/drivers/gpu/drm/gma500/gtt.c index 3949b0990916..addd3bc009af 100644 --- a/drivers/gpu/drm/gma500/gtt.c +++ b/drivers/gpu/drm/gma500/gtt.c @@ -565,7 +565,7 @@ int psb_gtt_init(struct drm_device *dev, int resume) int psb_gtt_restore(struct drm_device *dev) { struct drm_psb_private *dev_priv = dev->dev_private; - struct resource *r = dev_priv->gtt_mem->child; + struct resource *r; 
struct gtt_range *range; unsigned int restored = 0, total = 0, size = 0; @@ -573,14 +573,13 @@ int psb_gtt_restore(struct drm_device *dev) mutex_lock(&dev_priv->gtt_mutex); psb_gtt_init(dev, 1); - while (r != NULL) { + list_for_each_entry(r, &dev_priv->gtt_mem->child, sibling) { range = container_of(r, struct gtt_range, resource); if (range->pages) { psb_gtt_insert(dev, range, 1); size += range->resource.end - range->resource.start; restored++; } - r = r->sibling; total++; } mutex_unlock(&dev_priv->gtt_mutex); diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index b10fe26c4891..d87ec5a1bc4c 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1412,9 +1412,8 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) { resource_size_t start = 0; resource_size_t end = 0; - struct resource *new_res; + struct resource *new_res, *tmp; struct resource **old_res = &hyperv_mmio; - struct resource **prev_res = NULL; switch (res->type) { @@ -1461,44 +1460,36 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) /* * If two ranges are adjacent, merge them. 
*/ - do { - if (!*old_res) { - *old_res = new_res; - break; - } - - if (((*old_res)->end + 1) == new_res->start) { - (*old_res)->end = new_res->end; + if (!*old_res) { + *old_res = new_res; + return AE_OK; + } + tmp = *old_res; + list_for_each_entry_from(tmp, &tmp->parent->child, sibling) { + if ((tmp->end + 1) == new_res->start) { + tmp->end = new_res->end; kfree(new_res); break; } - if ((*old_res)->start == new_res->end + 1) { - (*old_res)->start = new_res->start; + if (tmp->start == new_res->end + 1) { + tmp->start = new_res->start; kfree(new_res); break; } - if ((*old_res)->start > new_res->end) { - new_res->sibling = *old_res; - if (prev_res) - (*prev_res)->sibling = new_res; - *old_res = new_res; + if (tmp->start > new_res->end) { + list_add(&new_res->sibling, tmp->sibling.prev); break; } - - prev_res = old_res; - old_res = &(*old_res)->sibling; - - } while (1); + } return AE_OK; } static int vmbus_acpi_remove(struct acpi_device *device) { - struct resource *cur_res; - struct resource *next_res; + struct resource *res; if (hyperv_mmio) { if (fb_mmio) { @@ -1507,10 +1498,9 @@ static int vmbus_acpi_remove(struct acpi_device *device) fb_mmio = NULL; } - for (cur_res = hyperv_mmio; cur_res; cur_res = next_res) { - next_res = cur_res->sibling; - kfree(cur_res); - } + res = hyperv_mmio; + list_for_each_entry_from(res, &res->parent->child, sibling) + kfree(res); } return 0; @@ -1596,7 +1586,8 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, } } - for (iter = hyperv_mmio; iter; iter = iter->sibling) { + iter = hyperv_mmio; + list_for_each_entry_from(iter, &iter->parent->child, sibling) { if ((iter->start >= max) || (iter->end <= min)) continue; @@ -1639,7 +1630,8 @@ void vmbus_free_mmio(resource_size_t start, resource_size_t size) struct resource *iter; down(&hyperv_mmio_lock); - for (iter = hyperv_mmio; iter; iter = iter->sibling) { + iter = hyperv_mmio; + list_for_each_entry_from(iter, &iter->parent->child, sibling) { if ((iter->start 
>= start + size) || (iter->end <= start)) continue; diff --git a/drivers/input/joystick/iforce/iforce-main.c b/drivers/input/joystick/iforce/iforce-main.c index daeeb4c7e3b0..5c0be27b33ff 100644 --- a/drivers/input/joystick/iforce/iforce-main.c +++ b/drivers/input/joystick/iforce/iforce-main.c @@ -305,8 +305,8 @@ int iforce_init_device(struct iforce *iforce) iforce->device_memory.end = 200; iforce->device_memory.flags = IORESOURCE_MEM; iforce->device_memory.parent = NULL; - iforce->device_memory.child = NULL; - iforce->device_memory.sibling = NULL; + INIT_LIST_HEAD(&iforce->device_memory.child); + INIT_LIST_HEAD(&iforce->device_memory.sibling); /* * Wait until device ready - until it sends its first response. diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c index 28afdd668905..f53d410d9981 100644 --- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -637,7 +637,7 @@ static resource_size_t scan_allocate(struct nd_region *nd_region, retry: first = 0; for_each_dpa_resource(ndd, res) { - struct resource *next = res->sibling, *new_res = NULL; + struct resource *next = resource_sibling(res), *new_res = NULL; resource_size_t allocate, available = 0; enum alloc_loc loc = ALLOC_ERR; const char *action; @@ -763,7 +763,7 @@ static resource_size_t scan_allocate(struct nd_region *nd_region, * an initial "pmem-reserve pass". Only do an initial BLK allocation * when none of the DPA space is reserved. 
*/ - if ((is_pmem || !ndd->dpa.child) && n == to_allocate) + if ((is_pmem || list_empty(&ndd->dpa.child)) && n == to_allocate) return init_dpa_allocation(label_id, nd_region, nd_mapping, n); return n; } @@ -779,7 +779,7 @@ static int merge_dpa(struct nd_region *nd_region, retry: for_each_dpa_resource(ndd, res) { int rc; - struct resource *next = res->sibling; + struct resource *next = resource_sibling(res); resource_size_t end = res->start + resource_size(res); if (!next || strcmp(res->name, label_id->id) != 0 diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h index 32e0364b48b9..da7da15e03e7 100644 --- a/drivers/nvdimm/nd.h +++ b/drivers/nvdimm/nd.h @@ -102,11 +102,10 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd); (unsigned long long) (res ? res->start : 0), ##arg) #define for_each_dpa_resource(ndd, res) \ - for (res = (ndd)->dpa.child; res; res = res->sibling) + list_for_each_entry(res, &(ndd)->dpa.child, sibling) #define for_each_dpa_resource_safe(ndd, res, next) \ - for (res = (ndd)->dpa.child, next = res ? res->sibling : NULL; \ - res; res = next, next = next ? 
next->sibling : NULL) + list_for_each_entry_safe(res, next, &(ndd)->dpa.child, sibling) struct nd_percpu_lane { int count; diff --git a/drivers/of/address.c b/drivers/of/address.c index 53349912ac75..e2e25719ab52 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -330,7 +330,9 @@ int of_pci_range_to_resource(struct of_pci_range *range, { int err; res->flags = range->flags; - res->parent = res->child = res->sibling = NULL; + res->parent = NULL; + INIT_LIST_HEAD(&res->child); + INIT_LIST_HEAD(&res->sibling); res->name = np->full_name; if (res->flags & IORESOURCE_IO) { diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c index 69bd98421eb1..7482bdfd1959 100644 --- a/drivers/parisc/lba_pci.c +++ b/drivers/parisc/lba_pci.c @@ -170,8 +170,8 @@ lba_dump_res(struct resource *r, int d) for (i = d; i ; --i) printk(" "); printk(KERN_DEBUG "%p [%lx,%lx]/%lx\n", r, (long)r->start, (long)r->end, r->flags); - lba_dump_res(r->child, d+2); - lba_dump_res(r->sibling, d); + lba_dump_res(resource_first_child(&r->child), d+2); + lba_dump_res(resource_sibling(r), d); } diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c index 942b64fc7f1f..e3ace20345c7 100644 --- a/drivers/pci/controller/vmd.c +++ b/drivers/pci/controller/vmd.c @@ -542,14 +542,14 @@ static struct pci_ops vmd_ops = { static void vmd_attach_resources(struct vmd_dev *vmd) { - vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1]; - vmd->dev->resource[VMD_MEMBAR2].child = &vmd->resources[2]; + list_add(&vmd->resources[1].sibling, &vmd->dev->resource[VMD_MEMBAR1].child); + list_add(&vmd->resources[2].sibling, &vmd->dev->resource[VMD_MEMBAR2].child); } static void vmd_detach_resources(struct vmd_dev *vmd) { - vmd->dev->resource[VMD_MEMBAR1].child = NULL; - vmd->dev->resource[VMD_MEMBAR2].child = NULL; + INIT_LIST_HEAD(&vmd->dev->resource[VMD_MEMBAR1].child); + INIT_LIST_HEAD(&vmd->dev->resource[VMD_MEMBAR2].child); } /* diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c 
index ac876e32de4b..9624dd1dfd49 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -59,6 +59,8 @@ static struct resource *get_pci_domain_busn_res(int domain_nr)
 	r->res.start = 0;
 	r->res.end = 0xff;
 	r->res.flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED;
+	INIT_LIST_HEAD(&r->res.child);
+	INIT_LIST_HEAD(&r->res.sibling);

 	list_add_tail(&r->list, &pci_domain_busn_res_list);

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 79b1824e83b4..8e685af8938d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2107,7 +2107,7 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
 			continue;

 		/* Ignore BARs which are still in use */
-		if (res->child)
+		if (!list_empty(&res->child))
 			continue;

 		ret = add_to_list(&saved, bridge, res, 0, 0);
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index dfdcd0bfe54e..b7456ae889dd 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -12,6 +12,7 @@
 #ifndef __ASSEMBLY__
 #include <linux/compiler.h>
 #include <linux/types.h>
+#include <linux/list.h>
 /*
  * Resources are tree-like, allowing
  * nesting etc..
@@ -22,7 +23,8 @@ struct resource {
 	const char *name;
 	unsigned long flags;
 	unsigned long desc;
-	struct resource *parent, *sibling, *child;
+	struct list_head child, sibling;
+	struct resource *parent;
 };

 /*
@@ -216,7 +218,6 @@ static inline bool resource_contains(struct resource *r1, struct resource *r2)
 	return r1->start <= r2->start && r1->end >= r2->end;
 }

-
 /* Convenience shorthand with allocation */
 #define request_region(start,n,name)		__request_region(&ioport_resource, (start), (n), (name), 0)
 #define request_muxed_region(start,n,name)	__request_region(&ioport_resource, (start), (n), (name), IORESOURCE_MUXED)
@@ -287,6 +288,18 @@ static inline bool resource_overlaps(struct resource *r1, struct resource *r2)
 	return (r1->start <= r2->end && r1->end >= r2->start);
 }

+static inline struct resource *resource_sibling(struct resource *res)
+{
+	if (res->parent && !list_is_last(&res->sibling, &res->parent->child))
+		return list_next_entry(res, sibling);
+	return NULL;
+}
+
+static inline struct resource *resource_first_child(struct list_head *head)
+{
+	return list_first_entry_or_null(head, struct resource, sibling);
+}
+

 #endif /* __ASSEMBLY__ */
 #endif	/* _LINUX_IOPORT_H */
diff --git a/kernel/resource.c b/kernel/resource.c
index 81ccd19c1d9f..c96e58d3d2f8 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -31,6 +31,8 @@ struct resource ioport_resource = {
 	.start	= 0,
 	.end	= IO_SPACE_LIMIT,
 	.flags	= IORESOURCE_IO,
+	.sibling = LIST_HEAD_INIT(ioport_resource.sibling),
+	.child = LIST_HEAD_INIT(ioport_resource.child),
 };
 EXPORT_SYMBOL(ioport_resource);

@@ -39,6 +41,8 @@ struct resource iomem_resource = {
 	.start	= 0,
 	.end	= -1,
 	.flags	= IORESOURCE_MEM,
+	.sibling = LIST_HEAD_INIT(iomem_resource.sibling),
+	.child = LIST_HEAD_INIT(iomem_resource.child),
 };
 EXPORT_SYMBOL(iomem_resource);

@@ -57,20 +61,20 @@ static DEFINE_RWLOCK(resource_lock);
  * by boot mem after the system is up. So for reusing the resource entry
  * we need to remember the resource.
  */
-static struct resource *bootmem_resource_free;
+static struct list_head bootmem_resource_free = LIST_HEAD_INIT(bootmem_resource_free);
 static DEFINE_SPINLOCK(bootmem_resource_lock);

 static struct resource *next_resource(struct resource *p, bool sibling_only)
 {
 	/* Caller wants to traverse through siblings only */
 	if (sibling_only)
-		return p->sibling;
+		return resource_sibling(p);

-	if (p->child)
-		return p->child;
-	while (!p->sibling && p->parent)
+	if (!list_empty(&p->child))
+		return resource_first_child(&p->child);
+	while (!resource_sibling(p) && p->parent)
 		p = p->parent;
-	return p->sibling;
+	return resource_sibling(p);
 }

 static void *r_next(struct seq_file *m, void *v, loff_t *pos)
@@ -90,7 +94,7 @@ static void *r_start(struct seq_file *m, loff_t *pos)
 	struct resource *p = PDE_DATA(file_inode(m->file));
 	loff_t l = 0;
 	read_lock(&resource_lock);
-	for (p = p->child; p && l < *pos; p = r_next(m, p, &l))
+	for (p = resource_first_child(&p->child); p && l < *pos; p = r_next(m, p, &l))
 		;
 	return p;
 }
@@ -153,8 +157,7 @@ static void free_resource(struct resource *res)

 	if (!PageSlab(virt_to_head_page(res))) {
 		spin_lock(&bootmem_resource_lock);
-		res->sibling = bootmem_resource_free;
-		bootmem_resource_free = res;
+		list_add(&res->sibling, &bootmem_resource_free);
 		spin_unlock(&bootmem_resource_lock);
 	} else {
 		kfree(res);
@@ -166,10 +169,9 @@ static struct resource *alloc_resource(gfp_t flags)
 	struct resource *res = NULL;

 	spin_lock(&bootmem_resource_lock);
-	if (bootmem_resource_free) {
-		res = bootmem_resource_free;
-		bootmem_resource_free = res->sibling;
-	}
+	res = resource_first_child(&bootmem_resource_free);
+	if (res)
+		list_del(&res->sibling);
 	spin_unlock(&bootmem_resource_lock);

 	if (res)
@@ -177,6 +179,8 @@ static struct resource *alloc_resource(gfp_t flags)
 	else
 		res = kzalloc(sizeof(struct resource), flags);

+	INIT_LIST_HEAD(&res->child);
+	INIT_LIST_HEAD(&res->sibling);
 	return res;
 }

@@ -185,7 +189,7 @@ static struct resource * __request_resource(struct resource *root, struct resour
 {
 	resource_size_t start = new->start;
 	resource_size_t end = new->end;
-	struct resource *tmp, **p;
+	struct resource *tmp;

 	if (end < start)
 		return root;
@@ -193,64 +197,62 @@ static struct resource * __request_resource(struct resource *root, struct resour
 		return root;
 	if (end > root->end)
 		return root;
-	p = &root->child;
-	for (;;) {
-		tmp = *p;
-		if (!tmp || tmp->start > end) {
-			new->sibling = tmp;
-			*p = new;
+
+	if (list_empty(&root->child)) {
+		list_add(&new->sibling, &root->child);
+		new->parent = root;
+		INIT_LIST_HEAD(&new->child);
+		return NULL;
+	}
+
+	list_for_each_entry(tmp, &root->child, sibling) {
+		if (tmp->start > end) {
+			list_add(&new->sibling, tmp->sibling.prev);
 			new->parent = root;
+			INIT_LIST_HEAD(&new->child);
 			return NULL;
 		}
-		p = &tmp->sibling;
 		if (tmp->end < start)
 			continue;
 		return tmp;
 	}
+
+	list_add_tail(&new->sibling, &root->child);
+	new->parent = root;
+	INIT_LIST_HEAD(&new->child);
+	return NULL;
 }

 static int __release_resource(struct resource *old, bool release_child)
 {
-	struct resource *tmp, **p, *chd;
+	struct resource *tmp, *next, *chd;

-	p = &old->parent->child;
-	for (;;) {
-		tmp = *p;
-		if (!tmp)
-			break;
+	list_for_each_entry_safe(tmp, next, &old->parent->child, sibling) {
 		if (tmp == old) {
-			if (release_child || !(tmp->child)) {
-				*p = tmp->sibling;
+			if (release_child || list_empty(&tmp->child)) {
+				list_del(&tmp->sibling);
 			} else {
-				for (chd = tmp->child;; chd = chd->sibling) {
+				list_for_each_entry(chd, &tmp->child, sibling)
 					chd->parent = tmp->parent;
-					if (!(chd->sibling))
-						break;
-				}
-				*p = tmp->child;
-				chd->sibling = tmp->sibling;
+				list_splice(&tmp->child, tmp->sibling.prev);
+				list_del(&tmp->sibling);
 			}
+
 			old->parent = NULL;
 			return 0;
 		}
-		p = &tmp->sibling;
 	}
 	return -EINVAL;
 }

 static void __release_child_resources(struct resource *r)
 {
-	struct resource *tmp, *p;
+	struct resource *tmp, *next;
 	resource_size_t size;

-	p = r->child;
-	r->child = NULL;
-	while (p) {
-		tmp = p;
-		p = p->sibling;
-
+	list_for_each_entry_safe(tmp, next, &r->child, sibling) {
 		tmp->parent = NULL;
-		tmp->sibling = NULL;
+		list_del_init(&tmp->sibling);
 		__release_child_resources(tmp);

 		printk(KERN_DEBUG "release child resource %pR\n", tmp);
@@ -259,6 +261,8 @@ static void __release_child_resources(struct resource *r)
 		tmp->start = 0;
 		tmp->end = size - 1;
 	}
+
+	INIT_LIST_HEAD(&tmp->child);
 }

 void release_child_resources(struct resource *r)
@@ -343,7 +347,8 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,

 	read_lock(&resource_lock);

-	for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
+	for (p = resource_first_child(&iomem_resource.child); p;
+			p = next_resource(p, sibling_only)) {
 		if ((p->flags & res->flags) != res->flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
@@ -532,7 +537,7 @@ int region_intersects(resource_size_t start, size_t size, unsigned long flags,
 	struct resource *p;

 	read_lock(&resource_lock);
-	for (p = iomem_resource.child; p ; p = p->sibling) {
+	list_for_each_entry(p, &iomem_resource.child, sibling) {
 		bool is_type = (((p->flags & flags) == flags) &&
 				((desc == IORES_DESC_NONE) ||
 				 (desc == p->desc)));
@@ -586,7 +591,7 @@ static int __find_resource(struct resource *root, struct resource *old,
 			 resource_size_t  size,
 			 struct resource_constraint *constraint)
 {
-	struct resource *this = root->child;
+	struct resource *this = resource_first_child(&root->child);
 	struct resource tmp = *new, avail, alloc;

 	tmp.start = root->start;
@@ -596,7 +601,7 @@ static int __find_resource(struct resource *root, struct resource *old,
 	 */
 	if (this && this->start == root->start) {
 		tmp.start = (this == old) ? old->start : this->end + 1;
-		this = this->sibling;
+		this = resource_sibling(this);
 	}
 	for(;;) {
 		if (this)
@@ -632,7 +637,7 @@ next:		if (!this || this->end == root->end)

 		if (this != old)
 			tmp.start = this->end + 1;
-		this = this->sibling;
+		this = resource_sibling(this);
 	}
 	return -EBUSY;
 }
@@ -676,7 +681,7 @@ static int reallocate_resource(struct resource *root, struct resource *old,
 		goto out;
 	}

-	if (old->child) {
+	if (!list_empty(&old->child)) {
 		err = -EBUSY;
 		goto out;
 	}
@@ -757,7 +762,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start)
 	struct resource *res;

 	read_lock(&resource_lock);
-	for (res = root->child; res; res = res->sibling) {
+	list_for_each_entry(res, &root->child, sibling) {
 		if (res->start == start)
 			break;
 	}
@@ -790,32 +795,27 @@ static struct resource * __insert_resource(struct resource *parent, struct resou
 			break;
 	}

-	for (next = first; ; next = next->sibling) {
+	for (next = first; ; next = resource_sibling(next)) {
 		/* Partial overlap? Bad, and unfixable */
 		if (next->start < new->start || next->end > new->end)
 			return next;
-		if (!next->sibling)
+		if (!resource_sibling(next))
 			break;
-		if (next->sibling->start > new->end)
+		if (resource_sibling(next)->start > new->end)
 			break;
 	}
-
 	new->parent = parent;
-	new->sibling = next->sibling;
-	new->child = first;
+	list_add(&new->sibling, &next->sibling);
+	INIT_LIST_HEAD(&new->child);

-	next->sibling = NULL;
-	for (next = first; next; next = next->sibling)
+	/*
+	 * From first to next, they all fall into new's region, so change them
+	 * as new's children.
+	 */
+	list_cut_position(&new->child, first->sibling.prev, &next->sibling);
+	list_for_each_entry(next, &new->child, sibling)
 		next->parent = new;

-	if (parent->child == first) {
-		parent->child = new;
-	} else {
-		next = parent->child;
-		while (next->sibling != first)
-			next = next->sibling;
-		next->sibling = new;
-	}
 	return NULL;
 }

@@ -937,19 +937,17 @@ static int __adjust_resource(struct resource *res, resource_size_t start,
 	if ((start < parent->start) || (end > parent->end))
 		goto out;

-	if (res->sibling && (res->sibling->start <= end))
+	if (resource_sibling(res) && (resource_sibling(res)->start <= end))
 		goto out;

-	tmp = parent->child;
-	if (tmp != res) {
-		while (tmp->sibling != res)
-			tmp = tmp->sibling;
+	if (res->sibling.prev != &parent->child) {
+		tmp = list_prev_entry(res, sibling);
 		if (start <= tmp->end)
 			goto out;
 	}

 skip:
-	for (tmp = res->child; tmp; tmp = tmp->sibling)
+	list_for_each_entry(tmp, &res->child, sibling)
 		if ((tmp->start < start) || (tmp->end > end))
 			goto out;

@@ -996,27 +994,30 @@ EXPORT_SYMBOL(adjust_resource);
  */
 int reparent_resources(struct resource *parent, struct resource *res)
 {
-	struct resource *p, **pp;
-	struct resource **firstpp = NULL;
+	struct resource *p, *first = NULL;

-	for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
+	list_for_each_entry(p, &parent->child, sibling) {
 		if (p->end < res->start)
 			continue;
 		if (res->end < p->start)
 			break;
 		if (p->start < res->start || p->end > res->end)
 			return -ENOTSUPP;	/* not completely contained */
-		if (firstpp == NULL)
-			firstpp = pp;
+		if (first == NULL)
+			first = p;
 	}
-	if (firstpp == NULL)
+	if (first == NULL)
 		return -ECANCELED; /* didn't find any conflicting entries? */
 	res->parent = parent;
-	res->child = *firstpp;
-	res->sibling = *pp;
-	*firstpp = res;
-	*pp = NULL;
-	for (p = res->child; p != NULL; p = p->sibling) {
+	list_add(&res->sibling, p->sibling.prev);
+	INIT_LIST_HEAD(&res->child);
+
+	/*
+	 * From first to p's previous sibling, they all fall into
+	 * res's region, change them as res's children.
+	 */
+	list_cut_position(&res->child, first->sibling.prev, res->sibling.prev);
+	list_for_each_entry(p, &res->child, sibling) {
 		p->parent = res;
 		pr_debug("PCI: Reparented %s %pR under %s\n",
 			 p->name, p, res->name);
@@ -1216,34 +1217,32 @@ EXPORT_SYMBOL(__request_region);
 void __release_region(struct resource *parent, resource_size_t start,
 			resource_size_t n)
 {
-	struct resource **p;
+	struct resource *res;
 	resource_size_t end;

-	p = &parent->child;
+	res = resource_first_child(&parent->child);
 	end = start + n - 1;

 	write_lock(&resource_lock);

 	for (;;) {
-		struct resource *res = *p;
-
 		if (!res)
 			break;
 		if (res->start <= start && res->end >= end) {
 			if (!(res->flags & IORESOURCE_BUSY)) {
-				p = &res->child;
+				res = resource_first_child(&res->child);
 				continue;
 			}
 			if (res->start != start || res->end != end)
 				break;
-			*p = res->sibling;
+			list_del(&res->sibling);
 			write_unlock(&resource_lock);
 			if (res->flags & IORESOURCE_MUXED)
 				wake_up(&muxed_resource_wait);
 			free_resource(res);
 			return;
 		}
-		p = &res->sibling;
+		res = resource_sibling(res);
 	}

 	write_unlock(&resource_lock);
@@ -1278,9 +1277,7 @@ EXPORT_SYMBOL(__release_region);
 int release_mem_region_adjustable(struct resource *parent,
 			resource_size_t start, resource_size_t size)
 {
-	struct resource **p;
-	struct resource *res;
-	struct resource *new_res;
+	struct resource *res, *new_res;
 	resource_size_t end;
 	int ret = -EINVAL;

@@ -1291,16 +1288,16 @@ int release_mem_region_adjustable(struct resource *parent,
 	/* The alloc_resource() result gets checked later */
 	new_res = alloc_resource(GFP_KERNEL);

-	p = &parent->child;
+	res = resource_first_child(&parent->child);
 	write_lock(&resource_lock);

-	while ((res = *p)) {
+	while ((res)) {
 		if (res->start >= end)
 			break;

 		/* look for the next resource if it does not fit into */
 		if (res->start > start || res->end < end) {
-			p = &res->sibling;
+			res = resource_sibling(res);
 			continue;
 		}

@@ -1308,14 +1305,14 @@ int release_mem_region_adjustable(struct resource *parent,
 			break;

 		if (!(res->flags & IORESOURCE_BUSY)) {
-			p = &res->child;
+			res = resource_first_child(&res->child);
 			continue;
 		}

 		/* found the target resource; let's adjust accordingly */
 		if (res->start == start && res->end == end) {
 			/* free the whole entry */
-			*p = res->sibling;
+			list_del(&res->sibling);
 			free_resource(res);
 			ret = 0;
 		} else if (res->start == start && res->end != end) {
@@ -1338,14 +1335,13 @@ int release_mem_region_adjustable(struct resource *parent,
 			new_res->flags = res->flags;
 			new_res->desc = res->desc;
 			new_res->parent = res->parent;
-			new_res->sibling = res->sibling;
-			new_res->child = NULL;
+			INIT_LIST_HEAD(&new_res->child);

 			ret = __adjust_resource(res, res->start,
 						start - res->start);
 			if (ret)
 				break;
-			res->sibling = new_res;
+			list_add(&new_res->sibling, &res->sibling);
 			new_res = NULL;
 		}

@@ -1526,7 +1522,7 @@ static int __init reserve_setup(char *str)
 			res->end = io_start + io_num - 1;
 			res->flags |= IORESOURCE_BUSY;
 			res->desc = IORES_DESC_NONE;
-			res->child = NULL;
+			INIT_LIST_HEAD(&res->child);
 			if (request_resource(parent, res) == 0)
 				reserved = x+1;
 		}
@@ -1546,7 +1542,7 @@ int iomem_map_sanity_check(resource_size_t addr, unsigned long size)
 	loff_t l;

 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
@@ -1602,7 +1598,7 @@ bool iomem_is_exclusive(u64 addr)
 	addr = addr & PAGE_MASK;

 	read_lock(&resource_lock);
-	for (p = p->child; p ; p = r_next(NULL, p, &l)) {
+	for (p = resource_first_child(&p->child); p; p = r_next(NULL, p, &l)) {
 		/*
 		 * We can probably skip the resources without
 		 * IORESOURCE_IO attribute?
-- 
2.13.6
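
To make the conversion above easier to follow outside the kernel tree, here
is a minimal userspace sketch of the same idea: children hang off a circular
doubly linked list, and "no sibling" / "no child" are expressed by reaching
the list head rather than by a NULL pointer. Everything in it (struct res,
res_request(), res_first_child(), res_sibling(), list_add_after()) is a
hypothetical stand-in written for illustration, not the kernel's list.h or
the code in this patch, and the sorted insert folds the patch's three insert
cases into one loop.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h->prev = h; }

/* Link @n into the list immediately after @pos. */
static void list_add_after(struct list_head *n, struct list_head *pos)
{
	n->next = pos->next;
	n->prev = pos;
	pos->next->prev = n;
	pos->next = n;
}

/* Hypothetical resource node: children hang off @child via @sibling. */
struct res {
	unsigned long start, end;
	struct res *parent;
	struct list_head child, sibling;
};

/* Recover the struct res that embeds a given sibling link. */
#define res_of(ptr) \
	((struct res *)((char *)(ptr) - offsetof(struct res, sibling)))

static void res_init(struct res *r, unsigned long start, unsigned long end)
{
	r->start = start;
	r->end = end;
	r->parent = NULL;
	init_list_head(&r->child);
	init_list_head(&r->sibling);
}

/* First child, or NULL when the child list is empty. */
static struct res *res_first_child(struct res *parent)
{
	if (parent->child.next == &parent->child)
		return NULL;
	return res_of(parent->child.next);
}

/* Next sibling, or NULL when @r is the last child of its parent. */
static struct res *res_sibling(struct res *r)
{
	if (r->parent && r->sibling.next != &r->parent->child)
		return res_of(r->sibling.next);
	return NULL;
}

/* Sorted insert: returns the conflicting child, or NULL on success. */
static struct res *res_request(struct res *root, struct res *new)
{
	struct list_head *pos;

	assert(new->start <= new->end);
	for (pos = root->child.next; pos != &root->child; pos = pos->next) {
		struct res *tmp = res_of(pos);

		if (tmp->start > new->end)
			break;		/* new belongs just before tmp */
		if (tmp->end >= new->start)
			return tmp;	/* ranges overlap */
	}
	/* pos is either the first larger child or the list head (append). */
	list_add_after(&new->sibling, pos->prev);
	new->parent = root;
	return NULL;
}

/* Count children using only the two accessor helpers. */
static int res_count_children(struct res *parent)
{
	struct res *p;
	int n = 0;

	for (p = res_first_child(parent); p; p = res_sibling(p))
		n++;
	return n;
}
```

Because the list is doubly linked, inserting before a known node needs no
pointer-to-pointer cursor, which is the simplification the patch relies on
throughout kernel/resource.c.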
* [PATCH v7 3/4] resource: add walk_system_ram_res_rev()
2018-07-18 2:49 ` Baoquan He
(?)
(?)
@ 2018-07-18 2:49 ` Baoquan He
-1 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-18 2:49 UTC (permalink / raw)
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
dan.j.williams-ral2JQCrhuEAvxtiuMwx3w,
nicolas.pitre-QSEj5FYQhm4dnm+yROfE0A, josh-iaAMLnmF4UmaiuxdJuQwMA,
fengguang.wu-ral2JQCrhuEAvxtiuMwx3w, bp-l3A5Bk7waGM,
andy.shevchenko-Re5JQEeQqe8AvxtiuMwx3w
Cc: brijesh.singh-5C7GfCeVMHo, devicetree-u79uwXL29TY76Z2rM5mHXA,
airlied-cv59FeDIM0c, linux-pci-u79uwXL29TY76Z2rM5mHXA,
richard.weiyang-Re5JQEeQqe8AvxtiuMwx3w,
jcmvbkbc-Re5JQEeQqe8AvxtiuMwx3w,
baiyaowei-0p4V/sDNsUmm0O/7XYngnFaTQe2KTcn/,
kys-0li6OtcxBFHby3iVrkZq2A, frowand.list-Re5JQEeQqe8AvxtiuMwx3w,
lorenzo.pieralisi-5wv7dgnIgG8, sthemmin-0li6OtcxBFHby3iVrkZq2A,
Baoquan He, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
patrik.r.jakobsson-Re5JQEeQqe8AvxtiuMwx3w,
linux-input-u79uwXL29TY76Z2rM5mHXA,
gustavo-THi1TnShQwVAfugRpC6u6w, dyoung-H+wXaHxf7aLQT0dZR+AlfA,
thomas.lendacky-5C7GfCeVMHo, haiyangz-0li6OtcxBFHby3iVrkZq2A,
maarten.lankhorst-VuQAYsv1563Yd54FQh9/CA,
jglisse-H+wXaHxf7aLQT0dZR+AlfA, seanpaul-F7+t8E8rja9g9hUCZPvPmw,
bhelgaas-hpIqsD4AKlfQT0dZR+AlfA, tglx-hfZtesqFncYOwBW4kG4KsQ,
yinghai-DgEjT+Ai2ygdnm+yROfE0A,
jonathan.derrick-ral2JQCrhuEAvxtiuMwx3w,
chris-YvXeqwSYzG2sTnJN9+BGXg, monstr-pSz03upnqPeHXe+LvDLADg,
linux-parisc-u79uwXL29TY76Z2rM5mHXA,
gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
dmitry.torokhov-Re5JQEeQqe8AvxtiuMwx3w,
ebiederm-aS9lmoZGLiVWk0Htik3J/w,
devel-tBiZLqfeLfOHmIFyCCdPziST3g8Odh+X,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, davem-fT/PcQaiUtIeIZ0/mPfg9Q

This function, a variant of walk_system_ram_res() introduced in
commit 8c86e70acead ("resource: provide new functions to walk through
resources"), walks the list of all System RAM resources in reverse
order, i.e., from higher to lower addresses.

It will be used in kexec_file code.

Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
---
 include/linux/ioport.h |  3 +++
 kernel/resource.c      | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index b7456ae889dd..066cc263e2cc 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -279,6 +279,9 @@ extern int
 walk_system_ram_res(u64 start, u64 end, void *arg,
 		    int (*func)(struct resource *, void *));
 extern int
+walk_system_ram_res_rev(u64 start, u64 end, void *arg,
+			int (*func)(struct resource *, void *));
+extern int
 walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
 		    void *arg, int (*func)(struct resource *, void *));

diff --git a/kernel/resource.c b/kernel/resource.c
index c96e58d3d2f8..3e18f24b90c4 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -23,6 +23,8 @@
 #include <linux/pfn.h>
 #include <linux/mm.h>
 #include <linux/resource_ext.h>
+#include <linux/string.h>
+#include <linux/vmalloc.h>
 #include <asm/io.h>


@@ -443,6 +445,44 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 }

 /*
+ * This function, being a variant of walk_system_ram_res(), calls the @func
+ * callback against all memory ranges of type System RAM which are marked as
+ * IORESOURCE_SYSTEM_RAM and IORESOURCE_BUSY in reversed order, i.e., from
+ * higher to lower.
+ */
+int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
+				int (*func)(struct resource *, void *))
+{
+	unsigned long flags;
+	struct resource *res;
+	int ret = -1;
+
+	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+
+	read_lock(&resource_lock);
+	list_for_each_entry_reverse(res, &iomem_resource.child, sibling) {
+		if (start >= end)
+			break;
+		if ((res->flags & flags) != flags)
+			continue;
+		if (res->desc != IORES_DESC_NONE)
+			continue;
+		if (res->end < start)
+			break;
+
+		if ((res->end >= start) && (res->start < end)) {
+			ret = (*func)(res, arg);
+			if (ret)
+				break;
+		}
+		end = res->start - 1;
+
+	}
+	read_unlock(&resource_lock);
+	return ret;
+}
+
+/*
  * This function calls the @func callback against all memory ranges, which
  * are ranges marked as IORESOURCE_MEM and IORESOUCE_BUSY.
  */
-- 
2.13.6
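
As a rough illustration of the calling convention described in this patch
(callback invoked per range from the highest address down, walk stopped by
the first non-zero return, -1 when the callback never ran), here is a small
userspace sketch. It assumes the ranges live in a sorted array instead of the
resource tree; struct range, walk_ranges_rev(), struct find_ctx and
first_fit() are hypothetical names. It deliberately leaves out the flag and
desc checks, the locking, and the way the patch shrinks `end` after each
visited resource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory range with inclusive bounds. */
struct range { uint64_t start, end; };

typedef int (*range_fn)(const struct range *r, void *arg);

/*
 * Walk @ranges (sorted ascending, non-overlapping) from the highest to the
 * lowest entry and call @fn on every range intersecting [start, end].
 * Returns -1 if @fn was never called, otherwise the last value it returned;
 * a non-zero return from @fn ends the walk early.
 */
static int walk_ranges_rev(const struct range *ranges, size_t n,
			   uint64_t start, uint64_t end,
			   range_fn fn, void *arg)
{
	int ret = -1;
	size_t i;

	assert(fn != NULL);
	for (i = n; i-- > 0; ) {
		const struct range *r = &ranges[i];

		if (r->end < start)
			break;		/* all remaining ranges lie below the window */
		if (r->start > end)
			continue;	/* this range lies above the window */
		ret = fn(r, arg);
		if (ret)
			break;
	}
	return ret;
}

/* Example callback state: how much space is needed, and where it was found. */
struct find_ctx { uint64_t need; uint64_t found_start; };

/* Stop at the first range with at least @need bytes; place at its top. */
static int first_fit(const struct range *r, void *arg)
{
	struct find_ctx *ctx = arg;

	if (r->end - r->start + 1 < ctx->need)
		return 0;	/* too small, keep walking */
	ctx->found_start = r->end - ctx->need + 1;
	return 1;		/* found, stop the walk */
}
```

A caller would pass its own context through `arg`, in the same way the patch
passes `kbuf` through to the kexec callback.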
* [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-18 2:49 ` Baoquan He
From: Baoquan He @ 2018-07-18 2:49 UTC
To: linux-kernel, akpm, robh+dt, dan.j.williams, nicolas.pitre, josh,
fengguang.wu, bp, andy.shevchenko
Cc: brijesh.singh, devicetree, airlied, linux-pci, richard.weiyang,
keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list, lorenzo.pieralisi,
sthemmin, Baoquan He, linux-nvdimm, patrik.r.jakobsson, linux-input,
gustavo, dyoung, vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst,
jglisse, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, ebiederm, devel,
linuxppc-dev, davem

For kexec_file loading, if kexec_buf.top_down is 'true', the memory used to
load the kernel/initrd/purgatory is supposed to be allocated from the top
down. This is what the old kexec loading interface has always done, and
kexec loading is still the default setting in some distributions. However,
the current kexec_file loading interface does not behave this way. The
function it calls, arch_kexec_walk_mem(), never checks kexec_buf.top_down;
it calls walk_system_ram_res() directly to go through all System RAM
resources from the bottom up, looking for a memory region that can hold the
given kexec buffer, and then calls locate_mem_hole_callback() to allocate
memory within that region from the top down. This causes confusion,
especially now that KASLR is widely supported: users have to work out why
the kexec/kdump kernel loading position differs between the two interfaces
before they can rule it out as noise. Hence the two interfaces need to
behave the same way.

Add a check for kexec_buf.top_down in arch_kexec_walk_mem(); if it is
'true', call the newly added walk_system_ram_res_rev() to search for a
memory region from the top down when loading the kernel.

Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kexec@lists.infradead.org
---
 kernel/kexec_file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index c6a3b6851372..75226c1d08ce 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -518,6 +518,8 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
 					   IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
 					   crashk_res.start, crashk_res.end,
 					   kbuf, func);
+	else if (kbuf->top_down)
+		return walk_system_ram_res_rev(0, ULONG_MAX, kbuf, func);
 	else
 		return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
 }
-- 
2.13.6
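The placement difference described in the commit message can be shown with a small user-space sketch. It is an illustration under stated assumptions, not the kernel implementation: struct ram_region, top_of_region(), locate_hole() and NO_HOLE are hypothetical names, the region list is a plain sorted array, and alignment is assumed to be a power of two. In both modes the buffer is placed at the top of the chosen region; only the order in which regions are tried changes.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_HOLE UINT64_MAX	/* hypothetical "nothing fits" marker */

/* Hypothetical System RAM region; end is inclusive. */
struct ram_region {
	uint64_t start;
	uint64_t end;
};

/* Highest aligned address in r where size bytes fit, or NO_HOLE. */
static uint64_t top_of_region(const struct ram_region *r,
			      uint64_t size, uint64_t align)
{
	uint64_t addr;

	if (size == 0 || r->end - r->start < size - 1)
		return NO_HOLE;		/* region too small */

	addr = (r->end - size + 1) & ~(align - 1);	/* round down */
	return addr >= r->start ? addr : NO_HOLE;
}

/*
 * Try regions in ascending order (top_down == 0) or descending order
 * (top_down != 0) and return the first placement that fits.
 */
uint64_t locate_hole(const struct ram_region *regions, size_t n,
		     uint64_t size, uint64_t align, int top_down)
{
	size_t k;

	for (k = 0; k < n; k++) {
		size_t i = top_down ? n - 1 - k : k;
		uint64_t addr = top_of_region(&regions[i], size, align);

		if (addr != NO_HOLE)
			return addr;
	}
	return NO_HOLE;
}
```

With one region below 4G and one above it, the bottom-up order places a 16 MiB buffer near the top of the low region, while the top-down order places it near the top of the high region. That gap between the two results is the behavioural difference the patch is meant to remove.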
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-18 22:33 ` Andrew Morton
From: Andrew Morton @ 2018-07-18 22:33 UTC
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
tglx, lorenzo.pieralisi, sthemmin, linux-nvdimm, patrik.r.jakobsson,
andy.shevchenko, linux-input, gustavo, bp, dyoung, vgoyal,
thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse, robh+dt,
seanpaul, bhelgaas, dan.j.williams, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, fengguang.wu, linuxppc-dev, davem

On Wed, 18 Jul 2018 10:49:44 +0800 Baoquan He <bhe@redhat.com> wrote:

> For kexec_file loading, if kexec_buf.top_down is 'true', the memory which
> is used to load kernel/initrd/purgatory is supposed to be allocated from
> top to down. This is what we have been doing all along in the old kexec
> loading interface and the kexec loading is still default setting in some
> distributions. However, the current kexec_file loading interface doesn't
> do like this. The function arch_kexec_walk_mem() it calls ignores checking
> kexec_buf.top_down, but calls walk_system_ram_res() directly to go through
> all resources of System RAM from bottom to up, to try to find memory region
> which can contain the specific kexec buffer, then call locate_mem_hole_callback()
> to allocate memory in that found memory region from top to down. This brings
> confusion especially when KASLR is widely supported , users have to make clear
> why kexec/kdump kernel loading position is different between these two
> interfaces in order to exclude unnecessary noises. Hence these two interfaces
> need be unified on behaviour.

As far as I can tell, the above is the whole reason for the patchset,
yes? To avoid confusing users.

Is that sufficient? Can we instead simplify their lives by providing
better documentation or informative printks or better Kconfig text,
etc?

And who *are* the people who are performing this configuration? Random
system administrators? Linux distro engineers? If the latter then
they presumably aren't easily confused!

In other words, I'm trying to understand how much benefit this patchset
will provide to our users as a whole.
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-19 15:17 ` Baoquan He
From: Baoquan He @ 2018-07-19 15:17 UTC
To: Andrew Morton
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
tglx, lorenzo.pieralisi, sthemmin, linux-nvdimm, patrik.r.jakobsson,
andy.shevchenko, linux-input, gustavo, bp, dyoung, vgoyal,
thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse, robh+dt,
seanpaul, bhelgaas, dan.j.williams, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, fengguang.wu, linuxppc-dev, davem

Hi Andrew,

On 07/18/18 at 03:33pm, Andrew Morton wrote:
> On Wed, 18 Jul 2018 10:49:44 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > For kexec_file loading, if kexec_buf.top_down is 'true', the memory which
> > is used to load kernel/initrd/purgatory is supposed to be allocated from
> > top to down. This is what we have been doing all along in the old kexec
> > loading interface and the kexec loading is still default setting in some
> > distributions. However, the current kexec_file loading interface doesn't
> > do like this. The function arch_kexec_walk_mem() it calls ignores checking
> > kexec_buf.top_down, but calls walk_system_ram_res() directly to go through
> > all resources of System RAM from bottom to up, to try to find memory region
> > which can contain the specific kexec buffer, then call locate_mem_hole_callback()
> > to allocate memory in that found memory region from top to down. This brings
> > confusion especially when KASLR is widely supported , users have to make clear
> > why kexec/kdump kernel loading position is different between these two
> > interfaces in order to exclude unnecessary noises. Hence these two interfaces
> > need be unified on behaviour.
>
> As far as I can tell, the above is the whole reason for the patchset,
> yes? To avoid confusing users.

In fact, it's not just about avoiding user confusion. Kexec loading and
kexec_file loading do the same thing in essence; we only had to port the
kexec loading code into the kernel because kernel image verification is
needed on UEFI systems.

Kexec is a formally supported feature in our distro, and customers who own
very large machines can use it to speed up the reboot process. On a UEFI
machine, kexec_file loading searches for a place to put the kernel below
4G, from the top down. As we know, the first 4G is the DMA32 zone, and DMA,
PCI mmcfg, the BIOS, etc. all try to consume it, so there may be no usable
space left there for the kernel/initrd. Searching from the top down across
the whole memory space, we don't have this worry.

In the first posting, linked below, I simply used AKASHI's version of
walk_system_ram_res_rev(). Later you suggested using list_head to link the
child and sibling entries of a resource, to see what the code change would
look like.
http://lkml.kernel.org/r/20180322033722.9279-1-bhe@redhat.com

Then I posted v2:
http://lkml.kernel.org/r/20180408024724.16812-1-bhe@redhat.com
Rob Herring mentioned that other components with this kind of tree
structure had planned to do the same thing, replacing the singly linked
list with list_head to link child and sibling entries. I quote Rob's words
below. I think this could be another reason.

~~~~~ From Rob
The DT struct device_node also has the same tree structure with
parent, child, sibling pointers and converting to list_head had been
on the todo list for a while. ACPI also has some tree walking
functions (drivers/acpi/acpica/pstree.c). Perhaps there should be a
common tree struct and helpers defined either on top of list_head or a
new struct if that saves some size.
~~~~~

>
> Is that sufficient? Can we instead simplify their lives by providing
> better documentation or informative printks or better Kconfig text,
> etc?
>
> And who *are* the people who are performing this configuration? Random
> system administrators? Linux distro engineers? If the latter then
> they presumably aren't easily confused!

Kexec was invented for kernel developers, to speed up their kernel reboots.
Now high-end server admins, kernel developers and QE are also keen to use
it to reboot large boxes for faster feature testing and bug debugging.
Kernel developers may understand the kernel loading position well; admins
or QE might not be so aware of it.

>
> In other words, I'm trying to understand how much benefit this patchset
> will provide to our users as a whole.

Understood. The list_head replacement patch does involve a lot of code
changes, and that is risky. I am willing to try any idea from reviewers,
and I won't insist that the result be accepted in the end. If we don't try,
we don't know what it looks like or what impact it may have. I am fine with
taking AKASHI's simple version of walk_system_ram_res_rev() to lower the
risk, even though it could be a little less efficient.

Thanks
Baoquan
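The efficiency trade-off mentioned above, between converting the sibling list to list_head and keeping a singly linked list, can be sketched in user space. This is only an illustration and makes no claim about how AKASHI's version is implemented: it compares one possible way to visit a singly linked list in reverse without extra memory (rescanning from the head for every element) against a walk that follows prev pointers. struct node, walk_rev_singly() and walk_rev_doubly() are hypothetical names, and the list here is NULL-terminated rather than circular like the kernel's list_head.

```c
#include <stddef.h>

/* Hypothetical list node; prev is only used by the doubly linked walk. */
struct node {
	int value;
	struct node *next;
	struct node *prev;
};

/*
 * Visit nodes last-to-first using only next pointers. Each round rescans
 * from the head to find the node just before the one visited previously.
 * Writes the visited values to out[] and returns the pointer hops taken.
 */
size_t walk_rev_singly(struct node *head, int *out)
{
	struct node *stop = NULL;	/* node visited in the previous round */
	size_t hops = 0, k = 0;

	while (head != stop) {
		struct node *p = head;

		while (p->next != stop) {
			p = p->next;
			hops++;
		}
		out[k++] = p->value;
		stop = p;
	}
	return hops;
}

/*
 * Visit nodes last-to-first by following prev pointers from the tail.
 * Writes the visited values to out[] and returns the pointer hops taken.
 */
size_t walk_rev_doubly(struct node *tail, int *out)
{
	size_t hops = 0, k = 0;
	struct node *p;

	for (p = tail; p != NULL; p = p->prev) {
		out[k++] = p->value;
		if (p->prev)
			hops++;
	}
	return hops;
}
```

For a list of n nodes the rescanning walk takes n(n-1)/2 hops while the prev-pointer walk takes n-1. That is the kind of cost a doubly linked sibling list avoids, at the price of touching every user of the old pointers.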
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required @ 2018-07-19 15:17 ` Baoquan He 0 siblings, 0 replies; 83+ messages in thread From: Baoquan He @ 2018-07-19 15:17 UTC (permalink / raw) To: Andrew Morton Cc: linux-kernel, robh+dt, dan.j.williams, nicolas.pitre, josh, fengguang.wu, bp, andy.shevchenko, patrik.r.jakobsson, airlied, kys, haiyangz, sthemmin, dmitry.torokhov, frowand.list, keith.busch, jonathan.derrick, lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh, jglisse, thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel, linux-input, linux-nvdimm, devicetree, linux-pci, ebiederm, vgoyal, dyoung, yinghai, monstr, davem, chris, jcmvbkbc, gustavo, maarten.lankhorst, seanpaul, linux-parisc, linuxppc-dev, kexec Hi Andrew, On 07/18/18 at 03:33pm, Andrew Morton wrote: > On Wed, 18 Jul 2018 10:49:44 +0800 Baoquan He <bhe@redhat.com> wrote: > > > For kexec_file loading, if kexec_buf.top_down is 'true', the memory which > > is used to load kernel/initrd/purgatory is supposed to be allocated from > > top to down. This is what we have been doing all along in the old kexec > > loading interface and the kexec loading is still default setting in some > > distributions. However, the current kexec_file loading interface doesn't > > do like this. The function arch_kexec_walk_mem() it calls ignores checking > > kexec_buf.top_down, but calls walk_system_ram_res() directly to go through > > all resources of System RAM from bottom to up, to try to find memory region > > which can contain the specific kexec buffer, then call locate_mem_hole_callback() > > to allocate memory in that found memory region from top to down. This brings > > confusion especially when KASLR is widely supported , users have to make clear > > why kexec/kdump kernel loading position is different between these two > > interfaces in order to exclude unnecessary noises. Hence these two interfaces > > need be unified on behaviour. 
> > As far as I can tell, the above is the whole reason for the patchset, > yes? To avoid confusing users. In fact, it's not just trying to avoid confusing users. Kexec loading and kexec_file loading are just do the same thing in essence. Just we need do kernel image verification on uefi system, have to port kexec loading code to kernel. Kexec has been a formal feature in our distro, and customers owning those kind of very large machine can make use of this feature to speed up the reboot process. On uefi machine, the kexec_file loading will search place to put kernel under 4G from top to down. As we know, the 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume it. It may have possibility to not be able to find a usable space for kernel/initrd. From the top down of the whole memory space, we don't have this worry. And at the first post, I just posted below with AKASHI's walk_system_ram_res_rev() version. Later you suggested to use list_head to link child sibling of resource, see what the code change looks like. http://lkml.kernel.org/r/20180322033722.9279-1-bhe@redhat.com Then I posted v2 http://lkml.kernel.org/r/20180408024724.16812-1-bhe@redhat.com Rob Herring mentioned that other components which has this tree struct have planned to do the same thing, replacing the singly linked list with list_head to link resource child sibling. Just quote Rob's words as below. I think this could be another reason. ~~~~~ From Rob The DT struct device_node also has the same tree structure with parent, child, sibling pointers and converting to list_head had been on the todo list for a while. ACPI also has some tree walking functions (drivers/acpi/acpica/pstree.c). Perhaps there should be a common tree struct and helpers defined either on top of list_head or a ~~~~~ new struct if that saves some size. > > Is that sufficient? Can we instead simplify their lives by providing > better documentation or informative printks or better Kconfig text, > etc? 
> > And who *are* the people who are performing this configuration? Random > system administrators? Linux distro engineers? If the latter then > they presumably aren't easily confused! Kexec was invented for kernel developer to speed up their kernel rebooting. Now high end sever admin, kernel developer and QE are also keen to use it to reboot large box for faster feature testing, bug debugging. Kernel dev could know this well, about kernel loading position, admin or QE might not be aware of it very well. > > In other words, I'm trying to understand how much benefit this patchset > will provide to our users as a whole. Understood. The list_head replacing patch truly involes too many code changes, it's risky. I am willing to try any idea from reviewers, won't persuit they have to be accepted finally. If don't have a try, we don't know what it looks like, and what impact it may have. I am fine to take AKASHI's simple version of walk_system_ram_res_rev() to lower risk, even though it could be a little bit low efficient. Thanks Baoquan ^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-19 19:44 ` Andrew Morton
0 siblings, 0 replies; 83+ messages in thread
From: Andrew Morton @ 2018-07-19 19:44 UTC (permalink / raw)
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys,
frowand.list, tglx, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp,
dyoung, vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst,
josh, jglisse, robh+dt, seanpaul, bhelgaas, dan.j.williams,
yinghai, jonathan.derrick, chris, monstr, linux-parisc, gregkh,
dmitry.torokhov, kexec, linux-kernel, ebiederm, devel,
fengguang.wu, linuxppc-dev, davem

On Thu, 19 Jul 2018 23:17:53 +0800 Baoquan He <bhe@redhat.com> wrote:

> Hi Andrew,
>
> On 07/18/18 at 03:33pm, Andrew Morton wrote:
> > On Wed, 18 Jul 2018 10:49:44 +0800 Baoquan He <bhe@redhat.com> wrote:
> >
> > > For kexec_file loading, if kexec_buf.top_down is 'true', the memory which
> > > is used to load kernel/initrd/purgatory is supposed to be allocated from
> > > top to down. This is what we have been doing all along in the old kexec
> > > loading interface and the kexec loading is still default setting in some
> > > distributions. However, the current kexec_file loading interface doesn't
> > > do like this. The function arch_kexec_walk_mem() it calls ignores checking
> > > kexec_buf.top_down, but calls walk_system_ram_res() directly to go through
> > > all resources of System RAM from bottom to up, to try to find memory region
> > > which can contain the specific kexec buffer, then call locate_mem_hole_callback()
> > > to allocate memory in that found memory region from top to down. This brings
> > > confusion especially when KASLR is widely supported , users have to make clear
> > > why kexec/kdump kernel loading position is different between these two
> > > interfaces in order to exclude unnecessary noises. Hence these two interfaces
> > > need be unified on behaviour.
> >
> > As far as I can tell, the above is the whole reason for the patchset,
> > yes? To avoid confusing users.
>
>
> In fact, it's not just trying to avoid confusing users. Kexec loading
> and kexec_file loading are just do the same thing in essence. Just we
> need do kernel image verification on uefi system, have to port kexec
> loading code to kernel.
>
> Kexec has been a formal feature in our distro, and customers owning
> those kind of very large machine can make use of this feature to speed
> up the reboot process. On uefi machine, the kexec_file loading will
> search place to put kernel under 4G from top to down. As we know, the
> 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> it. It may have possibility to not be able to find a usable space for
> kernel/initrd. From the top down of the whole memory space, we don't
> have this worry.
>
> And at the first post, I just posted below with AKASHI's
> walk_system_ram_res_rev() version. Later you suggested to use
> list_head to link child sibling of resource, see what the code change
> looks like.
> http://lkml.kernel.org/r/20180322033722.9279-1-bhe@redhat.com
>
> Then I posted v2
> http://lkml.kernel.org/r/20180408024724.16812-1-bhe@redhat.com
> Rob Herring mentioned that other components which has this tree struct
> have planned to do the same thing, replacing the singly linked list with
> list_head to link resource child sibling. Just quote Rob's words as
> below. I think this could be another reason.
>
> ~~~~~ From Rob
> The DT struct device_node also has the same tree structure with
> parent, child, sibling pointers and converting to list_head had been
> on the todo list for a while. ACPI also has some tree walking
> functions (drivers/acpi/acpica/pstree.c). Perhaps there should be a
> common tree struct and helpers defined either on top of list_head or a
> ~~~~~
> new struct if that saves some size.

Please let's get all this into the changelogs?

> >
> > Is that sufficient? Can we instead simplify their lives by providing
> > better documentation or informative printks or better Kconfig text,
> > etc?
> >
> > And who *are* the people who are performing this configuration? Random
> > system administrators? Linux distro engineers? If the latter then
> > they presumably aren't easily confused!
>
> Kexec was invented for kernel developer to speed up their kernel
> rebooting. Now high end sever admin, kernel developer and QE are also
> keen to use it to reboot large box for faster feature testing, bug
> debugging. Kernel dev could know this well, about kernel loading
> position, admin or QE might not be aware of it very well.
>
> >
> > In other words, I'm trying to understand how much benefit this patchset
> > will provide to our users as a whole.
>
> Understood. The list_head replacing patch truly involes too many code
> changes, it's risky. I am willing to try any idea from reviewers, won't
> persuit they have to be accepted finally. If don't have a try, we don't
> know what it looks like, and what impact it may have. I am fine to take
> AKASHI's simple version of walk_system_ram_res_rev() to lower risk, even
> though it could be a little bit low efficient.

The larger patch produces a better result. We can handle it ;)

^ permalink raw reply [flat|nested] 83+ messages in thread
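The trade-off discussed above (a reverse walk over sibling resources being awkward with a singly linked list but direct with list_head) can be sketched in userspace C. This is only an illustration under stated assumptions: struct list_node, struct res, walk_siblings_rev() and first_fit() are hypothetical names invented for the sketch, not the kernel's list_head API and not code from the patchset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal circular doubly linked list, modelled loosely on
 * the idea behind list_head; not the kernel implementation. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Append n just before head, i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-in for a resource with an embedded sibling link. */
struct res {
	uint64_t start, end;		/* inclusive range */
	struct list_node sibling;
};

#define node_to_res(ptr) \
	((struct res *)((char *)(ptr) - offsetof(struct res, sibling)))

typedef int (*res_cb)(const struct res *r, void *arg);

/* Visit siblings from the last (highest) to the first (lowest) entry.
 * With only a 'next' pointer this order would need a rescan from the
 * head for every step, or a temporary array. */
static int walk_siblings_rev(struct list_node *head, res_cb cb, void *arg)
{
	struct list_node *pos;

	for (pos = head->prev; pos != head; pos = pos->prev) {
		int ret = cb(node_to_res(pos), arg);

		if (ret)
			return ret;	/* non-zero stops the walk */
	}
	return 0;
}

/* Example callback: stop at the first region large enough for 'size'. */
struct fit_query {
	uint64_t size;
	const struct res *found;
};

static int first_fit(const struct res *r, void *arg)
{
	struct fit_query *q = arg;

	if (r->end - r->start + 1 >= q->size) {
		q->found = r;
		return 1;
	}
	return 0;
}
```

Walking from head->prev means the highest region that fits is found first, which is the top-down order the thread is asking for.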
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-25 2:21 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-25 2:21 UTC (permalink / raw)
To: Andrew Morton
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
tglx, lorenzo.pieralisi, sthemmin, linux-nvdimm, patrik.r.jakobsson,
andy.shevchenko, linux-input, gustavo, bp, dyoung, vgoyal,
thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse, robh+dt,
seanpaul, bhelgaas, dan.j.williams, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, fengguang.wu, linuxppc-dev, davem

Hi Andrew,

On 07/19/18 at 12:44pm, Andrew Morton wrote:
> On Thu, 19 Jul 2018 23:17:53 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > > As far as I can tell, the above is the whole reason for the patchset,
> > > yes? To avoid confusing users.
> >
> >
> > In fact, it's not just trying to avoid confusing users. Kexec loading
> > and kexec_file loading are just do the same thing in essence. Just we
> > need do kernel image verification on uefi system, have to port kexec
> > loading code to kernel.
> >
> > Kexec has been a formal feature in our distro, and customers owning
> > those kind of very large machine can make use of this feature to speed
> > up the reboot process. On uefi machine, the kexec_file loading will
> > search place to put kernel under 4G from top to down. As we know, the
> > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > it. It may have possibility to not be able to find a usable space for
> > kernel/initrd. From the top down of the whole memory space, we don't
> > have this worry.
> >
> > And at the first post, I just posted below with AKASHI's
> > walk_system_ram_res_rev() version. Later you suggested to use
> > list_head to link child sibling of resource, see what the code change
> > looks like.
> > http://lkml.kernel.org/r/20180322033722.9279-1-bhe@redhat.com
> >
> > Then I posted v2
> > http://lkml.kernel.org/r/20180408024724.16812-1-bhe@redhat.com
> > Rob Herring mentioned that other components which has this tree struct
> > have planned to do the same thing, replacing the singly linked list with
> > list_head to link resource child sibling. Just quote Rob's words as
> > below. I think this could be another reason.
> >
> > ~~~~~ From Rob
> > The DT struct device_node also has the same tree structure with
> > parent, child, sibling pointers and converting to list_head had been
> > on the todo list for a while. ACPI also has some tree walking
> > functions (drivers/acpi/acpica/pstree.c). Perhaps there should be a
> > common tree struct and helpers defined either on top of list_head or a
> > ~~~~~
> > new struct if that saves some size.
>
> Please let's get all this into the changelogs?

Sorry for the late reply; I was busy with some urgent customer hotplug
issues.

I am rewriting all the changelogs and the cover letter, and while doing
so I found I was wrong about the 2nd reason. The current kexec_file_load
calls kexec_locate_mem_hole() to walk all System RAM regions; if a region
is larger than the kernel or initrd, it searches for a position within
that region from the top down. Since kexec jumps to the 2nd kernel and
does not need to care about the 1st kernel's data, we can always find
usable space under 4G to load the kexec kernel/initrd. So the only reason
for this patch is to stay consistent with kexec_load and avoid confusion.

And since x86 5-level paging has been added, there is another issue with
top-down searching across the whole system RAM: we support switching
between 4-level and 5-level dynamically. Namely, a kernel compiled with
5-level support can be forced to 4-level by adding 'no5lvl'. Consider
jumping from a 5-level kernel to a 4-level kernel: we load the kernel at
the top of system RAM in 5-level paging mode, which might be above 64TB,
then try to jump to a 4-level kernel whose upper limit is 64TB. For this
case, we need to add a limit for kexec kernel loading when running a
5-level kernel.

All this mess makes me hesitant to choose a delicate method. Maybe I
should drop this patchset.

> > > Is that sufficient? Can we instead simplify their lives by providing
> > > better documentation or informative printks or better Kconfig text,
> > > etc?
> > >
> > > And who *are* the people who are performing this configuration? Random
> > > system administrators? Linux distro engineers? If the latter then
> > > they presumably aren't easily confused!
> >
> > Kexec was invented for kernel developer to speed up their kernel
> > rebooting. Now high end sever admin, kernel developer and QE are also
> > keen to use it to reboot large box for faster feature testing, bug
> > debugging. Kernel dev could know this well, about kernel loading
> > position, admin or QE might not be aware of it very well.
> >
> > >
> > > In other words, I'm trying to understand how much benefit this patchset
> > > will provide to our users as a whole.
> >
> > Understood. The list_head replacing patch truly involes too many code
> > changes, it's risky. I am willing to try any idea from reviewers, won't
> > persuit they have to be accepted finally. If don't have a try, we don't
> > know what it looks like, and what impact it may have. I am fine to take
> > AKASHI's simple version of walk_system_ram_res_rev() to lower risk, even
> > though it could be a little bit low efficient.
>
> The larger patch produces a better result. We can handle it ;)

On this point, if we stop changing the kexec top-down searching code, I
am not sure whether we should post the list_head replacement patches
separately.

Thanks
Baoquan

^ permalink raw reply [flat|nested] 83+ messages in thread
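The top-down search and the 64TB concern described in the message above
can be illustrated with a small userspace model. This is only a sketch
under stated assumptions: struct ram_region and locate_hole_top_down()
are hypothetical names, not kernel APIs, and the real
kexec_locate_mem_hole() walks the kernel's resource tree rather than an
array. The limit parameter stands in for the cap that would be needed
when a 5-level kernel loads an image for a 4-level kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one System RAM range, [start, end] inclusive. */
struct ram_region {
	uint64_t start;
	uint64_t end;
};

/*
 * Walk the regions (sorted by ascending address) from the highest one
 * down and pick the highest aligned start address for a hole of memsz
 * bytes that ends at or below limit.
 *
 * Returns 0 and stores the address in *out on success, -1 if no region
 * can hold the hole. This is an illustration, not the kernel's code.
 */
int locate_hole_top_down(const struct ram_region *r, size_t n,
			 uint64_t memsz, uint64_t align,
			 uint64_t limit, uint64_t *out)
{
	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

	if (memsz == 0)
		return -1;

	for (size_t i = n; i-- > 0; ) {
		/* Clamp the usable end of this region to the limit. */
		uint64_t end = r[i].end < limit ? r[i].end : limit;
		uint64_t start;

		if (end < r[i].start)
			continue;	/* region lies entirely above limit */
		if (end - r[i].start < memsz - 1)
			continue;	/* region too small */

		/* Highest start that fits, rounded down to the alignment. */
		start = (end - memsz + 1) & ~(align - 1);
		if (start < r[i].start)
			continue;	/* alignment pushed it out of the region */

		*out = start;
		return 0;
	}
	return -1;
}
```

With limit set to UINT64_MAX the hole comes from the highest region.
With limit set to (1ULL << 46) - 1, the 64TB bound mentioned above, the
result stays below that bound even if RAM extends past it.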
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
2018-07-19 15:17 ` Baoquan He
` (2 preceding siblings ...)
(?)
@ 2018-07-23 14:34 ` Michal Hocko
-1 siblings, 0 replies; 83+ messages in thread
From: Michal Hocko @ 2018-07-23 14:34 UTC (permalink / raw)
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On Thu 19-07-18 23:17:53, Baoquan He wrote:
> Kexec has been a formal feature in our distro, and customers owning
> those kind of very large machine can make use of this feature to speed
> up the reboot process. On uefi machine, the kexec_file loading will
> search place to put kernel under 4G from top to down. As we know, the
> 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> it. It may have possibility to not be able to find a usable space for
> kernel/initrd. From the top down of the whole memory space, we don't
> have this worry.

I do not have the full context here but let me note that you should be
careful when doing top-down reservation because you can easily get into
hotpluggable memory and break the hotremove use case. We even warn when
this is done. See memblock_find_in_range_node

--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-25 6:48 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-25 6:48 UTC (permalink / raw)
To: Michal Hocko
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On 07/23/18 at 04:34pm, Michal Hocko wrote:
> On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > Kexec has been a formal feature in our distro, and customers owning
> > those kind of very large machine can make use of this feature to speed
> > up the reboot process. On uefi machine, the kexec_file loading will
> > search place to put kernel under 4G from top to down. As we know, the
> > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > it. It may have possibility to not be able to find a usable space for
> > kernel/initrd. From the top down of the whole memory space, we don't
> > have this worry.
>
> I do not have the full context here but let me note that you should be
> careful when doing top-down reservation because you can easily get into
> hotplugable memory and break the hotremove usecase. We even warn when
> this is done. See memblock_find_in_range_node

Kexec read kernel/initrd file into buffer, just search usable positions
for them to do the later copying. You can see below struct kexec_segment,
for the old kexec_load, kernel/initrd are read into user space buffer,
the @buf stores the user space buffer address, @mem stores the position
where kernel/initrd will be put. In kernel, it calls
kimage_load_normal_segment() to copy user space buffer to intermediate
pages which are allocated with flag GFP_KERNEL. These intermediate pages
are recorded as entries, later when user execute "kexec -e" to trigger
kexec jumping, it will do the final copying from the intermediate pages
to the real destination pages which @mem pointed. Because we can't touch
the existed data in 1st kernel when do kexec kernel loading. With my
understanding, GFP_KERNEL will make those intermediate pages be
allocated inside immovable area, it won't impact hotplugging. But the
@mem we searched in the whole system RAM might be lost along with
hotplug. Hence we need do kexec kernel again when hotplug event is
detected.

#define KEXEC_CONTROL_MEMORY_GFP (GFP_KERNEL | __GFP_NORETRY)

struct kexec_segment {
        /*
         * This pointer can point to user memory if kexec_load() system
         * call is used or will point to kernel memory if
         * kexec_file_load() system call is used.
         *
         * Use ->buf when expecting to deal with user memory and use ->kbuf
         * when expecting to deal with kernel memory.
         */
        union {
                void __user *buf;
                void *kbuf;
        };
        size_t bufsz;
        unsigned long mem;
        size_t memsz;
};

Thanks
Baoquan
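[Editor's illustration, not part of the original message.] The two-phase flow described above (stage the image into intermediate pages at load time, copy to the @mem destination only at "kexec -e" time) can be sketched in plain userspace C. This is only a simulation under stated assumptions: `staged_page`, `staged_image`, `stage_segment()`, `commit_image()`, `free_image()` and `SKETCH_PAGE_SIZE` are hypothetical names invented here, `calloc()` stands in for the GFP_KERNEL page allocations, and a byte array stands in for physical RAM. It is not the kernel's kimage_load_normal_segment() code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL   /* hypothetical page size for the sketch */

/* Hypothetical stand-in for one recorded entry: an intermediate page plus
 * the destination offset it must be copied to at "kexec -e" time. */
struct staged_page {
    unsigned long dest;                     /* offset inside simulated RAM */
    unsigned char data[SKETCH_PAGE_SIZE];
};

struct staged_image {
    struct staged_page *pages;
    size_t nr_pages;
};

/* Phase 1 (load time): copy the source buffer into intermediate pages.
 * The destination range [mem, mem + memsz) is not touched here. */
static int stage_segment(struct staged_image *img, const void *buf,
                         size_t bufsz, unsigned long mem, size_t memsz)
{
    size_t i, nr;

    /* Assumption of this sketch: memsz is page aligned and covers bufsz. */
    if (memsz == 0 || memsz % SKETCH_PAGE_SIZE != 0 || bufsz > memsz)
        return -1;

    nr = memsz / SKETCH_PAGE_SIZE;
    img->pages = calloc(nr, sizeof(*img->pages));   /* zero-filled pages */
    if (!img->pages)
        return -1;
    img->nr_pages = nr;

    for (i = 0; i < nr; i++) {
        size_t off = i * SKETCH_PAGE_SIZE;
        size_t chunk = 0;

        if (off < bufsz)
            chunk = (bufsz - off > SKETCH_PAGE_SIZE) ? SKETCH_PAGE_SIZE
                                                     : bufsz - off;
        img->pages[i].dest = mem + off;
        if (chunk)
            memcpy(img->pages[i].data,
                   (const unsigned char *)buf + off, chunk);
    }
    return 0;
}

/* Phase 2 (jump time): final copy from intermediate pages to destination. */
static void commit_image(const struct staged_image *img, unsigned char *ram)
{
    size_t i;

    assert(img && ram);
    for (i = 0; i < img->nr_pages; i++)
        memcpy(ram + img->pages[i].dest, img->pages[i].data,
               SKETCH_PAGE_SIZE);
}

static void free_image(struct staged_image *img)
{
    free(img->pages);
    img->pages = NULL;
    img->nr_pages = 0;
}
```

In this sketch the destination stays untouched until commit_image() runs, which is the property the message relies on: loading must not overwrite data the first kernel is still using.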
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
2018-07-25 6:48 ` Baoquan He
(?)
(?)
@ 2018-07-26 12:59 ` Michal Hocko
-1 siblings, 0 replies; 83+ messages in thread
From: Michal Hocko @ 2018-07-26 12:59 UTC (permalink / raw)
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On Wed 25-07-18 14:48:13, Baoquan He wrote:
> On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > Kexec has been a formal feature in our distro, and customers owning
> > > those kind of very large machine can make use of this feature to speed
> > > up the reboot process. On uefi machine, the kexec_file loading will
> > > search place to put kernel under 4G from top to down. As we know, the
> > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > it. It may have possibility to not be able to find a usable space for
> > > kernel/initrd. From the top down of the whole memory space, we don't
> > > have this worry.
> >
> > I do not have the full context here but let me note that you should be
> > careful when doing top-down reservation because you can easily get into
> > hotplugable memory and break the hotremove usecase. We even warn when
> > this is done. See memblock_find_in_range_node
>
> Kexec read kernel/initrd file into buffer, just search usable positions
> for them to do the later copying. You can see below struct kexec_segment,
> for the old kexec_load, kernel/initrd are read into user space buffer,
> the @buf stores the user space buffer address, @mem stores the position
> where kernel/initrd will be put. In kernel, it calls
> kimage_load_normal_segment() to copy user space buffer to intermediate
> pages which are allocated with flag GFP_KERNEL. These intermediate pages
> are recorded as entries, later when user execute "kexec -e" to trigger
> kexec jumping, it will do the final copying from the intermediate pages
> to the real destination pages which @mem pointed. Because we can't touch
> the existed data in 1st kernel when do kexec kernel loading. With my
> understanding, GFP_KERNEL will make those intermediate pages be
> allocated inside immovable area, it won't impact hotplugging. But the
> @mem we searched in the whole system RAM might be lost along with
> hotplug. Hence we need do kexec kernel again when hotplug event is
> detected.

I am not sure I am following. If @mem is placed at movable node then the
memory hotremove simply won't work, because we are seeing reserved pages
and do not know what to do about them. They are not migrateable.
Allocating intermediate pages from other nodes doesn't really help.

The memblock code warns exactly for that reason.
--
Michal Hocko
SUSE Labs
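[Editor's illustration, not part of the original message.] One way to picture the concern raised here is a top-down placement search that refuses to land in hot-removable memory. The sketch below is a self-contained userspace example under stated assumptions: `ram_region`, `find_top_down()` and `NO_RANGE` are hypothetical names, the region table is assumed to be sorted by address and non-overlapping, and `align` is assumed to be a power of two. It is not the memblock or kexec_file implementation, and the patch under discussion is not claimed to behave this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_RANGE UINT64_MAX   /* hypothetical "nothing found" marker */

/* Hypothetical description of one System RAM range. */
struct ram_region {
    uint64_t start;       /* inclusive */
    uint64_t end;         /* exclusive */
    bool hotpluggable;    /* true if the range may be hot-removed */
};

/*
 * Walk the regions from the highest address down and return the highest
 * aligned start address that fits `size` bytes entirely inside a region
 * that is not hotpluggable. Returns NO_RANGE if no region qualifies.
 */
static uint64_t find_top_down(const struct ram_region *r, size_t n,
                              uint64_t size, uint64_t align)
{
    size_t i;

    assert(size != 0);
    assert(align != 0 && (align & (align - 1)) == 0);

    for (i = n; i-- > 0; ) {
        uint64_t cand;

        if (r[i].hotpluggable)
            continue;                     /* would block hotremove */
        if (r[i].end - r[i].start < size)
            continue;                     /* too small */

        cand = (r[i].end - size) & ~(align - 1);   /* align downwards */
        if (cand >= r[i].start)
            return cand;
    }
    return NO_RANGE;
}
```

With a table whose topmost range is marked hotpluggable, the search falls back to the next lower range instead of placing the image where it would pin removable memory.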
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-26 13:09 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-26 13:09 UTC (permalink / raw)
To: Michal Hocko
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On 07/26/18 at 02:59pm, Michal Hocko wrote:
> On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > Kexec has been a formal feature in our distro, and customers owning
> > > > those kind of very large machine can make use of this feature to speed
> > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > search place to put kernel under 4G from top to down. As we know, the
> > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > it. It may have possibility to not be able to find a usable space for
> > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > have this worry.
> > >
> > > I do not have the full context here but let me note that you should be
> > > careful when doing top-down reservation because you can easily get into
> > > hotplugable memory and break the hotremove usecase. We even warn when
> > > this is done. See memblock_find_in_range_node
> >
> > Kexec read kernel/initrd file into buffer, just search usable positions
> > for them to do the later copying. You can see below struct kexec_segment,
> > for the old kexec_load, kernel/initrd are read into user space buffer,
> > the @buf stores the user space buffer address, @mem stores the position
> > where kernel/initrd will be put. In kernel, it calls
> > kimage_load_normal_segment() to copy user space buffer to intermediate
> > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > are recorded as entries, later when user execute "kexec -e" to trigger
> > kexec jumping, it will do the final copying from the intermediate pages
> > to the real destination pages which @mem pointed. Because we can't touch
> > the existed data in 1st kernel when do kexec kernel loading. With my
> > understanding, GFP_KERNEL will make those intermediate pages be
> > allocated inside immovable area, it won't impact hotplugging. But the
> > @mem we searched in the whole system RAM might be lost along with
> > hotplug. Hence we need do kexec kernel again when hotplug event is
> > detected.
>
> I am not sure I am following. If @mem is placed at movable node then the
> memory hotremove simply won't work, because we are seeing reserved pages
> and do not know what to do about them. They are not migrateable.
> Allocating intermediate pages from other nodes doesn't really help.

OK, I forgot the 2nd kernel which kexec jump into. It won't impact
hotremove in 1st kernel, it does impact the kernel which kexec jump into
if kernel is at top of system RAM and the top RAM is in movable node.

>
> The memblock code warns exactly for that reason.
> --
> Michal Hocko
> SUSE Labs
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required @ 2018-07-26 13:09 ` Baoquan He 0 siblings, 0 replies; 83+ messages in thread From: Baoquan He @ 2018-07-26 13:09 UTC (permalink / raw) To: Michal Hocko Cc: Andrew Morton, linux-kernel, robh+dt, dan.j.williams, nicolas.pitre, josh, fengguang.wu, bp, andy.shevchenko, patrik.r.jakobsson, airlied, kys, haiyangz, sthemmin, dmitry.torokhov, frowand.list, keith.busch, jonathan.derrick, lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh, jglisse, thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel, linux-input, linux-nvdimm, devicetree, linux-pci, ebiederm, vgoyal, dyoung, yinghai, monstr, davem, chris, jcmvbkbc, gustavo, maarten.lankhorst, seanpaul, linux-parisc, linuxppc-dev, kexec On 07/26/18 at 02:59pm, Michal Hocko wrote: > On Wed 25-07-18 14:48:13, Baoquan He wrote: > > On 07/23/18 at 04:34pm, Michal Hocko wrote: > > > On Thu 19-07-18 23:17:53, Baoquan He wrote: > > > > Kexec has been a formal feature in our distro, and customers owning > > > > those kind of very large machine can make use of this feature to speed > > > > up the reboot process. On uefi machine, the kexec_file loading will > > > > search place to put kernel under 4G from top to down. As we know, the > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume > > > > it. It may have possibility to not be able to find a usable space for > > > > kernel/initrd. From the top down of the whole memory space, we don't > > > > have this worry. > > > > > > I do not have the full context here but let me note that you should be > > > careful when doing top-down reservation because you can easily get into > > > hotplugable memory and break the hotremove usecase. We even warn when > > > this is done. See memblock_find_in_range_node > > > > Kexec read kernel/initrd file into buffer, just search usable positions > > for them to do the later copying. 
You can see below struct kexec_segment, > > for the old kexec_load, kernel/initrd are read into user space buffer, > > the @buf stores the user space buffer address, @mem stores the position > > where kernel/initrd will be put. In kernel, it calls > > kimage_load_normal_segment() to copy user space buffer to intermediate > > pages which are allocated with flag GFP_KERNEL. These intermediate pages > > are recorded as entries, later when user execute "kexec -e" to trigger > > kexec jumping, it will do the final copying from the intermediate pages > > to the real destination pages which @mem pointed. Because we can't touch > > the existed data in 1st kernel when do kexec kernel loading. With my > > understanding, GFP_KERNEL will make those intermediate pages be > > allocated inside immovable area, it won't impact hotplugging. But the > > @mem we searched in the whole system RAM might be lost along with > > hotplug. Hence we need do kexec kernel again when hotplug event is > > detected. > > I am not sure I am following. If @mem is placed at movable node then the > memory hotremove simply won't work, because we are seeing reserved pages > and do not know what to do about them. They are not migrateable. > Allocating intermediate pages from other nodes doesn't really help. OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove in 1st kernel, it does impact the kernel which kexec jump into if kernel is at top of system RAM and the top RAM is in movable node. > > The memblock code warns exactly for that reason. > -- > Michal Hocko > SUSE Labs ^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
  2018-07-26 13:09 ` Baoquan He
  (?)
  (?)
@ 2018-07-26 13:12 ` Michal Hocko
  -1 siblings, 0 replies; 83+ messages in thread
From: Michal Hocko @ 2018-07-26 13:12 UTC (permalink / raw)
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
    richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
    dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
    patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
    vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
    robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
    monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
    ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On Thu 26-07-18 21:09:04, Baoquan He wrote:
> On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > those kind of very large machine can make use of this feature to speed
> > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > it. It may have possibility to not be able to find a usable space for
> > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > have this worry.
> > > >
> > > > I do not have the full context here but let me note that you should be
> > > > careful when doing top-down reservation because you can easily get into
> > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > this is done. See memblock_find_in_range_node
> > >
> > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > for them to do the later copying. You can see below struct kexec_segment,
> > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > the @buf stores the user space buffer address, @mem stores the position
> > > where kernel/initrd will be put. In kernel, it calls
> > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > kexec jumping, it will do the final copying from the intermediate pages
> > > to the real destination pages which @mem pointed. Because we can't touch
> > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > understanding, GFP_KERNEL will make those intermediate pages be
> > > allocated inside immovable area, it won't impact hotplugging. But the
> > > @mem we searched in the whole system RAM might be lost along with
> > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > detected.
> >
> > I am not sure I am following. If @mem is placed at movable node then the
> > memory hotremove simply won't work, because we are seeing reserved pages
> > and do not know what to do about them. They are not migrateable.
> > Allocating intermediate pages from other nodes doesn't really help.
>
> OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> in 1st kernel, it does impact the kernel which kexec jump into if kernel
> is at top of system RAM and the top RAM is in movable node.

It will affect the 1st kernel (which does the memblock allocation
top-down) as well. For reasons mentioned above.
--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-26 13:14 ` Michal Hocko
0 siblings, 0 replies; 83+ messages in thread
From: Michal Hocko @ 2018-07-26 13:14 UTC (permalink / raw)
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
    richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
    dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
    patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
    vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
    robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
    monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
    ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On Thu 26-07-18 15:12:42, Michal Hocko wrote:
> On Thu 26-07-18 21:09:04, Baoquan He wrote:
> > On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > > those kind of very large machine can make use of this feature to speed
> > > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > > it. It may have possibility to not be able to find a usable space for
> > > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > > have this worry.
> > > > >
> > > > > I do not have the full context here but let me note that you should be
> > > > > careful when doing top-down reservation because you can easily get into
> > > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > > this is done. See memblock_find_in_range_node
> > > >
> > > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > > for them to do the later copying. You can see below struct kexec_segment,
> > > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > > the @buf stores the user space buffer address, @mem stores the position
> > > > where kernel/initrd will be put. In kernel, it calls
> > > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > > kexec jumping, it will do the final copying from the intermediate pages
> > > > to the real destination pages which @mem pointed. Because we can't touch
> > > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > > understanding, GFP_KERNEL will make those intermediate pages be
> > > > allocated inside immovable area, it won't impact hotplugging. But the
> > > > @mem we searched in the whole system RAM might be lost along with
> > > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > > detected.
> > >
> > > I am not sure I am following. If @mem is placed at movable node then the
> > > memory hotremove simply won't work, because we are seeing reserved pages
> > > and do not know what to do about them. They are not migrateable.
> > > Allocating intermediate pages from other nodes doesn't really help.
> >
> > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> > in 1st kernel, it does impact the kernel which kexec jump into if kernel
> > is at top of system RAM and the top RAM is in movable node.
>
> It will affect the 1st kernel (which does the memblock allocation
> top-down) as well. For reasons mentioned above.

And btw. in the ideal world, we would restrict the memblock allocation
top-down from the non-movable nodes. But I do not think we have that
information ready at the time when the reservation is done.
--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 83+ messages in thread
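The "ideal world" restriction mentioned above, a top-down search that only considers non-movable memory, can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the memblock implementation: struct ram_region and find_top_down() are hypothetical names, and the sketch simply assumes the hotpluggable flag is already known for every range, which is exactly the information the message above says may not be ready when the reservation is done.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one RAM range; not a kernel structure. */
struct ram_region {
	uint64_t start;      /* first byte of the range (inclusive) */
	uint64_t end;        /* one past the last byte (exclusive) */
	bool hotpluggable;   /* true if the range may be hot-removed */
};

/*
 * Walk the regions from the highest address down and return the highest
 * aligned base at which `size` bytes fit inside one non-hotpluggable region.
 * `regions` must be sorted by ascending start; `align` must be a power of two.
 * Returns 0 when nothing fits (0 means "no address" in this sketch).
 */
static uint64_t find_top_down(const struct ram_region *regions, size_t n,
			      uint64_t size, uint64_t align)
{
	for (size_t i = n; i-- > 0; ) {
		const struct ram_region *r = &regions[i];

		if (r->hotpluggable)
			continue;	/* never place the image in removable RAM */
		if (r->end - r->start < size)
			continue;	/* region too small */

		/* Highest aligned base that still leaves room for size bytes. */
		uint64_t base = (r->end - size) & ~(align - 1);
		if (base >= r->start)
			return base;
	}
	return 0;
}
```

With a table whose topmost range is marked hotpluggable, the search falls back to the highest range below it, so a later hotremove of the top range is not blocked by the placement.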
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-26 13:14 ` Michal Hocko
0 siblings, 0 replies; 83+ messages in thread
From: Michal Hocko @ 2018-07-26 13:14 UTC (permalink / raw)
To: Baoquan He
Cc: Andrew Morton, linux-kernel, robh+dt, dan.j.williams, nicolas.pitre,
josh, fengguang.wu, bp, andy.shevchenko, patrik.r.jakobsson, airlied,
kys, haiyangz, sthemmin, dmitry.torokhov, frowand.list, keith.busch,
jonathan.derrick, lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh,
jglisse, thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel,
linux-input, linux-nvdimm, devicetree, linux-pci, ebiederm, vgoyal,
dyoung, yinghai, monstr, davem, chris, jcmvbkbc, gustavo,
maarten.lankhorst, seanpaul, linux-parisc, linuxppc-dev, kexec

On Thu 26-07-18 15:12:42, Michal Hocko wrote:
> On Thu 26-07-18 21:09:04, Baoquan He wrote:
> > On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > > those kind of very large machine can make use of this feature to speed
> > > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > > it. It may have possibility to not be able to find a usable space for
> > > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > > have this worry.
> > > > >
> > > > > I do not have the full context here but let me note that you should be
> > > > > careful when doing top-down reservation because you can easily get into
> > > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > > this is done. See memblock_find_in_range_node
> > > >
> > > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > > for them to do the later copying. You can see below struct kexec_segment,
> > > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > > the @buf stores the user space buffer address, @mem stores the position
> > > > where kernel/initrd will be put. In kernel, it calls
> > > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > > kexec jumping, it will do the final copying from the intermediate pages
> > > > to the real destination pages which @mem pointed. Because we can't touch
> > > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > > understanding, GFP_KERNEL will make those intermediate pages be
> > > > allocated inside immovable area, it won't impact hotplugging. But the
> > > > @mem we searched in the whole system RAM might be lost along with
> > > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > > detected.
> > >
> > > I am not sure I am following. If @mem is placed at movable node then the
> > > memory hotremove simply won't work, because we are seeing reserved pages
> > > and do not know what to do about them. They are not migrateable.
> > > Allocating intermediate pages from other nodes doesn't really help.
> >
> > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> > in 1st kernel, it does impact the kernel which kexec jump into if kernel
> > is at top of system RAM and the top RAM is in movable node.
>
> It will affect the 1st kernel (which does the memblock allocation
> top-down) as well. For reasons mentioned above.

And btw. in the ideal world, we would restrict the memblock allocation
top-down from the non-movable nodes. But I do not think we have that
information ready at the time when the reservation is done.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 83+ messages in thread
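The restriction Michal describes (search top-down, but only in memory that cannot be hot-removed) can be illustrated with a small user-space sketch. This is not kernel code and not what the patch does: `struct ram_range`, its `movable` flag and `find_top_down()` are hypothetical names made up for illustration, and the sketch assumes the caller already knows which ranges are movable, which is exactly the information the thread says may not be available when a reservation is made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one free RAM range; not a kernel type. */
struct ram_range {
	uint64_t start;   /* first byte of the range (inclusive) */
	uint64_t end;     /* one past the last byte (exclusive) */
	bool movable;     /* assumed flag: range may be hot-removed */
};

/*
 * Walk the ranges from the highest to the lowest and store in *out the
 * highest aligned address at which `size` bytes fit inside a
 * non-movable range. Returns false if no such range exists.
 *
 * Assumptions: ranges[] is sorted by ascending address, entries do not
 * overlap, and `align` is a power of two.
 */
static bool find_top_down(const struct ram_range *ranges, size_t n,
			  uint64_t size, uint64_t align, uint64_t *out)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	for (size_t i = n; i-- > 0; ) {
		const struct ram_range *r = &ranges[i];

		if (r->movable)
			continue;	/* placing data here would block hot-remove */
		if (r->end - r->start < size)
			continue;	/* range is too small */

		/* Highest aligned start that still ends at or below r->end. */
		uint64_t cand = (r->end - size) & ~(align - 1);

		if (cand >= r->start) {
			*out = cand;
			return true;
		}
	}
	return false;
}
```

With a movable range at the very top of RAM, the search skips it and lands in the highest non-movable range instead, which is the behaviour described as the "ideal world" above.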
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-26 13:37 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-26 13:37 UTC (permalink / raw)
To: Michal Hocko
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On 07/26/18 at 03:14pm, Michal Hocko wrote:
> On Thu 26-07-18 15:12:42, Michal Hocko wrote:
> > On Thu 26-07-18 21:09:04, Baoquan He wrote:
> > > On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > > > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > > > those kind of very large machine can make use of this feature to speed
> > > > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > > > it. It may have possibility to not be able to find a usable space for
> > > > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > > > have this worry.
> > > > > >
> > > > > > I do not have the full context here but let me note that you should be
> > > > > > careful when doing top-down reservation because you can easily get into
> > > > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > > > this is done. See memblock_find_in_range_node
> > > > >
> > > > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > > > for them to do the later copying. You can see below struct kexec_segment,
> > > > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > > > the @buf stores the user space buffer address, @mem stores the position
> > > > > where kernel/initrd will be put. In kernel, it calls
> > > > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > > > kexec jumping, it will do the final copying from the intermediate pages
> > > > > to the real destination pages which @mem pointed. Because we can't touch
> > > > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > > > understanding, GFP_KERNEL will make those intermediate pages be
> > > > > allocated inside immovable area, it won't impact hotplugging. But the
> > > > > @mem we searched in the whole system RAM might be lost along with
> > > > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > > > detected.
> > > >
> > > > I am not sure I am following. If @mem is placed at movable node then the
> > > > memory hotremove simply won't work, because we are seeing reserved pages
> > > > and do not know what to do about them. They are not migrateable.
> > > > Allocating intermediate pages from other nodes doesn't really help.
> > >
> > > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> > > in 1st kernel, it does impact the kernel which kexec jump into if kernel
> > > is at top of system RAM and the top RAM is in movable node.
> >
> > It will affect the 1st kernel (which does the memblock allocation
> > top-down) as well. For reasons mentioned above.
>
> And btw. in the ideal world, we would restrict the memblock allocation
> top-down from the non-movable nodes. But I do not think we have that
> information ready at the time when the reservation is done.

Oh, you could mix kexec loading up with kdump kernel loading. For kdump
kernel, we need reserve memory region during bootup with memblock
allocator. For kexec loading, we just operate after system up, and do
not need to reserve any memmory region. About memory used to load them,
it's quite different way.

Thanks
Baoquan

^ permalink raw reply [flat|nested] 83+ messages in thread
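The kexec_load flow quoted in this message (source buffer, then intermediate pages at load time, then a final copy to the destination at "kexec -e" time) can be sketched in user space as two separate steps. Everything below is a hypothetical stand-in: `struct segment` only mirrors the @buf/@mem roles mentioned in the thread and is not the kernel's struct kexec_segment, `malloc()` stands in for the GFP_KERNEL intermediate pages, and zero-filling the tail is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fields discussed in the thread. */
struct segment {
	const void *buf;   /* source image (a user space buffer in kexec_load) */
	size_t bufsz;      /* bytes available at buf */
	void *mem;         /* final destination chosen at load time */
	size_t memsz;      /* bytes reserved at mem, must be >= bufsz */
	void *staged;      /* intermediate copy, owned by this struct */
};

/*
 * Load time: copy the image aside. The destination is deliberately not
 * written, because the running system may still be using that memory.
 */
static int segment_stage(struct segment *seg)
{
	assert(seg->bufsz <= seg->memsz);

	seg->staged = malloc(seg->memsz);
	if (!seg->staged)
		return -1;

	memcpy(seg->staged, seg->buf, seg->bufsz);
	/* Assumption of this sketch: pad the remainder with zeroes. */
	memset((char *)seg->staged + seg->bufsz, 0, seg->memsz - seg->bufsz);
	return 0;
}

/* Execute time: move the staged data to its final destination. */
static void segment_commit(struct segment *seg)
{
	memcpy(seg->mem, seg->staged, seg->memsz);
	free(seg->staged);
	seg->staged = NULL;
}
```

The split makes the point under discussion visible: the staging allocation and the final destination are independent, so where the intermediate copy lives says nothing about whether the destination sits in removable memory.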
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
2018-07-26 13:37 ` Baoquan He
(?)
(?)
@ 2018-07-26 14:01 ` Michal Hocko
-1 siblings, 0 replies; 83+ messages in thread
From: Michal Hocko @ 2018-07-26 14:01 UTC (permalink / raw)
To: Baoquan He
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys, frowand.list,
dan.j.williams, lorenzo.pieralisi, sthemmin, linux-nvdimm,
patrik.r.jakobsson, andy.shevchenko, linux-input, gustavo, bp, dyoung,
vgoyal, thomas.lendacky, haiyangz, maarten.lankhorst, josh, jglisse,
robh+dt, seanpaul, bhelgaas, tglx, yinghai, jonathan.derrick, chris,
monstr, linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel,
ebiederm, devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On Thu 26-07-18 21:37:05, Baoquan He wrote:
> On 07/26/18 at 03:14pm, Michal Hocko wrote:
> > On Thu 26-07-18 15:12:42, Michal Hocko wrote:
> > > On Thu 26-07-18 21:09:04, Baoquan He wrote:
> > > > On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > > > > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > > > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > > > > those kind of very large machine can make use of this feature to speed
> > > > > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > > > > it. It may have possibility to not be able to find a usable space for
> > > > > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > > > > have this worry.
> > > > > > >
> > > > > > > I do not have the full context here but let me note that you should be
> > > > > > > careful when doing top-down reservation because you can easily get into
> > > > > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > > > > this is done. See memblock_find_in_range_node
> > > > > >
> > > > > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > > > > for them to do the later copying. You can see below struct kexec_segment,
> > > > > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > > > > the @buf stores the user space buffer address, @mem stores the position
> > > > > > where kernel/initrd will be put. In kernel, it calls
> > > > > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > > > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > > > > kexec jumping, it will do the final copying from the intermediate pages
> > > > > > to the real destination pages which @mem pointed. Because we can't touch
> > > > > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > > > > understanding, GFP_KERNEL will make those intermediate pages be
> > > > > > allocated inside immovable area, it won't impact hotplugging. But the
> > > > > > @mem we searched in the whole system RAM might be lost along with
> > > > > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > > > > detected.
> > > > >
> > > > > I am not sure I am following. If @mem is placed at movable node then the
> > > > > memory hotremove simply won't work, because we are seeing reserved pages
> > > > > and do not know what to do about them. They are not migrateable.
> > > > > Allocating intermediate pages from other nodes doesn't really help.
> > > >
> > > > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> > > > in 1st kernel, it does impact the kernel which kexec jump into if kernel
> > > > is at top of system RAM and the top RAM is in movable node.
> > >
> > > It will affect the 1st kernel (which does the memblock allocation
> > > top-down) as well. For reasons mentioned above.
> >
> > And btw. in the ideal world, we would restrict the memblock allocation
> > top-down from the non-movable nodes. But I do not think we have that
> > information ready at the time when the reservation is done.
>
> Oh, you could mix kexec loading up with kdump kernel loading. For kdump
> kernel, we need reserve memory region during bootup with memblock
> allocator. For kexec loading, we just operate after system up, and do
> not need to reserve any memmory region. About memory used to load them,
> it's quite different way.

I didn't know about that. I thought both use the same underlying
reservation mechanism. My bad and sorry for the noise.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
2018-07-26 14:01 ` Michal Hocko
` (2 preceding siblings ...)
(?)
@ 2018-07-26 15:10 ` Baoquan He
-1 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-26 15:10 UTC (permalink / raw)
To: Michal Hocko
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, keith.busch, jcmvbkbc, baiyaowei, kys,
frowand.list, dan.j.williams, lorenzo.pieralisi, sthemmin,
linux-nvdimm, patrik.r.jakobsson, andy.shevchenko, linux-input,
gustavo, bp, dyoung, vgoyal, thomas.lendacky, haiyangz,
maarten.lankhorst, josh, jglisse, robh+dt, seanpaul, bhelgaas,
tglx, yinghai, jonathan.derrick, chris, monstr, linux-parisc,
gregkh, dmitry.torokhov, kexec, linux-kernel, ebiederm, devel,
Andrew Morton, fengguang.wu, linuxppc-dev, davem

On 07/26/18 at 04:01pm, Michal Hocko wrote:
> On Thu 26-07-18 21:37:05, Baoquan He wrote:
> > On 07/26/18 at 03:14pm, Michal Hocko wrote:
> > > On Thu 26-07-18 15:12:42, Michal Hocko wrote:
> > > > On Thu 26-07-18 21:09:04, Baoquan He wrote:
> > > > > On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > > > > > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > > > > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > > > > > those kind of very large machine can make use of this feature to speed
> > > > > > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > > > > > it. It may have possibility to not be able to find a usable space for
> > > > > > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > > > > > have this worry.
> > > > > > > > >
> > > > > > > > I do not have the full context here but let me note that you should be
> > > > > > > > careful when doing top-down reservation because you can easily get into
> > > > > > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > > > > > this is done. See memblock_find_in_range_node
> > > > > > > >
> > > > > > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > > > > > for them to do the later copying. You can see below struct kexec_segment,
> > > > > > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > > > > > the @buf stores the user space buffer address, @mem stores the position
> > > > > > > where kernel/initrd will be put. In kernel, it calls
> > > > > > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > > > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > > > > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > > > > > kexec jumping, it will do the final copying from the intermediate pages
> > > > > > > to the real destination pages which @mem pointed. Because we can't touch
> > > > > > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > > > > > understanding, GFP_KERNEL will make those intermediate pages be
> > > > > > > allocated inside immovable area, it won't impact hotplugging. But the
> > > > > > > @mem we searched in the whole system RAM might be lost along with
> > > > > > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > > > > > detected.
> > > > > > >
> > > > > > I am not sure I am following. If @mem is placed at movable node then the
> > > > > > memory hotremove simply won't work, because we are seeing reserved pages
> > > > > > and do not know what to do about them. They are not migrateable.
> > > > > > Allocating intermediate pages from other nodes doesn't really help.
> > > > > >
> > > > > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> > > > > in 1st kernel, it does impact the kernel which kexec jump into if kernel
> > > > > is at top of system RAM and the top RAM is in movable node.
> > > > >
> > > > It will affect the 1st kernel (which does the memblock allocation
> > > > top-down) as well. For reasons mentioned above.
> > > >
> > > And btw. in the ideal world, we would restrict the memblock allocation
> > > top-down from the non-movable nodes. But I do not think we have that
> > > information ready at the time when the reservation is done.
> > >
> > Oh, you could mix kexec loading up with kdump kernel loading. For kdump
> > kernel, we need reserve memory region during bootup with memblock
> > allocator. For kexec loading, we just operate after system up, and do
> > not need to reserve any memmory region. About memory used to load them,
> > it's quite different way.
>
> I didn't know about that. I thought both use the same underlying
> reservation mechanism. My bad and sorry for the noise.

Not at all. It's truly confusing. I often need take time to recall those
details.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required @ 2018-07-26 15:10 ` Baoquan He 0 siblings, 0 replies; 83+ messages in thread From: Baoquan He @ 2018-07-26 15:10 UTC (permalink / raw) To: Michal Hocko Cc: Andrew Morton, linux-kernel, robh+dt, dan.j.williams, nicolas.pitre, josh, fengguang.wu, bp, andy.shevchenko, patrik.r.jakobsson, airlied, kys, haiyangz, sthemmin, dmitry.torokhov, frowand.list, keith.busch, jonathan.derrick, lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh, jglisse, thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel, linux-input, linux-nvdimm, devicetree, linux-pci, ebiederm, vgoyal, dyoung, yinghai, monstr, davem, chris, jcmvbkbc, gustavo, maarten.lankhorst, seanpaul, linux-parisc, linuxppc-dev, kexec On 07/26/18 at 04:01pm, Michal Hocko wrote: > On Thu 26-07-18 21:37:05, Baoquan He wrote: > > On 07/26/18 at 03:14pm, Michal Hocko wrote: > > > On Thu 26-07-18 15:12:42, Michal Hocko wrote: > > > > On Thu 26-07-18 21:09:04, Baoquan He wrote: > > > > > On 07/26/18 at 02:59pm, Michal Hocko wrote: > > > > > > On Wed 25-07-18 14:48:13, Baoquan He wrote: > > > > > > > On 07/23/18 at 04:34pm, Michal Hocko wrote: > > > > > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote: > > > > > > > > > Kexec has been a formal feature in our distro, and customers owning > > > > > > > > > those kind of very large machine can make use of this feature to speed > > > > > > > > > up the reboot process. On uefi machine, the kexec_file loading will > > > > > > > > > search place to put kernel under 4G from top to down. As we know, the > > > > > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume > > > > > > > > > it. It may have possibility to not be able to find a usable space for > > > > > > > > > kernel/initrd. From the top down of the whole memory space, we don't > > > > > > > > > have this worry. 
> > > > > > > > > > > > > > > > I do not have the full context here but let me note that you should be > > > > > > > > careful when doing top-down reservation because you can easily get into > > > > > > > > hotplugable memory and break the hotremove usecase. We even warn when > > > > > > > > this is done. See memblock_find_in_range_node > > > > > > > > > > > > > > Kexec read kernel/initrd file into buffer, just search usable positions > > > > > > > for them to do the later copying. You can see below struct kexec_segment, > > > > > > > for the old kexec_load, kernel/initrd are read into user space buffer, > > > > > > > the @buf stores the user space buffer address, @mem stores the position > > > > > > > where kernel/initrd will be put. In kernel, it calls > > > > > > > kimage_load_normal_segment() to copy user space buffer to intermediate > > > > > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages > > > > > > > are recorded as entries, later when user execute "kexec -e" to trigger > > > > > > > kexec jumping, it will do the final copying from the intermediate pages > > > > > > > to the real destination pages which @mem pointed. Because we can't touch > > > > > > > the existed data in 1st kernel when do kexec kernel loading. With my > > > > > > > understanding, GFP_KERNEL will make those intermediate pages be > > > > > > > allocated inside immovable area, it won't impact hotplugging. But the > > > > > > > @mem we searched in the whole system RAM might be lost along with > > > > > > > hotplug. Hence we need do kexec kernel again when hotplug event is > > > > > > > detected. > > > > > > > > > > > > I am not sure I am following. If @mem is placed at movable node then the > > > > > > memory hotremove simply won't work, because we are seeing reserved pages > > > > > > and do not know what to do about them. They are not migrateable. > > > > > > Allocating intermediate pages from other nodes doesn't really help. 
> > > > > > > > > > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove > > > > > in 1st kernel, it does impact the kernel which kexec jump into if kernel > > > > > is at top of system RAM and the top RAM is in movable node. > > > > > > > > It will affect the 1st kernel (which does the memblock allocation > > > > top-down) as well. For reasons mentioned above. > > > > > > And btw. in the ideal world, we would restrict the memblock allocation > > > top-down from the non-movable nodes. But I do not think we have that > > > information ready at the time when the reservation is done. > > > > Oh, you could mix kexec loading up with kdump kernel loading. For kdump > > kernel, we need reserve memory region during bootup with memblock > > allocator. For kexec loading, we just operate after system up, and do > > not need to reserve any memmory region. About memory used to load them, > > it's quite different way. > > I didn't know about that. I thought both use the same underlying > reservation mechanism. My bad and sorry for the noise. Not at all. It's truly confusing. I often need take time to recall those details. ^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required @ 2018-07-26 15:10 ` Baoquan He 0 siblings, 0 replies; 83+ messages in thread From: Baoquan He @ 2018-07-26 15:10 UTC (permalink / raw) To: Michal Hocko Cc: Andrew Morton, linux-kernel, robh+dt, dan.j.williams, nicolas.pitre, josh, fengguang.wu, bp, andy.shevchenko, patrik.r.jakobsson, airlied, kys, haiyangz, sthemmin, dmitry.torokhov, frowand.list, keith.busch, jonathan.derrick, lorenzo.pieralisi, bhelgaas, tglx, brijesh.singh, jglisse, thomas.lendacky, gregkh, baiyaowei, richard.weiyang, devel, linux-input, linux-nvdimm, devicetree On 07/26/18 at 04:01pm, Michal Hocko wrote: > On Thu 26-07-18 21:37:05, Baoquan He wrote: > > On 07/26/18 at 03:14pm, Michal Hocko wrote: > > > On Thu 26-07-18 15:12:42, Michal Hocko wrote: > > > > On Thu 26-07-18 21:09:04, Baoquan He wrote: > > > > > On 07/26/18 at 02:59pm, Michal Hocko wrote: > > > > > > On Wed 25-07-18 14:48:13, Baoquan He wrote: > > > > > > > On 07/23/18 at 04:34pm, Michal Hocko wrote: > > > > > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote: > > > > > > > > > Kexec has been a formal feature in our distro, and customers owning > > > > > > > > > those kind of very large machine can make use of this feature to speed > > > > > > > > > up the reboot process. On uefi machine, the kexec_file loading will > > > > > > > > > search place to put kernel under 4G from top to down. As we know, the > > > > > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume > > > > > > > > > it. It may have possibility to not be able to find a usable space for > > > > > > > > > kernel/initrd. From the top down of the whole memory space, we don't > > > > > > > > > have this worry. > > > > > > > > > > > > > > > > I do not have the full context here but let me note that you should be > > > > > > > > careful when doing top-down reservation because you can easily get into > > > > > > > > hotplugable memory and break the hotremove usecase. 
We even warn when > > > > > > > > this is done. See memblock_find_in_range_node > > > > > > > > > > > > > > Kexec read kernel/initrd file into buffer, just search usable positions > > > > > > > for them to do the later copying. You can see below struct kexec_segment, > > > > > > > for the old kexec_load, kernel/initrd are read into user space buffer, > > > > > > > the @buf stores the user space buffer address, @mem stores the position > > > > > > > where kernel/initrd will be put. In kernel, it calls > > > > > > > kimage_load_normal_segment() to copy user space buffer to intermediate > > > > > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages > > > > > > > are recorded as entries, later when user execute "kexec -e" to trigger > > > > > > > kexec jumping, it will do the final copying from the intermediate pages > > > > > > > to the real destination pages which @mem pointed. Because we can't touch > > > > > > > the existed data in 1st kernel when do kexec kernel loading. With my > > > > > > > understanding, GFP_KERNEL will make those intermediate pages be > > > > > > > allocated inside immovable area, it won't impact hotplugging. But the > > > > > > > @mem we searched in the whole system RAM might be lost along with > > > > > > > hotplug. Hence we need do kexec kernel again when hotplug event is > > > > > > > detected. > > > > > > > > > > > > I am not sure I am following. If @mem is placed at movable node then the > > > > > > memory hotremove simply won't work, because we are seeing reserved pages > > > > > > and do not know what to do about them. They are not migrateable. > > > > > > Allocating intermediate pages from other nodes doesn't really help. > > > > > > > > > > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove > > > > > in 1st kernel, it does impact the kernel which kexec jump into if kernel > > > > > is at top of system RAM and the top RAM is in movable node. 
> > > > > > > > It will affect the 1st kernel (which does the memblock allocation > > > > top-down) as well. For reasons mentioned above. > > > > > > And btw. in the ideal world, we would restrict the memblock allocation > > > top-down from the non-movable nodes. But I do not think we have that > > > information ready at the time when the reservation is done. > > > > Oh, you could mix kexec loading up with kdump kernel loading. For kdump > > kernel, we need reserve memory region during bootup with memblock > > allocator. For kexec loading, we just operate after system up, and do > > not need to reserve any memmory region. About memory used to load them, > > it's quite different way. > > I didn't know about that. I thought both use the same underlying > reservation mechanism. My bad and sorry for the noise. Not at all. It's truly confusing. I often need take time to recall those details. ^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required
@ 2018-07-26 15:10 ` Baoquan He
0 siblings, 0 replies; 83+ messages in thread
From: Baoquan He @ 2018-07-26 15:10 UTC (permalink / raw)
To: Michal Hocko
Cc: nicolas.pitre, brijesh.singh, devicetree, airlied, linux-pci,
richard.weiyang, jcmvbkbc, baiyaowei, kys, frowand.list,
lorenzo.pieralisi, sthemmin, linux-nvdimm, patrik.r.jakobsson,
andy.shevchenko, linux-input, gustavo, bp, dyoung, thomas.lendacky,
haiyangz, maarten.lankhorst, josh, jglisse, robh+dt, seanpaul,
bhelgaas, tglx, yinghai, jonathan.derrick, chris, monstr,
linux-parisc, gregkh, dmitry.torokhov, kexec, linux-kernel, ebiederm,
devel, Andrew Morton, fengguang.wu, linuxppc-dev, davem

On 07/26/18 at 04:01pm, Michal Hocko wrote:
> On Thu 26-07-18 21:37:05, Baoquan He wrote:
> > On 07/26/18 at 03:14pm, Michal Hocko wrote:
> > > On Thu 26-07-18 15:12:42, Michal Hocko wrote:
> > > > On Thu 26-07-18 21:09:04, Baoquan He wrote:
> > > > > On 07/26/18 at 02:59pm, Michal Hocko wrote:
> > > > > > On Wed 25-07-18 14:48:13, Baoquan He wrote:
> > > > > > > On 07/23/18 at 04:34pm, Michal Hocko wrote:
> > > > > > > > On Thu 19-07-18 23:17:53, Baoquan He wrote:
> > > > > > > > > Kexec has been a formal feature in our distro, and customers owning
> > > > > > > > > those kind of very large machine can make use of this feature to speed
> > > > > > > > > up the reboot process. On uefi machine, the kexec_file loading will
> > > > > > > > > search place to put kernel under 4G from top to down. As we know, the
> > > > > > > > > 1st 4G space is DMA32 ZONE, dma, pci mmcfg, bios etc all try to consume
> > > > > > > > > it. It may have possibility to not be able to find a usable space for
> > > > > > > > > kernel/initrd. From the top down of the whole memory space, we don't
> > > > > > > > > have this worry.
> > > > > > > >
> > > > > > > > I do not have the full context here but let me note that you should be
> > > > > > > > careful when doing top-down reservation because you can easily get into
> > > > > > > > hotplugable memory and break the hotremove usecase. We even warn when
> > > > > > > > this is done. See memblock_find_in_range_node
> > > > > > >
> > > > > > > Kexec read kernel/initrd file into buffer, just search usable positions
> > > > > > > for them to do the later copying. You can see below struct kexec_segment,
> > > > > > > for the old kexec_load, kernel/initrd are read into user space buffer,
> > > > > > > the @buf stores the user space buffer address, @mem stores the position
> > > > > > > where kernel/initrd will be put. In kernel, it calls
> > > > > > > kimage_load_normal_segment() to copy user space buffer to intermediate
> > > > > > > pages which are allocated with flag GFP_KERNEL. These intermediate pages
> > > > > > > are recorded as entries, later when user execute "kexec -e" to trigger
> > > > > > > kexec jumping, it will do the final copying from the intermediate pages
> > > > > > > to the real destination pages which @mem pointed. Because we can't touch
> > > > > > > the existed data in 1st kernel when do kexec kernel loading. With my
> > > > > > > understanding, GFP_KERNEL will make those intermediate pages be
> > > > > > > allocated inside immovable area, it won't impact hotplugging. But the
> > > > > > > @mem we searched in the whole system RAM might be lost along with
> > > > > > > hotplug. Hence we need do kexec kernel again when hotplug event is
> > > > > > > detected.
> > > > > >
> > > > > > I am not sure I am following. If @mem is placed at movable node then the
> > > > > > memory hotremove simply won't work, because we are seeing reserved pages
> > > > > > and do not know what to do about them. They are not migrateable.
> > > > > > Allocating intermediate pages from other nodes doesn't really help.
> > > > >
> > > > > OK, I forgot the 2nd kernel which kexec jump into. It won't impact hotremove
> > > > > in 1st kernel, it does impact the kernel which kexec jump into if kernel
> > > > > is at top of system RAM and the top RAM is in movable node.
> > > >
> > > > It will affect the 1st kernel (which does the memblock allocation
> > > > top-down) as well. For reasons mentioned above.
> > >
> > > And btw. in the ideal world, we would restrict the memblock allocation
> > > top-down from the non-movable nodes. But I do not think we have that
> > > information ready at the time when the reservation is done.
> >
> > Oh, you could mix kexec loading up with kdump kernel loading. For kdump
> > kernel, we need reserve memory region during bootup with memblock
> > allocator. For kexec loading, we just operate after system up, and do
> > not need to reserve any memmory region. About memory used to load them,
> > it's quite different way.
>
> I didn't know about that. I thought both use the same underlying
> reservation mechanism. My bad and sorry for the noise.

Not at all. It's truly confusing. I often need take time to recall those
details.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply [flat|nested] 83+ messages in thread
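For readers following the exchange above, here is a minimal user-space sketch of the top-down placement under discussion. It is not the kernel implementation and is not taken from the patch: `struct ram_range`, its `movable` flag, and `find_top_down()` are hypothetical names invented for illustration. The function walks a list of RAM ranges from the highest address down and returns the first aligned slot that fits, optionally skipping ranges marked movable, which roughly corresponds to the "ideal world" restriction mentioned in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one System RAM range, [start, end] inclusive. */
struct ram_range {
	uint64_t start;
	uint64_t end;
	bool movable;	/* assumption: true if the range may be hot-removed */
};

/*
 * Search ranges[] (sorted by ascending address) from the top down for a
 * slot of `size` bytes aligned to `align` (a power of two). Returns the
 * start address of the chosen slot, or UINT64_MAX if nothing fits.
 */
static uint64_t find_top_down(const struct ram_range *ranges, size_t n,
			      uint64_t size, uint64_t align, bool skip_movable)
{
	for (size_t i = n; i-- > 0; ) {
		const struct ram_range *r = &ranges[i];
		uint64_t len = r->end - r->start + 1;
		uint64_t addr;

		/* Optionally avoid memory that could later be hot-removed. */
		if (skip_movable && r->movable)
			continue;
		if (size == 0 || len < size)
			continue;

		/* Highest candidate start, rounded down to the alignment. */
		addr = (r->end - size + 1) & ~(align - 1);
		if (addr >= r->start)
			return addr;
	}
	return UINT64_MAX;
}
```

With `skip_movable` set to false the search lands in the topmost range even when that range is movable, which is the hotremove concern raised in the thread; setting it to true falls back to the highest non-movable range instead.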
end of thread, other threads:[~2018-07-26 16:27 UTC | newest]
Thread overview: 83+ messages
2018-07-18 2:49 [PATCH v7 0/4] resource: Use list_head to link sibling resource Baoquan He
[not found] ` <20180718024944.577-1-bhe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2018-07-18 2:49 ` [PATCH v7 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public Baoquan He
2018-07-18 16:36 ` Andy Shevchenko
[not found] ` <CAHp75VdO88ydJQ9GHdaDUmAmzL6QHR=US6JiXZ1R_EEA-xWR1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-07-18 16:37 ` Andy Shevchenko
[not found] ` <CAHp75Vf2yEwHhEhhQH2XN+pOQ=-skiAHZ=FgLnfVV8vcm59qeQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-07-19 15:18 ` Baoquan He
2018-07-18 2:49 ` [PATCH v7 2/4] resource: Use list_head to link sibling resource Baoquan He
2018-07-18 2:49 ` [PATCH v7 3/4] resource: add walk_system_ram_res_rev() Baoquan He
2018-07-18 2:49 ` [PATCH v7 4/4] kexec_file: Load kernel at top of system RAM if required Baoquan He
2018-07-18 22:33 ` Andrew Morton
2018-07-19 15:17 ` Baoquan He
2018-07-19 19:44 ` Andrew Morton
2018-07-25 2:21 ` Baoquan He
2018-07-23 14:34 ` Michal Hocko
2018-07-25 6:48 ` Baoquan He
2018-07-26 12:59 ` Michal Hocko
2018-07-26 13:09 ` Baoquan He
2018-07-26 13:12 ` Michal Hocko
2018-07-26 13:14 ` Michal Hocko
2018-07-26 13:37 ` Baoquan He
2018-07-26 14:01 ` Michal Hocko
2018-07-26 15:10 ` Baoquan He