* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Dave Young @ 2019-02-18  1:48 UTC
To: Borislav Petkov
Cc: Randy Dunlap, konrad.wilk, x86, kexec, Jerry Hoemann, Pingfan Liu, linux-kernel, iommu, Mike Rapoport, Andrew Morton, yinghai, vgoyal

On 02/15/19 at 11:24am, Borislav Petkov wrote:
> On Tue, Feb 12, 2019 at 04:48:16AM +0800, Dave Young wrote:
> > Even we make it automatic in kernel, but we have to have some default
> > value for swiotlb in case crashkernel can not find a free region under 4G.
> > So this default value can not work for every use cases, people need
> > manually use crashkernel=,low and crashkernel=,high in case
> > crashkernel=X does not work.
>
> Why would the user need to find swiotlb range? The kernel has all the
> information it requires at its finger tips in order to decide properly.
>
> The user wants a crashkernel range, the kernel tries the low range =>
> no workie, then it tries the next range => workie but needs to allocate
> swiotlb range so that DMA can happen too. Doh, then the kernel does
> allocate that too.

It is ideal if kernel can do it automatically, but I'm not sure if
kernel can predict the swiotlb reserved size automatically.

Let's add more people to seek for comments.

> Why would the user need to do anything here?!
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Pingfan Liu @ 2019-02-20  7:38 UTC
To: Dave Young
Cc: Randy Dunlap, Baoquan He, konrad.wilk, x86, kexec, Jerry Hoemann, LKML, iommu, Mike Rapoport, Borislav Petkov, Andrew Morton, Yinghai Lu, vgoyal

On Mon, Feb 18, 2019 at 9:48 AM Dave Young <dyoung@redhat.com> wrote:
>
> On 02/15/19 at 11:24am, Borislav Petkov wrote:
> > On Tue, Feb 12, 2019 at 04:48:16AM +0800, Dave Young wrote:
> > > Even we make it automatic in kernel, but we have to have some default
> > > value for swiotlb in case crashkernel can not find a free region under 4G.
> > > So this default value can not work for every use cases, people need
> > > manually use crashkernel=,low and crashkernel=,high in case
> > > crashkernel=X does not work.
> >
> > Why would the user need to find swiotlb range? The kernel has all the
> > information it requires at its finger tips in order to decide properly.
> >
> > The user wants a crashkernel range, the kernel tries the low range =>
> > no workie, then it tries the next range => workie but needs to allocate
> > swiotlb range so that DMA can happen too. Doh, then the kernel does
> > allocate that too.
>
> It is ideal if kernel can do it automatically, but I'm not sure if
> kernel can predict the swiotlb reserved size automatically.
>
Agreed, I think it is hard to decide the reserved size automatically.
We do not know the ZONE_DMA32 memory requirement at boot time. It
depends on how many DMA32 devices there are and on their dynamic
payload.

> Let's add more people to seek for comments.
>
> > Why would the user need to do anything here?!
> >
> > --
> > Regards/Gruss,
> >     Boris.
> >
> > Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Borislav Petkov @ 2019-02-20  8:32 UTC
To: Dave Young
Cc: bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk

On Mon, Feb 18, 2019 at 09:48:20AM +0800, Dave Young wrote:
> It is ideal if kernel can do it automatically, but I'm not sure if
> kernel can predict the swiotlb reserved size automatically.

Do you see how even more absurd this gets?

If the kernel cannot know the swiotlb reserved size automatically, how
is the normal user even supposed to know?!

I see swiotlb_size_or_default() so we have a sane default which we fall
back to. Now where's the problem with that?

--
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Dave Young @ 2019-02-20  9:41 UTC
To: Borislav Petkov
Cc: bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk, Joerg Roedel

On 02/20/19 at 09:32am, Borislav Petkov wrote:
> On Mon, Feb 18, 2019 at 09:48:20AM +0800, Dave Young wrote:
> > It is ideal if kernel can do it automatically, but I'm not sure if
> > kernel can predict the swiotlb reserved size automatically.
>
> Do you see how even more absurd this gets?
>
> If the kernel cannot know the swiotlb reserved size automatically, how
> is the normal user even supposed to know?!
>
> I see swiotlb_size_or_default() so we have a sane default which we fall
> back to. Now where's the problem with that?

Good question, I expect some answer from people who know more about the
background. It would be good to have some actual test results, Pingfan
is trying to do some tests.

Previously Joerg posted below patch, maybe he has some idea. Joerg?

commit 94fb9334182284e8e7e4bcb9125c25dc33af19d4
Author: Joerg Roedel <jroedel@suse.de>
Date:   Wed Jun 10 17:49:42 2015 +0200

    x86/crash: Allocate enough low memory when crashkernel=high

Thanks
Dave
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Pingfan Liu @ 2019-02-20 12:51 UTC
To: Dave Young
Cc: Borislav Petkov, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk, Joerg Roedel

On Wed, Feb 20, 2019 at 5:41 PM Dave Young <dyoung@redhat.com> wrote:
>
> On 02/20/19 at 09:32am, Borislav Petkov wrote:
> > On Mon, Feb 18, 2019 at 09:48:20AM +0800, Dave Young wrote:
> > > It is ideal if kernel can do it automatically, but I'm not sure if
> > > kernel can predict the swiotlb reserved size automatically.
> >
> > Do you see how even more absurd this gets?
> >
> > If the kernel cannot know the swiotlb reserved size automatically, how
> > is the normal user even supposed to know?!
> >
I think swiotlb is a bounce buffer; if we enlarge it, we can get better
performance. The default size should be enough for the platform to work.
But when reserving low memory for the crashkernel, things are different:

    reserved low memory = swiotlb_size_or_default() + DMA32 memory for devices

The second term on the right-hand side varies with the machine type and
the dynamic payload.

> > I see swiotlb_size_or_default() so we have a sane default which we fall
> > back to. Now where's the problem with that?
>
> Good question, I expect some answer from people who know more about the
> background. It would be good to have some actual test results, Pingfan
> is trying to do some tests.
>
I am not sure I follow the idea, and I do not think the following test
result can tell much. (We need various types of machine to get a final
result.)
I did a quick test on "HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10":
the command line "crashkernel=180M,high crashkernel=64M,low" can work
for the 2nd kernel.
Although it complained some memory shortage issue: [ 7.655591] fbcon: mgadrmfb (fb0) is primary device [ 7.655639] Console: switching to colour frame buffer device 128x48 [ 7.660609] systemd-udevd: page allocation failure: order:0, mode:0x280d4 [ 7.660611] CPU: 0 PID: 180 Comm: systemd-udevd Not tainted 3.10.0-957.el7.x86_64 #1 [ 7.660612] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 06/20/2018 [ 7.660612] Call Trace: [ 7.660621] [<ffffffff81761dc1>] dump_stack+0x19/0x1b [ 7.660625] [<ffffffff811bc830>] warn_alloc_failed+0x110/0x180 [ 7.660628] [<ffffffff8175d3ce>] __alloc_pages_slowpath+0x6b6/0x724 [ 7.660631] [<ffffffff811c0e95>] __alloc_pages_nodemask+0x405/0x420 [ 7.660633] [<ffffffff8120dcf8>] alloc_pages_current+0x98/0x110 [ 7.660638] [<ffffffffc00c8622>] ttm_pool_populate+0x3d2/0x4b0 [ttm] [ 7.660641] [<ffffffffc00bf1cd>] ttm_tt_populate+0x7d/0x90 [ttm] [ 7.660644] [<ffffffffc00c3c74>] ttm_bo_kmap+0x124/0x240 [ttm] [ 7.660648] [<ffffffff810cecbf>] ? __wake_up_sync_key+0x4f/0x60 [ 7.660650] [<ffffffffc012677e>] mga_dirty_update+0x25e/0x310 [mgag200] [ 7.660653] [<ffffffffc012685f>] mga_imageblit+0x2f/0x40 [mgag200] [ 7.660657] [<ffffffff813f97ca>] soft_cursor+0x1ba/0x260 [ 7.660659] [<ffffffff813f8f53>] bit_cursor+0x663/0x6a0 [ 7.660662] [<ffffffff81098739>] ? console_trylock+0x19/0x70 [ 7.660664] [<ffffffff813f514d>] fbcon_cursor+0x13d/0x1c0 [ 7.660665] [<ffffffff813f88f0>] ? 
bit_clear+0x120/0x120 [ 7.660668] [<ffffffff8146af2e>] hide_cursor+0x2e/0xa0 [ 7.660669] [<ffffffff8146d4e8>] redraw_screen+0x188/0x270 [ 7.660671] [<ffffffff8146e086>] do_bind_con_driver+0x316/0x340 [ 7.660672] [<ffffffff8146e5e9>] do_take_over_console+0x49/0x60 [ 7.660674] [<ffffffff813f24c3>] do_fbcon_takeover+0x63/0xd0 [ 7.660675] [<ffffffff813f808d>] fbcon_event_notify+0x61d/0x730 [ 7.660678] [<ffffffff8176fb0f>] notifier_call_chain+0x4f/0x70 [ 7.660681] [<ffffffff810c7f6d>] __blocking_notifier_call_chain+0x4d/0x70 [ 7.660683] [<ffffffff810c7fa6>] blocking_notifier_call_chain+0x16/0x20 [ 7.660684] [<ffffffff813e8b9b>] fb_notifier_call_chain+0x1b/0x20 [ 7.660686] [<ffffffff813e9e46>] register_framebuffer+0x1f6/0x340 [ 7.660690] [<ffffffffc01027e2>] __drm_fb_helper_initial_config_and_unlock+0x252/0x3e0 [drm_kms_helper] [ 7.660694] [<ffffffffc01029ae>] drm_fb_helper_initial_config+0x3e/0x50 [drm_kms_helper] [ 7.660697] [<ffffffffc01269d3>] mgag200_fbdev_init+0xe3/0x100 [mgag200] [ 7.660699] [<ffffffffc01254f4>] mgag200_modeset_init+0x154/0x1d0 [mgag200] [ 7.660701] [<ffffffffc012157d>] mgag200_driver_load+0x41d/0x5b0 [mgag200] [ 7.660708] [<ffffffffc005ba4f>] drm_dev_register+0x15f/0x1f0 [drm] [ 7.660711] [<ffffffff813c3518>] ? pci_enable_device_flags+0xe8/0x140 [ 7.660718] [<ffffffffc005d0da>] drm_get_pci_dev+0x8a/0x1a0 [drm] [ 7.660720] [<ffffffffc012626b>] mga_pci_probe+0x9b/0xc0 [mgag200] [ 7.660722] [<ffffffff813c4aca>] local_pci_probe+0x4a/0xb0 [ 7.660723] [<ffffffff813c6209>] pci_device_probe+0x109/0x160 [ 7.660726] [<ffffffff814a8285>] driver_probe_device+0xc5/0x3e0 [ 7.660727] [<ffffffff814a8683>] __driver_attach+0x93/0xa0 [ 7.660728] [<ffffffff814a85f0>] ? 
__device_attach+0x50/0x50 [ 7.660730] [<ffffffff814a5e25>] bus_for_each_dev+0x75/0xc0 [ 7.660731] [<ffffffff814a7bfe>] driver_attach+0x1e/0x20 [ 7.660733] [<ffffffff814a76a0>] bus_add_driver+0x200/0x2d0 [ 7.660734] [<ffffffff814a8d14>] driver_register+0x64/0xf0 [ 7.660735] [<ffffffff813c5a45>] __pci_register_driver+0xa5/0xc0 [ 7.660737] [<ffffffffc012d000>] ? 0xffffffffc012cfff [ 7.660739] [<ffffffffc012d039>] mgag200_init+0x39/0x1000 [mgag200] [ 7.660742] [<ffffffff8100210a>] do_one_initcall+0xba/0x240 [ 7.660745] [<ffffffff81118f8c>] load_module+0x272c/0x2bc0 [ 7.660748] [<ffffffff813a3030>] ? ddebug_proc_write+0x100/0x100 [ 7.660750] [<ffffffff8111950f>] SyS_init_module+0xef/0x140 [ 7.660752] [<ffffffff81774ddb>] system_call_fastpath+0x22/0x27 [ 7.660753] Mem-Info: [ 7.660756] active_anon:3364 inactive_anon:6661 isolated_anon:0 [ 7.660756] active_file:0 inactive_file:0 isolated_file:0 [ 7.660756] unevictable:0 dirty:0 writeback:0 unstable:0 [ 7.660756] slab_reclaimable:1492 slab_unreclaimable:3116 [ 7.660756] mapped:1223 shmem:8449 pagetables:179 bounce:0 [ 7.660756] free:20626 free_pcp:0 free_cma:0 [ 7.660761] Node 0 DMA free:0kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:564kB managed:448kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
yes [ 7.660762] lowmem_reserve[]: 0 0 152 152 [ 7.660766] Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:65536kB managed:0kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [ 7.660767] lowmem_reserve[]: 0 0 152 152 [ 7.660771] Node 0 Normal free:82504kB min:1572kB low:1964kB high:2356kB active_anon:13456kB inactive_anon:26644kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:183740kB managed:158716kB mlocked:0kB dirty:0kB writeback:0kB mapped:4892kB shmem:33796kB slab_reclaimable:5968kB slab_unreclaimable:12464kB kernel_stack:784kB pagetables:716kB unstable:0kB bounce:0kB free_pcp:[ 8.722693] Microsemi PQI Driver (v1.1.4-115) > Previously Joerg posted below patch, maybe he has some idea. Joerg? > > commit 94fb9334182284e8e7e4bcb9125c25dc33af19d4 > Author: Joerg Roedel <jroedel@suse.de> > Date: Wed Jun 10 17:49:42 2015 +0200 > > x86/crash: Allocate enough low memory when crashkernel=high > > Thanks > Dave ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Borislav Petkov @ 2019-02-21 17:13 UTC
To: Dave Young
Cc: bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk, Joerg Roedel

On Wed, Feb 20, 2019 at 05:41:46PM +0800, Dave Young wrote:
> Previously Joerg posted below patch, maybe he has some idea. Joerg?

Isn't it clear from the commit message?

--
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Dave Young @ 2019-02-22  2:11 UTC
To: Borislav Petkov
Cc: bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk, Joerg Roedel

On 02/21/19 at 06:13pm, Borislav Petkov wrote:
> On Wed, Feb 20, 2019 at 05:41:46PM +0800, Dave Young wrote:
> > Previously Joerg posted below patch, maybe he has some idea. Joerg?
>
> Isn't it clear from the commit message?

Then, does it answer your question?

256M is set as a default value in the patch, but it is not a value
predicted to satisfy all use cases. From the description it is also
possible that some people run out of the 256M, so the ,low and ,high
format still needs to exist even if we make crashkernel=X do the
allocation automatically in high when it fails in the low area.

crashkernel=X: allocate in low first, if not possible, then allocate in
high

In case people have a lot of devices need more swiotlb, then he manually
set the ,high with ,low together.

What's your suggestion then? Remove ,low and ,high and increase the
default 256M in case we get a failure bug report?
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Joerg Roedel @ 2019-02-22  8:42 UTC
To: Dave Young
Cc: Borislav Petkov, bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk

On Fri, Feb 22, 2019 at 10:11:01AM +0800, Dave Young wrote:
> In case people have a lot of devices need more swiotlb, then he manually
> set the ,high with ,low together.

The option to specify the high and low values for the crashkernel are
important for certain machines. The point is that swiotlb already
allocates 64MB of low memory by default. But that memory is only used
for 32bit DMA-mask devices that want to DMA into high memory. There are
drivers just allocating GFP_DMA32 memory, which also ends up in the low
region (but not swiotlb), that is why the previous default of 72MB low
memory was not enough, it only left 8MB of GFP_DMA32 memory.

The current default of 256MB was found by experiments on a bigger
number of machines, to create a reasonable default that is at least
likely to be sufficient for an average machine.

There is no way today for the kernel to find an optimum value for the
amount of low memory required to successfully create a crash dump. It
depends on the amount of devices in the system and how the drivers for
them are written. The drivers have no way to report back their
requirements, and even if they had, at the time the allocation happens
no driver is loaded yet.

So it is up to the system administrator to find workable values for the
high and low memory requirements, even using experiments as a last
resort.

Regards,

	Joerg
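The low-memory arithmetic in the message above can be written out as a small sketch. Sizes are in MiB; the macro and function names are hypothetical and are not kernel identifiers. The 64MB figure is the default swiotlb size quoted in the message.

```c
#include <assert.h>

/* Default swiotlb bounce-buffer size in MiB, as quoted in the message. */
#define SWIOTLB_DEFAULT_MB 64u

/*
 * Hypothetical helper: low memory left for GFP_DMA32 allocations once
 * swiotlb has taken its default share of the low reservation.
 */
static unsigned int dma32_left_mb(unsigned int low_reserved_mb)
{
	/* The model only makes sense if swiotlb fits in the reservation. */
	assert(low_reserved_mb >= SWIOTLB_DEFAULT_MB);
	return low_reserved_mb - SWIOTLB_DEFAULT_MB;
}
```

With the old 72MB default, `dma32_left_mb(72)` gives the 8MB mentioned above; with the 256MB default the same subtraction leaves 192MB for drivers that allocate GFP_DMA32 memory directly.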
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Borislav Petkov @ 2019-02-22 13:00 UTC
To: Joerg Roedel, Dave Young
Cc: bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk

On Fri, Feb 22, 2019 at 09:42:41AM +0100, Joerg Roedel wrote:
> The current default of 256MB was found by experiments on a bigger
> number of machines, to create a reasonable default that is at least
> likely to be sufficient for an average machine.

Exactly, and this is what makes sense.

The code should try the requested reservation and if it fails, it should
try high allocation with default swiotlb size because we need to reserve
*some* range.

If that reservation succeeds, we should say something along the lines of

"... requested range failed, reserved <X> range instead."

And then in Documentation/admin-guide/kernel-parameters.txt above the
crashkernel= explanations, the allocation strategy of best effort should
be explained in short. That the kernel will try to allocate high if the
requested allocation didn't succeed and that the user can tweak the
allocation with the below options.

Bottom line is: the kernel should assist the user and try harder to
allocate *some* range for a crash kernel when there's no detailed
specification what that range should be.

*If* the user adds ,low, high, then the kernel should try only that
specified range because the assumption is that the user knows what she's
doing.

But if the user simply wants a range for a crash kernel without stating
where that range should be in particular and its placement is a don't
care - as long as there is a range - then the kernel should simply try
high, etc.

Makes sense?

--
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
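The best-effort policy proposed in the message above can be sketched as a small decision function. This is only an illustration of the proposal under discussion, not code from the patch or from the kernel: the enum names and `pick_crash_region()` are hypothetical, and the two booleans stand in for "a free region of the requested size exists below 4G" and "a free region exists above 4G".

```c
#include <assert.h>
#include <stdbool.h>

enum crash_request {
	REQ_SIZE_ONLY,      /* crashkernel=X: placement is a don't-care */
	REQ_EXPLICIT_HIGH,  /* crashkernel=X,high (optionally with ,low) */
};

enum crash_outcome {
	OUT_LOW,            /* reserved in low memory as requested */
	OUT_HIGH_FALLBACK,  /* low failed; reserved high plus default low memory */
	OUT_HIGH,           /* explicit high request honoured */
	OUT_FAILED,         /* nothing could be reserved */
};

/* Hypothetical model of the proposed allocation strategy. */
static enum crash_outcome pick_crash_region(enum crash_request req,
					    bool low_fits, bool high_fits)
{
	/* An explicit request is tried as given, with no fallback. */
	if (req == REQ_EXPLICIT_HIGH)
		return high_fits ? OUT_HIGH : OUT_FAILED;

	/* Only two request kinds are modelled in this sketch. */
	assert(req == REQ_SIZE_ONLY);

	if (low_fits)
		return OUT_LOW;

	/* Best effort: fall back to high; the caller would log the change. */
	return high_fits ? OUT_HIGH_FALLBACK : OUT_FAILED;
}
```

Only `OUT_HIGH_FALLBACK` would trigger a message along the lines of "requested range failed, reserved <X> range instead", since that is the one case where the kernel picked a placement the user did not ask for.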
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Pingfan Liu @ 2019-02-24 13:25 UTC
To: Borislav Petkov
Cc: Joerg Roedel, Dave Young, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk

On Fri, Feb 22, 2019 at 9:00 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Feb 22, 2019 at 09:42:41AM +0100, Joerg Roedel wrote:
> > The current default of 256MB was found by experiments on a bigger
> > number of machines, to create a reasonable default that is at least
> > likely to be sufficient for an average machine.
>
> Exactly, and this is what makes sense.
>
> The code should try the requested reservation and if it fails, it should
> try high allocation with default swiotlb size because we need to reserve
> *some* range.
>
> If that reservation succeeds, we should say something along the lines of
>
> "... requested range failed, reserved <X> range instead."
>
Maybe I misunderstood you, but does "requested range failed" mean that
user specify the range? If yes, then it should be the duty of user as
you said later, not the duty of kernel.

> And then in Documentation/admin-guide/kernel-parameters.txt above the
> crashkernel= explanations, the allocation strategy of best effort should
> be explained in short. That the kernel will try to allocate high if the
> requested allocation didn't succeed and that the user can tweak the
> allocation with the below options.
>
Yes, it should be improved.

> Bottom line is: the kernel should assist the user and try harder to
> allocate *some* range for a crash kernel when there's no detailed
> specification what that range should be.
>
> *If* the user adds ,low, high, then the kernel should try only that
> specified range because the assumption is that the user knows what she's
> doing.
>
> But if the user simply wants a range for a crash kernel without stating
> where that range should be in particular and its placement is a don't
> care - as long as there is a range - then the kernel should simply try
> high, etc.
>
We do not know the memory layout of a system, maybe a system with
memory less than 4GB. So it is better to try all the range of system
memory.

Thanks,
Pingfan

> Makes sense?
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Dave Young @ 2019-02-25  1:53 UTC
To: Pingfan Liu
Cc: Borislav Petkov, Joerg Roedel, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk

On 02/24/19 at 09:25pm, Pingfan Liu wrote:
> On Fri, Feb 22, 2019 at 9:00 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Fri, Feb 22, 2019 at 09:42:41AM +0100, Joerg Roedel wrote:
> > > The current default of 256MB was found by experiments on a bigger
> > > number of machines, to create a reasonable default that is at least
> > > likely to be sufficient for an average machine.
> >
> > Exactly, and this is what makes sense.
> >
> > The code should try the requested reservation and if it fails, it should
> > try high allocation with default swiotlb size because we need to reserve
> > *some* range.
> >
> > If that reservation succeeds, we should say something along the lines of
> >
> > "... requested range failed, reserved <X> range instead."
> >
> Maybe I misunderstood you, but does "requested range failed" mean that
> user specify the range? If yes, then it should be the duty of user as
> you said later, not the duty of kernel.

If you go with the changes in your current patch, it needs to say
something like:
"crashkernel: can not find free memory under 4G, reserve XM@.. instead"

It also needs to print the reserved low memory area in case ,high is
being used.

But for 896M -> 4G, the 896M failure does not need to be shown in
dmesg; it is just in-kernel logic.

Thanks
Dave
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Borislav Petkov @ 2019-02-25  9:39 UTC
To: Pingfan Liu
Cc: Joerg Roedel, Dave Young, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk

On Sun, Feb 24, 2019 at 09:25:18PM +0800, Pingfan Liu wrote:
> Maybe I misunderstood you, but does "requested range failed" mean that
> user specify the range? If yes, then it should be the duty of user as
> you said later, not the duty of kernel.

No, it should say that it selected a different range only when the user
didn't specify it. Which would mean that the user didn't care about the
range - she/he only wanted to have *any* crashkernel range reserved.

I.e., crashkernel=X invocation.

> We do not know the memory layout of a system, maybe a system with
> memory less than 4GB. So it is better to try all the range of system
> memory.

Ok. If 4G fails, you set high and then try again.

--
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Joerg Roedel @ 2019-02-25 11:00 UTC
To: Borislav Petkov
Cc: Dave Young, bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk

On Fri, Feb 22, 2019 at 02:00:26PM +0100, Borislav Petkov wrote:
> On Fri, Feb 22, 2019 at 09:42:41AM +0100, Joerg Roedel wrote:
> > The current default of 256MB was found by experiments on a bigger
> > number of machines, to create a reasonable default that is at least
> > likely to be sufficient for an average machine.
>
> Exactly, and this is what makes sense.
>
> The code should try the requested reservation and if it fails, it should
> try high allocation with default swiotlb size because we need to reserve
> *some* range.

Right, makes sense. While at it, maybe it is time to move the default
allocation policy to 'high' again. The change was reverted six years ago
because it broke old kexec tools, but those are probably out-of-service
now. I think this change would make the whole crashdump allocation
process less fragile.

Regards,

	Joerg
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Dave Young @ 2019-02-25 11:12 UTC
To: Joerg Roedel
Cc: Borislav Petkov, bhe, Jerry Hoemann, x86, Randy Dunlap, kexec, linux-kernel, Pingfan Liu, Mike Rapoport, Andrew Morton, yinghai, vgoyal, iommu, konrad.wilk

On 02/25/19 at 12:00pm, Joerg Roedel wrote:
> On Fri, Feb 22, 2019 at 02:00:26PM +0100, Borislav Petkov wrote:
> > On Fri, Feb 22, 2019 at 09:42:41AM +0100, Joerg Roedel wrote:
> > > The current default of 256MB was found by experiments on a bigger
> > > number of machines, to create a reasonable default that is at least
> > > likely to be sufficient for an average machine.
> >
> > Exactly, and this is what makes sense.
> >
> > The code should try the requested reservation and if it fails, it should
> > try high allocation with default swiotlb size because we need to reserve
> > *some* range.
>
> Right, makes sense. While at it, maybe it is time to move the default
> allocation policy to 'high' again. The change was reverted six years ago
> because it broke old kexec tools, but those are probably out-of-service
> now. I think this change would make the whole crashdump allocation
> process less fragile.

One concern about this is that, in the average case, one does not need
so much memory for kdump. For example, in RHEL we use crashkernel=auto
to automatically reserve kdump kernel memory, and for x86 the reserved
size is currently as below:
1G-64G:160M,64G-1T:256M,1T-:512M

That means for a machine with less than 64G memory we only allocate
160M, and it works for most machines in our lab. If we move to high as
default, it will allocate 160M high + 256M low. It is too much for
people who are fine with the default 160M, especially for virtual
machines with less memory (but > 4G).

To make the process less fragile maybe we can remove the 896M limitation
and only try <4G then go to high.

Thanks
Dave
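To make the range table above concrete, here is a minimal sketch of how a `start-end:size` list such as `1G-64G:160M,64G-1T:256M,1T-:512M` maps system RAM to a reservation size. It is not the kernel's parser: `parse_size()` and `crash_size_for()` are hypothetical helpers, and the sketch assumes the range start is inclusive, the range end is exclusive, and an omitted end means "no upper bound".

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Parse "<number>[K|M|G|T]" and advance *s past it. */
static unsigned long long parse_size(const char **s)
{
	char *end;
	unsigned long long v = strtoull(*s, &end, 10);

	switch (*end) {
	case 'T': v <<= 10; /* fall through */
	case 'G': v <<= 10; /* fall through */
	case 'M': v <<= 10; /* fall through */
	case 'K': v <<= 10; end++; break;
	default: break;
	}
	*s = end;
	return v;
}

/*
 * Return the size of the first range containing system_ram,
 * or 0 if no range matches or the spec is malformed.
 */
static unsigned long long crash_size_for(const char *spec,
					 unsigned long long system_ram)
{
	assert(spec != NULL);

	while (*spec) {
		unsigned long long start = parse_size(&spec);
		unsigned long long end = ULLONG_MAX; /* open-ended by default */
		unsigned long long size;

		if (*spec != '-')
			return 0;
		spec++;
		if (*spec != ':')
			end = parse_size(&spec);
		if (*spec != ':')
			return 0;
		spec++;
		size = parse_size(&spec);

		if (system_ram >= start && system_ram < end)
			return size;
		if (*spec != ',')
			break;
		spec++;
	}
	return 0;
}
```

Under these assumptions a machine with 8G of RAM falls in the first range and gets 160M. If the default policy moved to high, the same machine would reserve 160M high plus the 256M low default, 416M in total, which is the overhead the message above objects to.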
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
From: Borislav Petkov @ 2019-02-25 11:30 UTC
To: Dave Young
Cc: Joerg Roedel, bhe, konrad.wilk, x86, kexec, Jerry Hoemann, Pingfan Liu, linux-kernel, iommu, Mike Rapoport, Randy Dunlap, Andrew Morton, yinghai, vgoyal

On Mon, Feb 25, 2019 at 07:12:16PM +0800, Dave Young wrote:
> If we move to high as default, it will allocate 160M high + 256M low. It

We won't move to high by default - we will *fall* back to high if the
default allocation fails.

> To make the process less fragile maybe we can remove the 896M limitation
> and only try <4G then go to high.

Sure, the more robust for the user, the better.

--
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
  2019-02-25 11:30 ` Borislav Petkov
@ 2019-03-01 3:04 ` Pingfan Liu
  2019-03-01 3:19 ` Pingfan Liu
  0 siblings, 1 reply; 18+ messages in thread
From: Pingfan Liu @ 2019-03-01 3:04 UTC
To: Borislav Petkov
Cc: Dave Young, Joerg Roedel, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk

Hi Borislav,

Do you think the following patch is good at present?

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 81f9d23..9213073 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -460,7 +460,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_ADDR_LOW_MAX (512 << 20)
 # define CRASH_ADDR_HIGH_MAX (512 << 20)
 #else
-# define CRASH_ADDR_LOW_MAX (896UL << 20)
+# define CRASH_ADDR_LOW_MAX (1 << 32)
 # define CRASH_ADDR_HIGH_MAX MAXMEM
 #endif

For documentation, I will send another patch to improve the description.

Thanks,
Pingfan

On Mon, Feb 25, 2019 at 7:30 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Feb 25, 2019 at 07:12:16PM +0800, Dave Young wrote:
> > If we move to high as default, it will allocate 160M high + 256M low. It
>
> We won't move to high by default - we will *fall* back to high if the
> default allocation fails.
>
> > To make the process less fragile maybe we can remove the 896M limitation
> > and only try <4G then go to high.
>
> Sure, the more robust for the user, the better.
>
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
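A note on the constant in the patch above: in C the literal 1 has type int, so (1 << 32) shifts by at least the width of the type on x86, which the C standard leaves undefined; compilers typically warn about it. A sketch of a well-defined 4G limit follows. The macro names LIMIT_896M and LIMIT_4G and the helper fits_below are hypothetical, chosen for illustration, and are not the kernel's.

```c
#include <stdint.h>

/* Hypothetical names for illustration; these are not the kernel's macros. */
#define LIMIT_896M (896ULL << 20)  /* old low limit: 896 MiB */
#define LIMIT_4G   (1ULL << 32)    /* 64-bit operand, so the shift is well defined */

/* Return nonzero if [base, base + size) lies entirely below the limit. */
int fits_below(uint64_t base, uint64_t size, uint64_t limit)
{
    /* Written as a subtraction so that base + size cannot overflow. */
    return size <= limit && base <= limit - size;
}
```

With these definitions a 160M region starting at 3G fits below LIMIT_4G but not below LIMIT_896M, which is the practical effect of raising the low limit.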
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
  2019-03-01 3:04 ` Pingfan Liu
@ 2019-03-01 3:19 ` Pingfan Liu
  2019-03-22 8:22 ` Dave Young
  0 siblings, 1 reply; 18+ messages in thread
From: Pingfan Liu @ 2019-03-01 3:19 UTC
To: Borislav Petkov
Cc: Dave Young, Joerg Roedel, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk

On Fri, Mar 1, 2019 at 11:04 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> Hi Borislav,
>
> Do you think the following patch is good at present?
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 81f9d23..9213073 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -460,7 +460,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>  # define CRASH_ADDR_LOW_MAX (512 << 20)
>  # define CRASH_ADDR_HIGH_MAX (512 << 20)
>  #else
> -# define CRASH_ADDR_LOW_MAX (896UL << 20)
> +# define CRASH_ADDR_LOW_MAX (1 << 32)
>  # define CRASH_ADDR_HIGH_MAX MAXMEM
>  #endif
>

Or the patch looks like:

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 3d872a5..ed0def5 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -459,7 +459,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_ADDR_LOW_MAX (512 << 20)
 # define CRASH_ADDR_HIGH_MAX (512 << 20)
 #else
-# define CRASH_ADDR_LOW_MAX (896UL << 20)
+# define CRASH_ADDR_LOW_MAX (1 << 32)
 # define CRASH_ADDR_HIGH_MAX MAXMEM
 #endif

@@ -551,6 +551,15 @@ static void __init reserve_crashkernel(void)
                                 high ? CRASH_ADDR_HIGH_MAX
                                      : CRASH_ADDR_LOW_MAX,
                                 crash_size, CRASH_ALIGN);
+#ifdef CONFIG_X86_64
+       /*
+        * crashkernel=X reserve below 4G fails? Try MAXMEM
+        */
+       if (!high && !crash_base)
+               crash_base = memblock_find_in_range(CRASH_ALIGN,
+                                       CRASH_ADDR_HIGH_MAX,
+                                       crash_size, CRASH_ALIGN);
+#endif

which tries 0-4G, then falls back to above 4G.

> For documentation, I will send another patch to improve the description.
>
> Thanks,
> Pingfan
>
> On Mon, Feb 25, 2019 at 7:30 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Feb 25, 2019 at 07:12:16PM +0800, Dave Young wrote:
> > > If we move to high as default, it will allocate 160M high + 256M low. It
> >
> > We won't move to high by default - we will *fall* back to high if the
> > default allocation fails.
> >
> > > To make the process less fragile maybe we can remove the 896M limitation
> > > and only try <4G then go to high.
> >
> > Sure, the more robust for the user, the better.
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > Good mailing practices for 400: avoid top-posting and trim the reply.
* Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr
  2019-03-01 3:19 ` Pingfan Liu
@ 2019-03-22 8:22 ` Dave Young
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Young @ 2019-03-22 8:22 UTC
To: Pingfan Liu
Cc: Borislav Petkov, Joerg Roedel, Baoquan He, Jerry Hoemann, x86, Randy Dunlap, kexec, LKML, Mike Rapoport, Andrew Morton, Yinghai Lu, vgoyal, iommu, konrad.wilk

Hi Pingfan,

Thanks for the effort,

On 03/01/19 at 11:19am, Pingfan Liu wrote:
> On Fri, Mar 1, 2019 at 11:04 AM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > Hi Borislav,
> >
> > Do you think the following patch is good at present?
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 81f9d23..9213073 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -460,7 +460,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
> >  # define CRASH_ADDR_LOW_MAX (512 << 20)
> >  # define CRASH_ADDR_HIGH_MAX (512 << 20)
> >  #else
> > -# define CRASH_ADDR_LOW_MAX (896UL << 20)
> > +# define CRASH_ADDR_LOW_MAX (1 << 32)
> >  # define CRASH_ADDR_HIGH_MAX MAXMEM
> >  #endif
> >
> Or the patch looks like:
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 3d872a5..ed0def5 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -459,7 +459,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>  # define CRASH_ADDR_LOW_MAX (512 << 20)
>  # define CRASH_ADDR_HIGH_MAX (512 << 20)
>  #else
> -# define CRASH_ADDR_LOW_MAX (896UL << 20)
> +# define CRASH_ADDR_LOW_MAX (1 << 32)
>  # define CRASH_ADDR_HIGH_MAX MAXMEM
>  #endif
>
> @@ -551,6 +551,15 @@ static void __init reserve_crashkernel(void)
>                                  high ? CRASH_ADDR_HIGH_MAX
>                                       : CRASH_ADDR_LOW_MAX,
>                                  crash_size, CRASH_ALIGN);
> +#ifdef CONFIG_X86_64
> +       /*
> +        * crashkernel=X reserve below 4G fails? Try MAXMEM
> +        */
> +       if (!high && !crash_base)
> +               crash_base = memblock_find_in_range(CRASH_ALIGN,
> +                                       CRASH_ADDR_HIGH_MAX,
> +                                       crash_size, CRASH_ALIGN);
> +#endif
>
> which tries 0-4G, then falls back to above 4G.

This way looks good to me. I will do some testing with old kexec-tools.
Once testing is done I can take this up again and repost later with a
documentation update. I will also split it into two patches: one to drop
the old limitation, another for the fallback.

Thanks
Dave
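The "try below 4G, then fall back to high" policy discussed in this message can be modelled in userspace as a sketch. Everything here is an assumption made for illustration: toy_find_in_range is a simple first-fit stand-in that only borrows the (start, end, size) argument order of the memblock_find_in_range call in the patch and does not model the real allocator's search order; TOY_ALIGN, TOY_LOW_MAX and TOY_HIGH_MAX are illustrative values, not the kernel's CRASH_ALIGN, CRASH_ADDR_LOW_MAX or MAXMEM.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ALIGN    (16ULL << 20)  /* assumed alignment, illustrative only */
#define TOY_LOW_MAX  (1ULL << 32)   /* 4G */
#define TOY_HIGH_MAX (1ULL << 46)   /* stand-in for MAXMEM, illustrative only */

struct free_range { uint64_t base, size; };  /* one free physical range */

/* First aligned fit of `size` bytes inside [start, end), or 0 if none. */
static uint64_t toy_find_in_range(const struct free_range *fr, size_t n,
                                  uint64_t start, uint64_t end, uint64_t size)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t lo = fr[i].base > start ? fr[i].base : start;
        uint64_t hi = fr[i].base + fr[i].size;

        if (hi > end)
            hi = end;
        lo = (lo + TOY_ALIGN - 1) & ~(TOY_ALIGN - 1);  /* round up */
        if (lo < hi && hi - lo >= size)
            return lo;
    }
    return 0;
}

/* Try below 4G first; if that fails, retry with the high limit. */
uint64_t toy_crash_base(const struct free_range *fr, size_t n, uint64_t size)
{
    uint64_t base = toy_find_in_range(fr, n, TOY_ALIGN, TOY_LOW_MAX, size);

    if (!base)
        base = toy_find_in_range(fr, n, TOY_ALIGN, TOY_HIGH_MAX, size);
    return base;
}
```

In this model a machine whose only large free range starts at 4G gets its reservation there instead of failing. As noted earlier in the thread, a reservation that lands above 4G also needs a low swiotlb range so that DMA keeps working; the sketch leaves that step out.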
Thread overview: 18+ messages
[not found] <20190125140823.GC27998@zn.tnic>
[not found] ` <20190131075907.GB19091@dhcp-128-65.nay.redhat.com>
[not found] ` <20190131105732.GC6749@zn.tnic>
[not found] ` <20190131222732.GA946@anatevka>
[not found] ` <20190131234740.GO6749@zn.tnic>
[not found] ` <20190204223016.GB11986@anatevka>
[not found] ` <20190205081552.GG21801@zn.tnic>
[not found] ` <20190206120804.GC10062@dhcp-128-65.nay.redhat.com>
[not found] ` <20190211204816.GB21473@dhcp-128-65.nay.redhat.com>
[not found] ` <20190215102458.GD10433@zn.tnic>
[not found] ` <20190215102458.GD10433-Jj63ApZU6fQ@public.gmane.org>
2019-02-18 1:48 ` [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr Dave Young
[not found] ` <20190218014820.GA10711-0VdLhd/A9Pl+NNSt+8eSiB/sF2h8X+2i0E9HWUfgJXw@public.gmane.org>
2019-02-20 7:38 ` Pingfan Liu
2019-02-20 8:32 ` Borislav Petkov
2019-02-20 9:41 ` Dave Young
2019-02-20 12:51 ` Pingfan Liu
2019-02-21 17:13 ` Borislav Petkov
2019-02-22 2:11 ` Dave Young
2019-02-22 8:42 ` Joerg Roedel
2019-02-22 13:00 ` Borislav Petkov
2019-02-24 13:25 ` Pingfan Liu
2019-02-25 1:53 ` Dave Young
2019-02-25 9:39 ` Borislav Petkov
2019-02-25 11:00 ` Joerg Roedel
2019-02-25 11:12 ` Dave Young
[not found] ` <20190225111216.GA9276-0VdLhd/A9Pl+NNSt+8eSiB/sF2h8X+2i0E9HWUfgJXw@public.gmane.org>
2019-02-25 11:30 ` Borislav Petkov
2019-03-01 3:04 ` Pingfan Liu
2019-03-01 3:19 ` Pingfan Liu
2019-03-22 8:22 ` Dave Young