linux-mm.kvack.org archive mirror
From: Eric DeVolder <eric.devolder@oracle.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Baoquan He <bhe@redhat.com>,
	david@redhat.com, Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	kexec@lists.infradead.org, ebiederm@xmission.com,
	dyoung@redhat.com, vgoyal@redhat.com, tglx@linutronix.de,
	mingo@redhat.com, dave.hansen@linux.intel.com, hpa@zytor.com,
	nramas@linux.microsoft.com, thomas.lendacky@amd.com,
	robh@kernel.org, efault@gmx.de, rppt@kernel.org,
	sourabhjain@linux.ibm.com, linux-mm@kvack.org
Subject: Re: [PATCH v12 7/7] x86/crash: Add x86 crash hotplug support
Date: Fri, 28 Oct 2022 10:29:45 -0500
Message-ID: <d91f8728-6a63-415d-577c-bd76e69ec7f6@oracle.com>
In-Reply-To: <Y1uspLb7fLdtnQq+@zn.tnic>



On 10/28/22 05:19, Borislav Petkov wrote:
> On Thu, Oct 27, 2022 at 02:24:11PM -0500, Eric DeVolder wrote:
>> Be aware, in reality, that if the system was fully populated, it would not
>> actually consume all 8192 phdrs. Rather /proc/iomem would essentially show a
>> large contiguous address space which would require just a single phdr.
> 
> Then that from below:
> 
> 	pnum += CONFIG_CRASH_MAX_MEMORY_RANGES;
> 
> which then would end up allocating 8192 would be a total waste.
> 
> So why don't you make that number dynamic then?
> 
> You start with something sensible:
> 
> 	total_num_pheaders = num_online_cpus() + "some number of regions" + "some few others"
> 
> I.e., a number which is a good compromise on the majority of machines.
> 
> Then, on hotplug events you count how many new regions are coming in
> and when you reach the total_num_pheaders number, you double it (or some
> other increase strategy), reallocate the ELF header buffers etc needed
> for kdump and you're good.
> 
> This way, you don't waste memory unnecessarily on the majority of
> systems and those who need more, get to allocate more.

This patch series sizes and allocates the memory buffer/segment for the elfcorehdr once, at kdump 
load time.

Dynamically resizing the elfcorehdr memory buffer/segment causes the following ripple effects:

  - Permitting resizing of the elfcorehdr requires a means to "allocate" a buffer of a new size from 
within the crash kernel reserved area. There is no allocator today; currently it is a kind of 
one-pass placement process that happens at load time. The critical side effect of allocating a new 
elfcorehdr memory buffer/segment is that it creates a new address for the elfcorehdr.

  - The elfcorehdr is passed to the crash kernel via the elfcorehdr= kernel cmdline option. As such, 
a dynamic change to the size of the elfcorehdr necessarily invites a change of address of that 
buffer, and therefore a rewrite of the crash kernel cmdline to reflect the new elfcorehdr 
buffer address.

  - A change to the cmdline also invites a possible change of address of the buffer containing the 
cmdline, and thus a change to the x86 boot_params, which contains the cmdline pointer.

  - A change to the cmdline and/or boot_params, which are *not* excluded from the hash/digest, means 
that the purgatory hash/digest needs to be recomputed, and purgatory re-linked with the new 
hash/digest and replaced.
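
To illustrate the first link in that chain, here is a minimal user-space sketch, not kernel code. 
The helper names and the exact "elfcorehdr=0x<addr>" format are assumptions made for illustration. 
It only shows that once the elfcorehdr moves, the cmdline bytes differ, and with them anything that 
is hashed over the cmdline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): prepend elfcorehdr=<addr>
 * to the user-supplied crash kernel cmdline. Returns the number of
 * characters written, or -1 if the buffer is too small.
 */
static int build_crash_cmdline(char *buf, size_t len,
                               unsigned long elfcorehdr_addr,
                               const char *user_cmdline)
{
    int n;

    assert(buf != NULL && user_cmdline != NULL);
    n = snprintf(buf, len, "elfcorehdr=0x%lx %s",
                 elfcorehdr_addr, user_cmdline);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Returns 1 if moving the elfcorehdr from old_addr to new_addr changes
 * the cmdline bytes, 0 if it does not, and -1 on formatting error.
 */
static int cmdline_changes_on_move(unsigned long old_addr,
                                   unsigned long new_addr,
                                   const char *user_cmdline)
{
    char before[256], after[256];

    if (build_crash_cmdline(before, sizeof(before), old_addr, user_cmdline) < 0 ||
        build_crash_cmdline(after, sizeof(after), new_addr, user_cmdline) < 0)
        return -1;
    return strcmp(before, after) != 0;
}
```

Any digest that covers the cmdline buffer would therefore have to be recomputed after such a move, 
which is the step that cannot be redone for kexec_load().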

A fair amount of work, but I have had this working in the past, around the v1 patch series 
timeframe. However, it only worked for the kexec_file_load() syscall as all the needed pieces of 
information were available; but for kexec_load(), it isn't possible to relink purgatory as by that 
point purgatory is but a user-space binary blob.

It was feedback on v1/v2 that pointed out that by excluding the elfcorehdr from the hash/digest, 
the "change of address" problem with the elfcorehdr buffer/segment goes away, and, in turn, negates 
the need to: introduce an allocator for the crash kernel reserved space, rewrite the crash kernel 
cmdline with a new elfcorehdr, update boot_params with a new cmdline, and re-link and replace 
purgatory with the updated digest. It also enables this hotplug effort to support the kexec_load() 
syscall as well.
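
The effect of the exclusion can be sketched in a few lines of user-space C. This is an illustration 
under stated assumptions, not the purgatory code: the struct, the skip_digest flag and the function 
names are hypothetical, and 64-bit FNV-1a stands in for the real digest algorithm only to keep the 
example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor; skip_digest marks the elfcorehdr. */
struct segment {
    const unsigned char *buf;
    size_t len;
    bool skip_digest;
};

/* 64-bit FNV-1a update step, a stand-in for the real digest algorithm. */
static uint64_t fnv1a_update(uint64_t h, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;      /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Digest every segment except those flagged as excluded. */
static uint64_t digest_segments(const struct segment *segs, size_t nr)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */

    assert(segs != NULL || nr == 0);
    for (size_t i = 0; i < nr; i++) {
        if (segs[i].skip_digest)
            continue;                     /* e.g. the elfcorehdr */
        h = fnv1a_update(h, segs[i].buf, segs[i].len);
    }
    return h;
}
```

With the elfcorehdr flagged as excluded, rewriting its contents in place on a hotplug event leaves 
the digest untouched, while a change to any other segment still alters it.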

So it is with this in mind that I suggest we stay with the statically sized elfcorehdr buffer.

If that can be agreed upon, then it is "just a matter" of picking a useful elfcorehdr size. 
Currently that size is derived from NR_DEFAULT_CPUS and CRASH_MAX_MEMORY_RANGES. So, there is 
still the CRASH_MAX_MEMORY_RANGES knob to help dial in a size, should there be some issue with the 
default value/size.
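
As a rough worked example of that derivation, the following user-space sketch computes a buffer 
size from a CPU count and a memory range count using the ELF64 structures from <elf.h>. The 
function name, the two extra program headers, and the sample CPU count of 64 are assumptions for 
illustration; only the 8192 figure comes from this discussion.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* The arithmetic quoted below relies on the standard ELF64 layout. */
static_assert(sizeof(Elf64_Ehdr) == 64, "unexpected Elf64_Ehdr size");
static_assert(sizeof(Elf64_Phdr) == 56, "unexpected Elf64_Phdr size");

/*
 * Hypothetical sizing helper: one ELF header plus one program header
 * per possible CPU and per possible memory range.
 */
static size_t elfcorehdr_size(size_t nr_cpus, size_t max_mem_ranges)
{
    /* Assumption: two extra phdrs for entries that are always present. */
    size_t pnum = 2 + nr_cpus + max_mem_ranges;

    return sizeof(Elf64_Ehdr) + pnum * sizeof(Elf64_Phdr);
}
```

Under these assumptions, elfcorehdr_size(64, 8192) comes to 462512 bytes, or about 452 KiB, nearly 
all of it attributable to the memory range count.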

Or, if there is a desire to drop computing the size from NR_DEFAULT_CPUS and CRASH_MAX_MEMORY_RANGES 
and go with CRASH_HOTPLUG_ELFCOREHDR_SZ, which simply specifies the buffer size, then I'm also 
good with that.

I still owe a much better explanation of how to size the elfcorehdr. I can use the comments and 
ideas from the discussion to provide the necessary insight when choosing this value, whether that be 
CRASH_MAX_MEMORY_RANGES or CRASH_HOTPLUG_ELFCOREHDR_SZ.


> 
>> I'd prefer keeping CRASH_MAX_MEMORY_RANGES as that allows the maximum phdr
>> number value to be reflective of CPUs and/or memory; not all systems support
>> both CPU and memory hotplug. For example, I have queued up this change to
>> reflect this:
>>
>>      if (IS_ENABLED(CONFIG_HOTPLUG_CPU) || IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) {
> 
> If you're going to keep CRASH_MAX_MEMORY_RANGES, then you can test only
> that thing as it expresses the dependency on CONFIG_HOTPLUG_CPU and
> CONFIG_MEMORY_HOTPLUG already.
> 
> If you end up making the number dynamic, then you could make that a
> different Kconfig item which contains all that crash code as most of the
> people won't need it anyway.

It is my intention to correct the CRASH_MAX_MEMORY_RANGES dependencies (if we keep it) as follows:

config CRASH_MAX_MEMORY_RANGES
     depends on CRASH_DUMP && KEXEC_FILE && MEMORY_HOTPLUG

CRASH_MAX_MEMORY_RANGES should never have had HOTPLUG_CPU as a dependency; that was a cut-n-paste 
error on my part.

> 
> Hmm?
> 

Thank you for the time and thought on this topic!
eric


