From: lijiang <lijiang@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: jgross@suse.com, Thomas.Lendacky@amd.com, bhe@redhat.com,
x86@kernel.org, kexec@lists.infradead.org,
linux-kernel@vger.kernel.org, dhowells@redhat.com,
mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
tglx@linutronix.de, Dave Young <dyoung@redhat.com>,
vgoyal@redhat.com
Subject: Re: [PATCH 3/3 v3] x86/kdump: clean up all the code related to the backup region
Date: Wed, 16 Oct 2019 16:40:29 +0800 [thread overview]
Message-ID: <c42dd338-6fee-9311-df3a-0a3aa8c80cde@redhat.com> (raw)
In-Reply-To: <8736fu1d8k.fsf@x220.int.ebiederm.org>
在 2019年10月15日 19:04, Eric W. Biederman 写道:
> lijiang <lijiang@redhat.com> writes:
>
>> 在 2019年10月13日 11:54, Eric W. Biederman 写道:
>>> Dave Young <dyoung@redhat.com> writes:
>>>
>>>> Hi Eric,
>>>>
>>>> On 10/12/19 at 06:26am, Eric W. Biederman wrote:
>>>>> Lianbo Jiang <lijiang@redhat.com> writes:
>>>>>
>>>>>> When the crashkernel kernel command line option is specified, the
>>>>>> low 1MiB memory will always be reserved, which makes that the memory
>>>>>> allocated later won't fall into the low 1MiB area, thereby, it's not
>>>>>> necessary to create a backup region and also no need to copy the first
>>>>>> 640k content to a backup region.
>>>>>>
>>>>>> Currently, the code related to the backup region can be safely removed,
>>>>>> so lets clean up.
>>>>>>
>>>>>> Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
>>>>>> ---
>>>>>
>>>>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>>>>> index eb651fbde92a..cc5774fc84c0 100644
>>>>>> --- a/arch/x86/kernel/crash.c
>>>>>> +++ b/arch/x86/kernel/crash.c
>>>>>> @@ -173,8 +173,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>>>>>>
>>>>>> #ifdef CONFIG_KEXEC_FILE
>>>>>>
>>>>>> -static unsigned long crash_zero_bytes;
>>>>>> -
>>>>>> static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
>>>>>> {
>>>>>> unsigned int *nr_ranges = arg;
>>>>>> @@ -234,9 +232,15 @@ static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
>>>>>> {
>>>>>> struct crash_mem *cmem = arg;
>>>>>>
>>>>>> - cmem->ranges[cmem->nr_ranges].start = res->start;
>>>>>> - cmem->ranges[cmem->nr_ranges].end = res->end;
>>>>>> - cmem->nr_ranges++;
>>>>>> + if (res->start >= SZ_1M) {
>>>>>> + cmem->ranges[cmem->nr_ranges].start = res->start;
>>>>>> + cmem->ranges[cmem->nr_ranges].end = res->end;
>>>>>> + cmem->nr_ranges++;
>>>>>> + } else if (res->end > SZ_1M) {
>>>>>> + cmem->ranges[cmem->nr_ranges].start = SZ_1M;
>>>>>> + cmem->ranges[cmem->nr_ranges].end = res->end;
>>>>>> + cmem->nr_ranges++;
>>>>>> + }
>>>>>
>>>>> What is going on with this chunk? I can guess but this needs a clear
>>>>> comment.
>>>>
>>>> Indeed it needs some code comment, this is based on some offline
>>>> discussion. cat /proc/vmcore will give a warning because ioremap is
>>>> mapping the system ram.
>>>>
>>>> We pass the first 1M to kdump kernel in e820 as system ram so that 2nd
>>>> kernel can use the low 1M memory because for example the trampoline
>>>> code.
>>>>
>>>>>
>>>>>>
>>>>>> return 0;
>>>>>> }
>>>>>
>>>>>> @@ -356,9 +337,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
>>>>>> memset(&cmd, 0, sizeof(struct crash_memmap_data));
>>>>>> cmd.params = params;
>>>>>>
>>>>>> - /* Add first 640K segment */
>>>>>> - ei.addr = image->arch.backup_src_start;
>>>>>> - ei.size = image->arch.backup_src_sz;
>>>>>> + /*
>>>>>> + * Add the low memory range[0x1000, SZ_1M], skip
>>>>>> + * the first zero page.
>>>>>> + */
>>>>>> + ei.addr = PAGE_SIZE;
>>>>>> + ei.size = SZ_1M - PAGE_SIZE;
>>>>>> ei.type = E820_TYPE_RAM;
>>>>>> add_e820_entry(params, &ei);
>>>>>
>>>>> Likewise here. Why do we need a special case?
>>>>> Why the magic with PAGE_SIZE?
>>>>
>>>> Good catch, the zero page part is useless, I think no other special
>>>> reason, just assumed zero page is not usable, but it should be ok to
>>>> remove the special handling, just pass 0 - 1M is good enough.
>>>
>>> But if we have stopped special casing the low 1M. Why do we need a
>>> special case here at all?
>>>
>> Here, need to pass the low memory range to kdump kernel, which will guarantee
>> the availability of low memory in kdump kernel, otherwise, kdump kernel won't
>> use the low memory region.
>>
>>> If you need the special case it is almost certainly wrong to say you
>>> have ram above 640KiB and below 1MiB. That is the legacy ROM and video
>>> MMIO area.
>>>
>>> There is a reason the original code said 640KiB.
>>>
>> Do you mean that the 640k region is good enough here instead of 1MiB?
>
> Reading through the code of crash_setup_memap_entries I see that what
> the code is doing now. The code is repeating the e820 memory map with
> the memory areas that were not reserved for the crash kernel removed.
>
> In which case what the code needs to be doing something like:
>
> cmd.type = E820_TYPE_RAM;
> flags = IORESOURCE_MEM;
> walk_iomem_res_desc(IORES_DESC_RESERVED, flags, 0, 1024*1024, &cmd,
> memmap_entry_callback);
>
> Depending on which bugs exist it might make sense to limit this to
> the low 640KiB. But finding something the kernel already recognizes
> as RAM should prevent most of those problems already. Barring bugs
> I admit it doesn't make sense to repeat the work that someone else
> has already done.
>
> This bit:
> /* Add e820 reserved ranges */
> cmd.type = E820_TYPE_RESERVED;
> flags = IORESOURCE_MEM;
> walk_iomem_res_desc(IORES_DESC_RESERVED, flags, 0, -1, &cmd,
> memmap_entry_callback);
>
> Should probably start at 1MiB instead of 0. Just so we don't report the
> memory below 1MiB as unconditionally reserved. I don't properly
> understand the IORES_DESC_RESERVED flag, and how that differs from
> flags. So please test my suggestions to verify the code works as
> expected.
>
Thanks for your comment, Eric.
I will make a test based on your suggestions. But i need an SME machine,
maybe i will reply later.
Thanks.
Lianbo
> Eric
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
next prev parent reply other threads:[~2019-10-16 8:40 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-12 2:21 [PATCH 0/3 v3] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active Lianbo Jiang
2019-10-12 2:21 ` [PATCH 1/3 " Lianbo Jiang
2019-10-12 2:21 ` [PATCH 2/3 v3] x86/kdump cleanup: remove the unused crash_copy_backup_region() Lianbo Jiang
2019-10-12 2:21 ` [PATCH 3/3 v3] x86/kdump: clean up all the code related to the backup region Lianbo Jiang
2019-10-12 11:26 ` Eric W. Biederman
2019-10-12 12:16 ` Dave Young
2019-10-13 3:54 ` Eric W. Biederman
2019-10-13 9:36 ` lijiang
2019-10-15 11:04 ` Eric W. Biederman
2019-10-16 8:40 ` lijiang [this message]
2019-10-16 14:30 ` lijiang
2019-10-16 16:51 ` Eric W. Biederman
2019-10-13 10:20 ` Dave Young
2019-10-14 10:02 ` lijiang
2019-10-15 11:11 ` Eric W. Biederman
2019-10-16 3:24 ` lijiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c42dd338-6fee-9311-df3a-0a3aa8c80cde@redhat.com \
--to=lijiang@redhat.com \
--cc=Thomas.Lendacky@amd.com \
--cc=bhe@redhat.com \
--cc=bp@alien8.de \
--cc=dhowells@redhat.com \
--cc=dyoung@redhat.com \
--cc=ebiederm@xmission.com \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=kexec@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
--cc=vgoyal@redhat.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox