From: Sourabh Jain <sourabhjain@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>,
linuxppc-dev@ozlabs.org
Cc: Aditya Gupta <adityag@linux.ibm.com>,
Mahesh Salgaonkar <mahesh@linux.ibm.com>,
Hari Bathini <hbathini@linux.ibm.com>
Subject: Re: [PATCH v5 1/3] powerpc: make fadump resilient with memory add/remove events
Date: Fri, 24 Nov 2023 12:51:17 +0530 [thread overview]
Message-ID: <f75a2cd4-e75d-40d0-bf86-e3f8cc9ce302@linux.ibm.com> (raw)
In-Reply-To: <87cyw2j6h1.fsf@mail.lhotse>
Hello Michael,
On 22/11/23 18:22, Michael Ellerman wrote:
> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>> On 22/11/23 10:47, Michael Ellerman wrote:
>>> Aneesh Kumar K V <aneesh.kumar@linux.ibm.com> writes:
>>> ...
>>>> I am not sure whether we need to add all the complexity to enable supporting different fadump kernel
>>>> version. Is that even a possible use case with fadump? Can't we always assume that with fadump the
>>>> crash kernel and fadump kernel will be same version?
>>> How sure are we of that?
>>>
>>> Don't we go through grub when we boot into the 2nd kernel. And so
>>> couldn't it choose to boot a different kernel, for whatever reason.
>>>
>>> I don't think we need to support different pt_reg / cpumask sizes, but
>>> requiring the exact same kernel version is too strict I think.
>> Agree.
>>> But maybe I'm wrong. Would be good to hear what distro folks think.
>> How about checking fadump crash info header compatibility in the
>> following way?
>>
>> static bool is_fadump_header_compatible(struct fadump_crash_info_header
>> *fdh)
>> {
>> if (fdh->magic_number == FADUMP_CRASH_INFO_MAGIC_OLD) {
>> pr_err("Old magic number, can't process the dump.");
>> return false;
>> }
>>
>> if (fdh->magic_number != FADUMP_CRASH_INFO_MAGIC) {
>> pr_err("Fadump header is corrupted.");
>> return false;
>> }
>>
>> /*
>> * If the kernel version of the first/crashed kernel and the
>> second/fadump
>> * kernel is not same, then only collect the dump if the size of all
>> * non-primitive type members of the fadump header is the same
>> across kernels.
>> */
>> if (strcmp(fdh->kernel_version, init_uts_ns.name.release)) {
>
> I don't think the kernel version check is necessary?
I didn't find a place where pt_regs members are accessed to take
a decision in fadump kernel. we just copy the pt_regs in fadump kernel.
So I think as long as size is same across kernels, we are good.
>
>> if (fdh->pt_regs_sz != sizeof(struct pt_regs) || fdh->cpu_mask_sz != sizeof(struct cpumask)) {
>> pr_err("Fadump header size mismatch.\n")
>> return false;
> Yeah I think that works.
>
>> } else
>> pr_warn("Kernel version mismatch; dump data is unreliable.\n");
>> }
>>
>> return true;
>> }
>>
>> And the new fadump crash info header will be: As suggested by Hari.
>>
>> /* fadump crash info structure */
>> struct fadump_crash_info_header {
>> u64 magic_number;
>> + u32 version;
>
> Do we need version? We're effectively using the magic_number as a
> version already. Having an actual version would allow us to make
> backward compatible changes in future, but it's not clear we'll need or
> want to do that.
Agree that currently version field is not used but I added a version
field to
make the fadump header structure compatible with future changes without
changing the magic number.
I will add a comment on how version field should be utilized if one
changes fadump
header in future.
>
>> u32 crashing_cpu;
>> u64 elfcorehdr_addr;
>> + u64 elfcorehdr_size;
>> + u64 vmcoreinfo_raddr;
>> + u64 vmcoreinfo_size;
>> + u8 kernel_version[__NEW_UTS_LEN + 1];
>> + u32 pt_regs_sz;
>> struct pt_regs regs;
>> + u32 cpu_mask_sz;
>
> Typically you would put all the size fields before the variable sized
> fields, because otherwise the later size fields may not be where you
> expect them to be. But because we're bailing out if any of the sizes
> don't match it doesn't actually matter.
Yeah, but I will reorganize fadump header and put the size fields before the
variable sized fields.
>
>> struct cpumask cpu_mask;
>> };
> The other issue is endian. I assume we're just declaring that the
> 1st/2nd kernel must be the same endian? We should still make that
> explicit though.
A comment is fine or should we add a explicit check and error out
with relevant error message if endianness is not same across the
kernels?
Something like:
if (fdh->magic_number != FADUMP_CRASH_INFO_MAGIC) {
if (fdh->magic_number == swab64(FADUMP_CRASH_INFO_MAGIC)) {
pr_err("Endianness mismatch");
} else {
pr_err("Fadump header is corrupted.");
}
return false;
}
>
> To make it clearer that this struct is somewhat an ABI, I think we
> should move the definition into arch/powerpc/include/uapi/asm/fadump.h
Sure
>
> We don't expect userspace to actually use the header, but it will
> hopefully remind everyone that the struct needs to be changed with care :)
Agree
Thanks,
Sourabh Jain
next prev parent reply other threads:[~2023-11-24 7:27 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-29 12:45 [PATCH v5 0/3] powerpc: make fadump resilient with memory add/remove events Sourabh Jain
2023-10-29 12:45 ` [PATCH v5 1/3] " Sourabh Jain
2023-11-09 12:14 ` Michael Ellerman
2023-11-13 6:42 ` Sourabh Jain
2023-11-15 4:44 ` Aneesh Kumar K.V
2023-11-17 4:33 ` Sourabh Jain
2023-11-17 5:31 ` Aneesh Kumar K V
2023-11-17 6:14 ` Hari Bathini
2023-11-22 5:17 ` Michael Ellerman
2023-11-22 10:35 ` Sourabh Jain
2023-11-22 12:20 ` Aneesh Kumar K V
2023-11-24 5:20 ` Sourabh Jain
2023-11-22 12:52 ` Michael Ellerman
2023-11-24 7:21 ` Sourabh Jain [this message]
2023-10-29 12:45 ` [PATCH v5 2/3] powerpc/fadump: add hotplug_ready sysfs interface Sourabh Jain
2023-10-29 12:45 ` [PATCH v5 3/3] Documentation/powerpc: update fadump implementation details Sourabh Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f75a2cd4-e75d-40d0-bf86-e3f8cc9ce302@linux.ibm.com \
--to=sourabhjain@linux.ibm.com \
--cc=adityag@linux.ibm.com \
--cc=aneesh.kumar@linux.ibm.com \
--cc=hbathini@linux.ibm.com \
--cc=linuxppc-dev@ozlabs.org \
--cc=mahesh@linux.ibm.com \
--cc=mpe@ellerman.id.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.