From: takahiro.akashi@linaro.org (AKASHI Takahiro)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 18/19] arm64: kdump: update a kernel doc
Date: Fri, 22 Jan 2016 15:23:14 +0900 [thread overview]
Message-ID: <56A1CAD2.70403@linaro.org> (raw)
In-Reply-To: <20160121120210.GD2581@leverpostej>
On 01/21/2016 09:02 PM, Mark Rutland wrote:
> On Thu, Jan 21, 2016 at 03:53:42PM +0900, AKASHI Takahiro wrote:
>> On 01/20/2016 08:49 PM, Mark Rutland wrote:
>>> On Wed, Jan 20, 2016 at 03:07:53PM +0900, AKASHI Takahiro wrote:
>>>> On 01/20/2016 11:49 AM, Dave Young wrote:
>>>>> On 01/19/16 at 02:01pm, Mark Rutland wrote:
>>>>>> On Tue, Jan 19, 2016 at 09:45:53PM +0800, Dave Young wrote:
>>>>>>> On 01/19/16 at 12:51pm, Mark Rutland wrote:
>>>>>>>> On Tue, Jan 19, 2016 at 08:28:48PM +0800, Dave Young wrote:
>>>>>>>>> On 01/19/16 at 02:35pm, AKASHI Takahiro wrote:
>>>>>>>>>> On 01/19/2016 10:43 AM, Dave Young wrote:
>>>>>>>>>>> X86 takes another way in latest kexec-tools and kexec_file_load, that is
>>>>>>>>>>> recreating E820 table and pass it to kexec/kdump kernel, if the entries
>>>>>>>>>>> are over E820 limitation then turn to use setup_data list for remain
>>>>>>>>>>> entries.
>>>>>>>>>>
>>>>>>>>>> Thanks. I will visit x86 code again.
>>>>>>>>>>
>>>>>>>>>>> I think it is X86 specific. Personally I think device tree property is
>>>>>>>>>>> better.
>>>>>>>>>>
>>>>>>>>>> Do you think so?
>>>>>>>>>
>>>>>>>>> I'm not sure it is the best way. For X86 we run into problem with
>>>>>>>>> memmap= design, one example is pci domain X (X>1) need the pci memory
>>>>>>>>> ranges being passed to kdump kernel. When we passed reserved ranges in /proc/iomem
>>>>>>>>> to 2nd kernel we find that cmdline[] array is not big enough.
>>>>>>>>
>>>>>>>> I'm not sure how PCI ranges relate to the memory map used for normal
>>>>>>>> memory (i.e. RAM), though I'm probably missing some caveat with the way
>>>>>>>> ACPI and UEFI describe PCI. Why does memmap= affect PCI memory?
>>>>>>>
>>>>>>> Here is the old patch which was rejected in kexec-tools:
>>>>>>> http://lists.infradead.org/pipermail/kexec/2013-February/007924.html
>>>>>>>
>>>>>>>>
>>>>>>>> If the kernel got the rest of its system topology from DT, the PCI
>>>>>>>> regions would be described there.
>>>>>>>
>>>>>>> Yes, if kdump kernel use same DT as 1st kernel.
>>>>>>
>>>>>> Other than for testing purposes, I don't see why you'd pass the kdump
>>>>>> kernel a DTB inconsistent with that the 1st kernel was passsed (other
>>>>>> than some proerties under /chosen).
>>>>>>
>>>>>> We added /sys/firmware/fdt specifically to allow the kexec tools to get
>>>>>> the exact DTB the first kernel used. There's no reason for tools to have
>>>>>> to make something up.
>>>>>
>>>>> Agreed, but kexec-tools has an option to pass in any dtb files. Who knows
>>>>> how one will use it unless dropping the option and use /sys/firmware/fdt
>>>>> unconditionally.
>>>>
>>>> As a matter of fact, specifying proper command line parameters as well as
>>>> dtb is partly users' responsibility for kdump to work correctly.
>>>> (especially for BE kernel)
>>>>
>>>>> If we choose to implement kexec_file_load only in kernel, the interfaces
>>>>> provided are kernel, initrd and cmdline. We can always use same dtb.
>>>>
>>>> I would say that we can always use the same dtb even with kexec_load
>>> >from user's perspective. Right?
>>>
>>> No.
>>>
>>> This breaks using kexec for boot-loader purposes, and imposes a policy.
>>
>> What kind of policy?
>> I said "can", but if we want to use other setting/configuration, we can
>> still have a full control.
>
> Apologies, I misunderstood.
>
> In most cases, using the existing DTB (with minor modifications to
> /chosen for bootargs and such) is fine. If the user just wants to boot
> another Linux kernel, that's generally what they should do.
>
> I think we're agreed on that.
Yes.
> However, there are cases when the user may want to use a different DTB,
> or use a different purgatory. So we cannot mandate that the existing DTB
> is reused, nor that an in-kernel purgatory must be used, as that imposes
> a policy.
Agree!
>>> For better or worse kexec_file_load has always imposed a constrained
>>> Linux-only policy, so that's a different story.
>>>
>>>>>> There's a horrible edge case I've spotted if performing a chain of
>>>>>> cross-endian kexecs: LE -> BE -> LE, as the BE kernel would have to
>>>>>> respect the EFI memory map so as to avoid corrupting it for the
>>>>>> subsequent LE kernel. Other than this I believe everything should just
>>>>>> work.
>>>>>
>>>>> Firmware do not know kernel endianniess, kernel should respect firmware
>>>>> maps and adapt to it, it sounds like a generic issue not specfic to kexec.
>>>>
>>>> On arm64, a kernel image header has a bit field to specify the image's endianness.
>>>> Anyway, our current implementation replies on a user-supplied dtb to start BE kernel.
>>>
>>> The firmware should _never_ care about the kernel's endianness. The
>>> bootlaoder or first kernel shouldn't care about the next kernel's
>>> endianness apart from in exceptional circumstances. The DTB for a LE
>>> kernel should look identical to that passed to a BE kernel.
>>
>> Please note that I didn't say anything different from your last two statements.
>> The current arm64 kexec implementation doesn't do anything specific to BE,
>> but as far as BE kernel doesn't support UEFI, users are responsible for
>> providing a proper dtb.
>
> I'm just confused as to what you mean by a "proper dtb" in that case.
>
> If you just mean one with memory nodes hacked in, then that would
> currently be a way to make that work, yes.
One of useful cases that I have in my mind is kdump.
We may want to use a small sub-set of dtb, especially devices, to
make the reboot more reliable. Device drivers are likely to be vulnerable
at crash.
> It seems like the better option is to fix the BE kernel to support a
> UEFI memory map, as that solves other issues.
Why did Ard throw away his patch?
>>
>>> In my mind, the only valid reason to look at that bit is so that
>>> bootloaders can provide a warning if the CPU does not implement that
>>> endianness.
>>>
>>> The issue I mention above should be solved by changes to the BE kernel.
>>>
>>>>>>> Is it possible to modify uefi memmap for kdump case?
>>>>>>
>>>>>> Technically it would be possible, however I don't think it's necessary,
>>>>>> and I think it would be disadvantageous to do so.
>>>>>>
>>>>>> Describing the range(s) the crash kernel can use in separate properties
>>>>>> under /chosen has a number of advantages.
>>>>>
>>>>> Ok, I got the points. We have a is_kdump_kernel() by checking if there is
>>>>> elfcorehdr_addr kernel cmdline. This is mainly for some drivers which
>>>>> do not work well in kdump kernel some uncertain reasons. But ideally I
>>>>> think kernel should handle things just like in 1st kernel and avoid to use
>>>>> it.
>>>>
>>>> So I'm not still sure about what are advantages of a property under /chosen
>>>> over "memmap=" kernel parameter.
>>>> Both are simple and can have the same effect with minimizing changes to dtb.
>>>> (But if, in the latter case, we have to provide *all* the memory-related information
>>>> through "memmap=" parameters, it would be much complicated.)
>>>
>>> The reason I prefer a property over command line additions include:
>>
>> Take some examples:
>> (a) a property under /chosen
>> {
>> chosen = {
>> cmdline = "elfcorehdr=AA at BB maxcpus=1 ...";
>> }
>> usable-memory = <XX YY>;
>> memory {
>> ...
>> }
>> }
>>
>> (b) a kernel command line parameter
>> (I use the same name, "usable-memory", to show the similarity. may use another name though.)
>> {
>> chosen = {
>> cmdline = "elfcorehdr=AA at BB maxcpus=1 usable-memory=YY at XX ...";
>> }
>> memory {
>> ...
>> }
>> }
>>
>>> * It keeps the command line simple (as you mention the opposite is
>>> "complicated").
>>
>> I think both are simple.
>>
>>> * It is logically separate from options the user may pass to the kernel
>>> in that the restricted region(s) of memory avaialble are effectively
>>> properties of the system (in that the crashed OS is part of the system
>>> state).
>>
>> "elfcorehdr=" parameter already breaks your point.
>> "elfcorehdr=" looks to be, what you say, a system property, and is actually
>> added by kexec-tools on all architectures, and "usable-memory", whether it is
>> a DT property or a kernel parameter, will also be added by kexec-tools.
>> (Users don't have to care.)
>
> Just because architectures do one thing today does not mean that we have
> to follow it.
>
> I don't think that breaks my point so much as shows that a different
> approach is taken by others today.
>
> There's also no reason this cannot be a property under /chosen.
No, but no strong reason to be so IMO.
>>> * The semantics of the command line parsing can change subtly over time
>>> (for example, see 51e158c12aca3c9a, which terminates command line
>>> parseing at "--"). Maknig sure that a command line option will
>>> actually be parsed by the next kernel is not trivial.
>>>
>>> Keeping this information isolated from the command line is more
>>> robust.
>>
>> Even so, who wants to use kdump without testing?
>> and this is not a kdump specific issue.
>>
>>> * Addition of a property is a self-contained operation, that doesn't
>>> require any knowledge about the command line.
>>
>> I don't get your point here.
>> For a kernel parameter, early_param() can encapsulate all the stuffs necessary.
>> Once the kernel recognizes a usable memory region, limiting available
>> memory should be done in the exact same way.
>
> I mean when modifying the command line.
OK, I understand what you mean.
> To place "elfcorehdr=" or "memmap="/"usable-memory=" into the command
> line, one needs to know where it is valid to place it. Appending doesn't
> always work, as per the example above with 51e158c12aca3c9a.
So, are you now suggesting that we put both "elfcorehdr=" and
"usable-memory=" under /chosen in dtb? That's fair enough.
(as far as nobody cares about incompatibility with other archs.)
-Takahiro AKASHI
> For both of these my point was that generally there is some fragility in
> this area. While it's easy to say that breaking this would be someone
> else's fault, we can easily avoid the possibility of that happening, and
> avoid a set of problems trying to maintain backwards compatibility if
> there were a sensible change that happened to break things.
>
> Thanks,
> Mark.
>
next prev parent reply other threads:[~2016-01-22 6:23 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-15 19:18 [PATCH 00/19] arm64 kexec kernel patches v13 Geoff Levand
2016-01-15 19:18 ` [PATCH 04/19] arm64: Cleanup SCTLR flags Geoff Levand
2016-01-15 20:07 ` Mark Rutland
2016-01-18 10:12 ` Marc Zyngier
2016-01-19 11:59 ` Dave Martin
2016-01-25 15:09 ` James Morse
2016-01-15 19:18 ` [PATCH 07/19] arm64: Add back cpu_reset routines Geoff Levand
2016-01-15 19:18 ` [PATCH 06/19] arm64: Add new hcall HVC_CALL_FUNC Geoff Levand
2016-01-15 19:18 ` [PATCH 03/19] arm64: Add new asm macro copy_page Geoff Levand
2016-01-20 14:01 ` James Morse
2016-01-15 19:18 ` [PATCH 09/19] Revert "arm64: remove dead code" Geoff Levand
2016-01-15 19:55 ` Mark Rutland
2016-01-20 21:18 ` Geoff Levand
2016-01-15 19:18 ` [PATCH 05/19] arm64: Convert hcalls to use HVC immediate value Geoff Levand
2016-01-15 19:18 ` [PATCH 02/19] arm64: kernel: Include _AC definition in page.h Geoff Levand
2016-01-18 10:05 ` Mark Rutland
2016-01-15 19:18 ` [PATCH 08/19] Revert "arm64: mm: remove unused cpu_set_idmap_tcr_t0sz function" Geoff Levand
2016-01-15 19:18 ` [PATCH 01/19] arm64: Fold proc-macros.S into assembler.h Geoff Levand
2016-01-15 19:18 ` [PATCH 15/19] arm64: kdump: implement machine_crash_shutdown() Geoff Levand
2016-01-15 19:18 ` [PATCH 14/19] arm64: kdump: reserve memory for crash dump kernel Geoff Levand
2016-01-15 19:18 ` [PATCH 10/19] arm64: kvm: allows kvm cpu hotplug Geoff Levand
2016-01-26 17:42 ` James Morse
2016-01-27 7:37 ` AKASHI Takahiro
2016-01-15 19:18 ` [PATCH 19/19] arm64: kdump: relax BUG_ON() if more than one cpus are still active Geoff Levand
2016-01-15 19:18 ` [PATCH 16/19] arm64: kdump: add kdump support Geoff Levand
2016-01-21 14:17 ` James Morse
2016-01-22 4:50 ` AKASHI Takahiro
2016-01-15 19:18 ` [PATCH 13/19] arm64/kexec: Add pr_debug output Geoff Levand
2016-01-15 19:18 ` [PATCH 18/19] arm64: kdump: update a kernel doc Geoff Levand
2016-01-15 20:16 ` Mark Rutland
2016-01-18 10:26 ` AKASHI Takahiro
2016-01-18 11:29 ` Mark Rutland
2016-01-19 5:31 ` AKASHI Takahiro
2016-01-19 12:10 ` Mark Rutland
2016-01-20 4:34 ` AKASHI Takahiro
2016-01-19 1:43 ` Dave Young
2016-01-19 1:50 ` Dave Young
2016-01-19 5:35 ` AKASHI Takahiro
2016-01-19 12:28 ` Dave Young
2016-01-19 12:51 ` Mark Rutland
2016-01-19 13:45 ` Dave Young
2016-01-19 14:01 ` Mark Rutland
2016-01-20 2:49 ` Dave Young
2016-01-20 6:07 ` AKASHI Takahiro
2016-01-20 6:38 ` Dave Young
2016-01-20 7:00 ` Dave Young
2016-01-20 8:01 ` AKASHI Takahiro
2016-01-20 8:26 ` Dave Young
2016-01-20 11:54 ` Mark Rutland
2016-01-21 2:57 ` Dave Young
2016-01-21 3:03 ` Dave Young
2016-01-20 11:49 ` Mark Rutland
2016-01-21 6:53 ` AKASHI Takahiro
2016-01-21 12:02 ` Mark Rutland
2016-01-22 6:23 ` AKASHI Takahiro [this message]
2016-01-22 11:13 ` Mark Rutland
2016-02-02 5:18 ` AKASHI Takahiro
2016-01-25 3:19 ` Dave Young
2016-01-25 4:23 ` Dave Young
2016-01-20 11:28 ` Mark Rutland
2016-01-21 2:54 ` Dave Young
2016-01-20 5:25 ` AKASHI Takahiro
2016-01-20 12:02 ` Mark Rutland
2016-01-20 12:36 ` Mark Rutland
2016-01-20 14:59 ` Ard Biesheuvel
2016-01-20 15:04 ` Mark Rutland
2016-01-21 5:43 ` AKASHI Takahiro
2016-01-21 13:02 ` Mark Rutland
2016-01-19 12:17 ` Mark Rutland
2016-01-19 13:52 ` Dave Young
2016-01-19 14:05 ` Mark Rutland
2016-01-20 2:54 ` Dave Young
2016-01-15 19:18 ` [PATCH 11/19] arm64/kexec: Add core kexec support Geoff Levand
2016-01-15 19:18 ` [PATCH 12/19] arm64/kexec: Enable kexec in the arm64 defconfig Geoff Levand
2016-01-15 19:18 ` [PATCH 17/19] arm64: kdump: enable kdump " Geoff Levand
2016-01-19 12:32 ` [PATCH 00/19] arm64 kexec kernel patches v13 Dave Young
2016-01-20 0:15 ` Geoff Levand
2016-01-20 2:56 ` Dave Young
2016-01-20 21:15 ` Geoff Levand
2016-01-21 12:11 ` Mark Rutland
[not found] ` <c7575f853ccc491bb0212e025aab1cc9@NASANEXM01H.na.qualcomm.com>
2016-03-01 17:54 ` Azriel Samson
2016-03-02 1:17 ` Geoff Levand
2016-03-02 1:38 ` Will Deacon
2016-03-02 2:28 ` AKASHI Takahiro
2016-03-02 8:07 ` Marc Zyngier
2016-03-02 12:33 ` Pratyush Anand
2016-03-02 16:51 ` Azriel Samson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56A1CAD2.70403@linaro.org \
--to=takahiro.akashi@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).