linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Eugen Hristev <eugen.hristev@linaro.org>, Michal Hocko <mhocko@suse.com>
Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-mm@kvack.org,
	tglx@linutronix.de, andersson@kernel.org, pmladek@suse.com,
	linux-arm-kernel@lists.infradead.org,
	linux-hardening@vger.kernel.org, corbet@lwn.net,
	mojha@qti.qualcomm.com, rostedt@goodmis.org, jonechou@google.com,
	tudor.ambarus@linaro.org, Christoph Hellwig <hch@infradead.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>
Subject: Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
Date: Wed, 27 Aug 2025 22:06:54 +0200	[thread overview]
Message-ID: <c59f2528-31b0-4b9d-8d20-f204a0600ff6@redhat.com> (raw)
In-Reply-To: <98afe1bd-99d2-4b5d-866a-e9541390fab4@linaro.org>

On 27.08.25 16:08, Eugen Hristev wrote:
> 
> 
> On 8/27/25 15:18, David Hildenbrand wrote:
>> On 27.08.25 13:59, Eugen Hristev wrote:
>>>
>>>
>>> On 8/25/25 16:58, David Hildenbrand wrote:
>>>> On 25.08.25 15:36, Eugen Hristev wrote:
>>>>>
>>>>>
>>>>> On 8/25/25 16:20, David Hildenbrand wrote:
>>>>>>
>>>>>>>>
>>>>>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>>>>>> accesses non-exported symbols.
>>>>>>>
>>>>>>> Hello David,
>>>>>>>
>>>>>>> I am looking again into this, and there are some things which in my
>>>>>>> opinion would be difficult to achieve.
>>>>>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>>>>>> kmemdump.
>>>>>>>
>>>>>>> The runqueues is a variable of `struct rq` which is defined in
>>>>>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>>>>>> sched.
>>>>>>> Now moving all the struct definition outside of sched.h into another
>>>>>>> public header would be rather painful and I don't think it's a really
>>>>>>> good option (The struct would be needed to compute the sizeof inside
>>>>>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>>>>>> definitions outside as well. I doubt this is something that we want for
>>>>>>> the sched subsys. How the subsys is designed, out of my understanding,
>>>>>>> is to keep these internal structs opaque outside of it.
>>>>>>
>>>>>> All the kmemdump module needs is a start and a length, correct? So the
>>>>>> only tricky part is getting the length.
>>>>>
>>>>> I also have in mind the kernel user case. How would a kernel programmer
>>>>> want to add some kernel structs/info/buffers into kmemdump such that the
>>>>> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
>>>>> enough.
>>>>
>>>> The other way around, why should anybody have a saying in adding their
>>>> data to kmemdump? Why do we have that all over the kernel?
>>>>
>>>> Is your mechanism really so special?
>>>>
>>>> A single composer should take care of that, and it's really just start +
>>>> len of physical memory areas.
>>>>
>>>>> Otherwise maybe the programmer has to write helpers to compute lengths
>>>>> etc, and stitch them into kmemdump core.
>>>>> I am not saying it's impossible, but just tiresome perhaps.
>>>>
>>>> In your patch set, how many of these instances did you encounter where
>>>> that was a problem?
>>>>
>>>>>>
>>>>>> One could just add a const variable that holds this information, or even
>>>>>> better, a simple helper function to calculate that.
>>>>>>
>>>>>> Maybe someone else reading along has a better idea.
>>>>>
>>>>> This could work, but it requires again adding some code into the
>>>>> specific subsystem. E.g. struct_rq_get_size()
>>>>> I am open to ideas , and thank you very much for your thoughts.
>>>>>
>>>>>>
>>>>>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>>>>>> what you had would work as intended (maybe it does, not sure).
>>>>>
>>>>> I would not really need to dump the runqueues. But the crash tool which
>>>>> I am using for testing, requires it. Without the runqueues it will not
>>>>> progress further to load the kernel dump.
>>>>> So I am not really sure what it does with the runqueues, but it works.
>>>>> Perhaps using crash/gdb more, to actually do something with this data,
>>>>> would give more insight about its utility.
>>>>> For me, it is a prerequisite to run crash, and then to be able to
>>>>> extract the log buffer from the dump.
>>>>
>>>> I have the faint recollection that percpu vars might not be stored in a
>>>> single contiguous physical memory area, but maybe my memory is just
>>>> wrong, that's why I was raising it.
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>     From my perspective it's much simpler and cleaner to just add the
>>>>>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>>>>>> patch. This macro translates to a noop if kmemdump is not selected.
>>>>>>
>>>>>> I really don't like how we are spreading kmemdump all over the kernel,
>>>>>> and adding complexity with __section when really, all we need is a place
>>>>>> to obtain a start and a length.
>>>>>>
>>>>>
>>>>> I understand. The section idea was suggested by Thomas. Initially I was
>>>>> skeptic, but I like how it turned out.
>>>>
>>>> Yeah, I don't like it. Taste differs ;)
>>>>
>>>> I am in particular unhappy about custom memblock wrappers.
>>>>
>>>> [...]
>>>>
>>>>>>>
>>>>>>> To have this working outside of printk, it would be required to walk
>>>>>>> through all the printk structs/allocations and select the required info.
>>>>>>> Is this something that we want to do outside of printk ?
>>>>>>
>>>>>> I don't follow, please elaborate.
>>>>>>
>>>>>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>>>>>> given that you run your initialization after setup_log_buf() ?
>>>>>>
>>>>>>
>>>>>
>>>>> My initial thought was the same. However I got some feedback from Petr
>>>>> Mladek here :
>>>>>
>>>>> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>>>>>
>>>>> Where he explained how to register the structs correctly.
>>>>> It can be that setup_log_buf is called again at a later time perhaps.
>>>>>
>>>>
>>>> setup_log_buf() is a __init function, so there is only a certain time
>>>> frame where it can be called.
>>>>
>>>> In particular, once the buddy is up, memblock allocations are impossible
>>>> and it would be deeply flawed to call this function again.
>>>>
>>>> Let's not over-engineer this.
>>>>
>>>> Peter is on CC, so hopefully he can share his thoughts.
>>>>
>>>
>>> Hello David,
>>>
>>> I tested out this snippet (on top of my series, so you can see what I
>>> changed):
>>>
>>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 18ba6c1e174f..7ac4248a00e5 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -67,7 +67,6 @@
>>>    #include <linux/wait_api.h>
>>>    #include <linux/workqueue_api.h>
>>>    #include <linux/livepatch_sched.h>
>>> -#include <linux/kmemdump.h>
>>>
>>>    #ifdef CONFIG_PREEMPT_DYNAMIC
>>>    # ifdef CONFIG_GENERIC_IRQ_ENTRY
>>> @@ -120,7 +119,12 @@
>>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
>>>    EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>>>
>>>    DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
>>> -KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
>>> +
>>> +size_t runqueues_get_size(void);
>>> +size_t runqueues_get_size(void)
>>> +{
>>> +       return sizeof(runqueues);
>>> +}
>>>
>>>    #ifdef CONFIG_SCHED_PROXY_EXEC
>>>    DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
>>> diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
>>> index d808c5e67f35..c6dd2d6e96dd 100644
>>> --- a/kernel/vmcore_info.c
>>> +++ b/kernel/vmcore_info.c
>>> @@ -24,6 +24,12 @@
>>>    #include "kallsyms_internal.h"
>>>    #include "kexec_internal.h"
>>>
>>> +typedef void* kmemdump_opaque_t;
>>> +
>>> +size_t runqueues_get_size(void);
>>> +
>>> +extern kmemdump_opaque_t runqueues;
>>
>> I would have tried that through:
>>
>> struct rq;
>> extern struct rq runqueues;
>>
>> But the whole PER_CPU_SHARED_ALIGNED makes this all weird, and likely
>> not the way we would want to handle that.
>>
>>>    /* vmcoreinfo stuff */
>>>    unsigned char *vmcoreinfo_data;
>>>    size_t vmcoreinfo_size;
>>> @@ -230,6 +236,9 @@ static int __init crash_save_vmcoreinfo_init(void)
>>>
>>>           kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
>>>                                (void *)vmcoreinfo_data, vmcoreinfo_size);
>>> +       kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
>>> +                            (void *)&runqueues, runqueues_get_size());
>>> +
>>>           return 0;
>>>    }
>>>
>>> With this, no more .section, no kmemdump code into sched, however, there
>>> are few things :
>>
>> I would really just do here something like the following:
>>
>> /**
>>    * sched_get_runqueues_area - obtain the runqueues area for dumping
>>    * @start: ...
>>    * @size: ...
>>    *
>>    * The obtained area is only to be used for dumping purposes.
>>    */
>> void sched_get_runqueues_area(void *start, size_t size)
>> {
>> 	start = &runqueues;
>> 	size = sizeof(runqueues);
>> }
>>
>> might be cleaner.
>>
> 
> How about this in the header:
> 
> #define DECLARE_DUMP_AREA_FUNC(subsys, symbol) \
> 
> void subsys ## _get_ ## symbol ##_area(void **start, size_t *size);
> 
> 
> 
> #define DEFINE_DUMP_AREA_FUNC(subsys, symbol) \
> 
> void subsys ## _get_ ## symbol ##_area(void **start, size_t *size)\
> 
> {\
> 
>          *start = &symbol;\
> 
>          *size = sizeof(symbol);\
> 
> }
> 
> 
> then, in sched just
> 
> DECLARE_DUMP_AREA_FUNC(sched, runqueues);
> 
> DEFINE_DUMP_AREA_FUNC(sched, runqueues);
> 
> or a single macro that wraps both.
> 
> would make it shorter and neater.
> 
> What do you think ?

Looks a bit over-engineered, and will require us to import a header 
(likely kmemdump.h) in these files, which I don't really enjoy.

I would start simple, without any such macro-magic. It's a very simple 
function after all, and likely you won't end up having many of these?

-- 
Cheers

David / dhildenb


  reply	other threads:[~2025-08-27 20:07 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
2025-07-26  3:33   ` Randy Dunlap
2025-07-26  3:36   ` Randy Dunlap
2025-07-24 13:54 ` [RFC][PATCH v2 02/29] Documentation: add kmemdump Eugen Hristev
2025-07-24 14:13   ` Jonathan Corbet
2025-07-24 13:54 ` [RFC][PATCH v2 03/29] kmemdump: add coreimage ELF layer Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 04/29] Documentation: kmemdump: add section for coreimage ELF Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 05/29] kmemdump: introduce qcom-minidump backend driver Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 06/29] soc: qcom: smem: add minidump device Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 07/29] init/version: Annotate static information into Kmemdump Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 08/29] cpu: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 09/29] genirq/irqdesc: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 10/29] panic: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 11/29] sched/core: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 12/29] timers: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 13/29] kernel/fork: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 14/29] mm/page_alloc: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 15/29] mm/init-mm: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 16/29] mm/show_mem: " Eugen Hristev
2025-07-30 13:55   ` David Hildenbrand
2025-07-30 14:04     ` Eugen Hristev
2025-07-30 14:10       ` David Hildenbrand
2025-07-24 13:55 ` [RFC][PATCH v2 17/29] mm/swapfile: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 18/29] mm/percpu: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 19/29] mm/mm_init: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 20/29] printk: Register " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 21/29] kernel/configs: Register dynamic " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 22/29] mm/numa: Register " Eugen Hristev
2025-07-30 13:52   ` David Hildenbrand
2025-07-30 13:57     ` Eugen Hristev
2025-07-30 14:04       ` David Hildenbrand
2025-08-04 10:54         ` Michal Hocko
2025-08-04 11:06           ` Eugen Hristev
2025-08-04 12:18             ` David Hildenbrand
2025-08-04 12:29               ` Eugen Hristev
2025-08-04 12:49                 ` David Hildenbrand
2025-08-04 13:03                   ` Eugen Hristev
2025-08-04 13:26                     ` David Hildenbrand
2025-08-25 12:55                       ` Eugen Hristev
2025-08-25 13:20                         ` David Hildenbrand
2025-08-25 13:36                           ` Eugen Hristev
2025-08-25 13:58                             ` David Hildenbrand
2025-08-27 11:59                               ` Eugen Hristev
2025-08-27 12:18                                 ` David Hildenbrand
2025-08-27 14:08                                   ` Eugen Hristev
2025-08-27 20:06                                     ` David Hildenbrand [this message]
2025-09-01  8:57                                       ` Eugen Hristev
2025-09-01 10:01                                         ` David Hildenbrand
2025-09-01 12:02                                           ` Eugen Hristev
2025-09-01 12:17                                             ` David Hildenbrand
2025-08-04 12:16           ` David Hildenbrand
2025-07-24 13:55 ` [RFC][PATCH v2 23/29] mm/sparse: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 24/29] kernel/vmcore_info: Register dynamic " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 25/29] kmemdump: Add additional symbols to the coreimage Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 26/29] init/version: Annotate init uts name separately into Kmemdump Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 27/29] kallsyms: Annotate static information " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 28/29] mm/init-mm: Annotate additional " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 29/29] kmemdump: Add Kinfo backend driver Eugen Hristev
2025-08-26 17:14 ` [RFC][PATCH v2 00/29] introduce kmemdump Mukesh Ojha
2025-08-27  6:42   ` Eugen Hristev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c59f2528-31b0-4b9d-8d20-f204a0600ff6@redhat.com \
    --to=david@redhat.com \
    --cc=andersson@kernel.org \
    --cc=corbet@lwn.net \
    --cc=eugen.hristev@linaro.org \
    --cc=hch@infradead.org \
    --cc=jonechou@google.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mojha@qti.qualcomm.com \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=tglx@linutronix.de \
    --cc=tudor.ambarus@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).