Embedded Linux development
 help / color / mirror / Atom feed
* [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Bird, Tim @ 2024-12-03 21:24 UTC (permalink / raw)
  To: Francesco Valla, Linux Embedded; +Cc: smueller@chronox.de
In-Reply-To: <1964175.7Z3S40VBb9@fedora.fritz.box>

> -----Original Message-----
> From: Francesco Valla <francesco@valla.it>
> Dear fellow boot time optimizers,
> 
> following the first boot time SIG meeting, which I lurked with much pleasure
> (but didn't participate to, as I was a bit in awe among such authorities), I'd
> like to introduce myself with code rather than a presentation or resume.
> 
> Here is a python script which can analyze a dmesg output with initcall_debug
> option enabled and extract some useful information. It can for example be used
> to analyze the output of the grab-boot-data.sh tool that Tim presented on this
> list [1] just a few days ago.
> 
> Usage is very simple, as the output of dmesg can be piped directly to it:
> 
>     dmesg | analyze-initcall-debug.py
> 
> If no option is specified, it outputs a brief summary, like the following one
> (obtained on my Beagleplay):
> 
> 1758 drivers has been initialized, of which 1758 before userspace
> 119 probes happened outside of the init of their driver
> 0 deferred probes pending
> ---
> Top 10 init/probes durations:
>  * 30200000.dss -> 523002us
>  * deferred_probe_initcall -> 487483us
>  * fd00000.gpu -> 162859us
>  * 8000f00.mdio -> 142521us
>  * 44043000.system-controller -> 71390us
>  * 2-004c -> 71178us
>  * 40900000.crypto -> 59350us
>  * 8000000.ethernet -> 58419us
>  * 44043000.system-controller:clock-controller -> 56599us
>  * jent_mod_init -> 52140us

This is odd.  On my beagleplay board, jent_mod_init only took 32326us.
(see https://birdcloud.org/boot-time-files/boot-data-timslab-bp1-241203-134136.txt)

I'm assuming that we have similar hardware, but possibly different configs,
kernel cmdlines or kernel versions.  I'll take a look at this and see if I can
figure out what the difference is between our systems, that causes the
difference in the duration for this function.

This initcall (jent_mod_init) is on my list of initcalls to research to see if they
can be improved or deferred.  I haven't seen any degradation in system
behavior by deferring this initcall on my system, but that could be from
lack of my system doing some related operation that needs the RNG data, or my
own ignorance of the effects of this call.

Can someone comment on what items or operations might depend on jent_mod_init
in early boot?  In particular, would we expect any cryptographic problems if
the initcall was deferred to a few seconds after booting?

It can be configured as a module, which makes me think that maybe loading
it sometime in very late boot (or even later) is OK.

jent_mod_init is defined in crypto/jitterentropy-kcapi.c, and controlled by 
CONFIG_CRYPTO_JITTERENTROPY (See crypto/Kconfig)

It appears to do some random number generator seeding by measuring
the jitter of a compressiong operation in the kernel. I assume some of the cryptography
configs affect the duration of the operations.  The initcall duration results on
my beagleplay seem to be relatively consistent (in the 32ms range consistently),
and from bobs_lab, on machine sam1, the duration range is also consistent
(at between 4800 and 5200 usecs).

I'd be interested to see if the range is consistent between runs on other machines

Francesco - Can you submit another boot-data file (or 2 or 3) for francsecoslab-beagleplay, to
provide some data on this?

Also, can anyone else who sees this initcall in their boot sequence run grab-boot-data.sh
(from https://birdcloud.org/boot-time/Boot-time_Tools)
and submit the data for one or more of their machines?

Stephan Mueller - what are you seeing for the cost of this operation on your machines?

This is one of the first initcalls I'm going to dive into to see if it can be improved or deferred.
Maybe some of the RNG seeding can be held over from a previous boot, in order to
eliminate the overhead on the early portion of a current boot.  Any feedback on that
idea?

Thanks.
   -- Tim


^ permalink raw reply

* RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Bird, Tim @ 2024-12-04  0:29 UTC (permalink / raw)
  To: Francesco Valla, Linux Embedded
In-Reply-To: <1964175.7Z3S40VBb9@fedora.fritz.box>



> -----Original Message-----
> From: Francesco Valla <francesco@valla.it>
> Dear fellow boot time optimizers,
> 
> following the first boot time SIG meeting, which I lurked with much pleasure
> (but didn't participate to, as I was a bit in awe among such authorities), I'd
> like to introduce myself with code rather than a presentation or resume.
> 
> Here is a python script which can analyze a dmesg output with initcall_debug
> option enabled and extract some useful information. It can for example be used
> to analyze the output of the grab-boot-data.sh tool that Tim presented on this
> list [1] just a few days ago.
> 
> Usage is very simple, as the output of dmesg can be piped directly to it:
> 
>     dmesg | analyze-initcall-debug.py
> 
> If no option is specified, it outputs a brief summary, like the following one
> (obtained on my Beagleplay):
> 
> 1758 drivers has been initialized, of which 1758 before userspace
> 119 probes happened outside of the init of their driver
> 0 deferred probes pending
> ---
> Top 10 init/probes durations:
>  * 30200000.dss -> 523002us
>  * deferred_probe_initcall -> 487483us
>  * fd00000.gpu -> 162859us
>  * 8000f00.mdio -> 142521us
>  * 44043000.system-controller -> 71390us
>  * 2-004c -> 71178us
>  * 40900000.crypto -> 59350us
>  * 8000000.ethernet -> 58419us
>  * 44043000.system-controller:clock-controller -> 56599us
>  * jent_mod_init -> 52140us
> 
> A more complete output, reporting all durations as well as statuses, impact of
> probe deferral and so on, can be produced either in HTML or Markdown format,
> using respectively the --html or --markdown parameters.
> 
> Note: it hasn't (yet?) been used in a "production" environment, as I never
> found the motivation to write a script and always analyzed manually the
> initcall_debug output.
> 
> If you have time to take a look, please tell me what you think and what
> can be added.
> 
> Thank you!
> 
> Regards,
> Francesco
> 
> [1] https://lore.kernel.org/linux-
> embedded/MW5PR13MB5632F54DCBDA0C74370E531FFD5C2@MW5PR13MB5632.namprd13.prod.outlook.com/
> 
> --- analyze-initcall-debug.py script follows ---
> 
> #!/usr/bin/env python3
> # -*- coding: utf-8 -*-
> 
> """
> This script can be used to analyze a Linux kernel dmesg output when the
> initcall_debug command line output is used. It extract various information,
> such as init durations, impact of probe deferral and so on.
> """
> 
> import argparse
> import re
> import sys
> 
> ERRCODE_PROBE_DEFER = 517
> 
> parser = argparse.ArgumentParser()
> 
> parser.add_argument('--dmesg', nargs='?', type=argparse.FileType('r'),
>                     default=sys.stdin)
> parser.add_argument('--html', action='store_true')
> parser.add_argument('--markdown', action='store_true')
> args = parser.parse_args()
> 
> class Run:
>     def __init__(self, start_time:int, end_time:int = -1, duration:int = 0, retval:int = 0):
>         self._start_time = start_time
>         self._end_time = end_time
>         self._duration = duration
>         self._retval = retval
>         self._ended = (end_time >= 0)
> 
>     @property
>     def start_time(self) -> int:
>         return self._start_time
> 
>     @property
>     def end_time(self) -> int:
>         return self._end_time
> 
>     @end_time.setter
>     def end_time(self, time:int):
>         self._end_time = time
>         self._ended = True
> 
>     @property
>     def duration(self) -> int:
>         return self._duration
> 
>     @duration.setter
>     def duration(self, time:int):
>         self._duration = time
> 
>     @property
>     def retval(self) -> int:
>         return self._retval
> 
>     @retval.setter
>     def retval(self, val:int):
>         self._retval = val
> 
>     @property
>     def running(self) -> bool:
>         return (not self._ended or abs(self.retval) == ERRCODE_PROBE_DEFER)
> 
>     @property
>     def failed(self) -> bool:
>         return ((self.retval != 0) and not self.running)
> 
> 
> class Entity:
>     def __init__(self, name:str, start_time:int = 0, end_time:int = 0, duration:int = 0, retval:int = 0):
>         self._name = name
>         self._runs = [ Run(start_time, end_time, duration, retval) ]
> 
>     @property
>     def name(self) -> str:
>         return self._name
> 
>     @property
>     def first_start_time(self) -> int:
>         return self._runs[0].start_time
> 
>     @property
>     def last_start_time(self) -> int:
>         return self._runs[-1].start_time
> 
>     @property
>     def first_end_time(self) -> int:
>         return self._runs[0].end_time
> 
>     @property
>     def last_end_time(self) -> int:
>         return self._runs[-1].end_time
> 
>     @property
>     def duration(self) -> int:
>         return sum([ r.duration for r in self._runs ])
> 
>     @property
>     def wasted_time(self) -> int:
>         return sum([ r.duration for r in filter(lambda x: x.failed or abs(x.retval) == ERRCODE_PROBE_DEFER, self._runs) ])
> 
>     @property
>     def retval(self) -> int:
>         return self._runs[-1].retval
> 
>     @property
>     def failed(self) -> bool:
>         return self._runs[-1].failed
> 
>     @property
>     def running(self) -> bool:
>         return self._runs[-1].running
> 
>     @property
>     def deferred_probe_pending(self) -> bool:
>         return (abs(self._runs[-1].retval) == ERRCODE_PROBE_DEFER)
> 
>     @property
>     def num_deferred_probes(self) -> int:
>         return len(list(filter(lambda r: abs(r.retval) == ERRCODE_PROBE_DEFER, self._runs)))
> 
>     def addStart(self, start_time: int):
>         self._runs.append( Run(start_time) )
> 
>     def addEnd(self, end_time:int = 0, duration:int = 0, retval:int = 0):
>         self._runs[-1].end_time = end_time
>         self._runs[-1].duration = duration
>         self._runs[-1].retval = retval
> 
>     def addRun(self, start_time: int, end_time:int = 0, duration:int = 0, retval:int = 0):
>         self._runs.append( Run(start_time, end_time, duration, retval) )
> 
> 
> class Driver (Entity):
>     def __init__(self, name: str, start_time: int):
>         super().__init__(name, start_time)
> 
>     def addInit(self, init_time: int):
>         self.addStart(init_time)
> 
>     def addReturn(self, return_time: int, duration: int, retval: int):
>         self.addEnd(return_time, duration, retval)
> 
> 
> class Probe (Entity):
>     def __init__(self, name: str, return_time: int, duration: int, retval: int):
>         super().__init__(name, return_time, return_time, duration, retval)
> 
> 
> class Init (Entity):
>     def __init__(self, name: str, start_time: int):
>         super().__init__(name, start_time)
> 
> 
> # Regex for 'calling' messages
> # "[    0.466115] calling  pci_sysfs_init+0x0/0xa8 @ 1"
> calling_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] calling  ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+) @ ([0-9]+)')
> 
> # Regex for 'initcall ... returned' messages
> # "[    0.466115] initcall pci_sysfs_init+0x0/0xa8 returned 0 after 5 usecs"
> returned_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] initcall ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+) returned ([\-0-9]+) after ([0-
> 9]+) usecs')
> 
> # Regex for 'probe ... returned' messages
> # "[    0.466115] probe of cpufreq-dt returned 517 after 140 usec"
> probe_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] probe of ([0-9a-zA-Z_\-\.\:@]+) returned ([\-0-9]+) after ([0-9]+) usecs')
> 
> # Regex for
> # "[    1.060329] Run /sbin/init as init process"
> init_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Run ([/0-9a-zA-Z_]+) as init process')
> 

I'm a little worried about the fragility of these regex lines.  If a dmesg line doesn't
match, then it is silently ignored.  It might be worth checking for a simpler match first,
then trying for the full regex match.  See below.

> drivers = list()
> probes = list()
> init = None
> 
> # Extract data from dmesg
> for line in args.dmesg:
It might be nice to keep track of the line number here, for more detail
when reporting parsing errors.

>     match = calling_prog.match(line)

if "] calling" in line:
   match = calling_prog.match(line)

>     if match is not None:

if match:
is better here, IMHO.

>         time = float(match.group(1)) * float(10^6)
>         name = str(match.group(2))
>         try:
>             time = float(match.group(1)) * float(10^6)
>             name = str(match.group(2))
>         except Exception as e:
>             print(f'Failed parsing line "{line.rstrip()}" as call')
>             raise e
>         else:
>             for d in filter(lambda x: x.name == name, drivers):
This is a bit strange.
Why not use a dictionary for drivers here, with name as the key, instead
of a list where you have to do a search?

>                 d.addInit(time)
>                 break
>             else:
>                 drivers.append(Driver(name, time))
If the driver names are unique, then this can be:
  drivers[name] = Driver(name, time)
Otherwise:
  drivers[name] = drivers.get(name, []).append(Drivers(name, time))

which is equivalent to:
  if name in drivers:
      drivers[name].append(Driver(name, time))
 else:
      drivers[name] = [Drivers(name, time)]

>         finally:
>             continue
> 
>     match = returned_prog.match(line)
>     if match is not None:
if match:

>         try:
>             time = float(match.group(1)) * float(10^6)
>             name = str(match.group(2))
>             retval = int(match.group(4))
>             duration = int(match.group(5))
>         except:
>             print(f'Failed parsing line "{line.rstrip()}" as call return')
>         else:
>             for d in filter(lambda x: x.name == name, drivers):
Same thing here - should use a dictionary with name as key.

>                 d.addReturn(time, duration, retval)
>                 break
>             else:
>                 print(f'Detected return for driver {name}, for which an init was never recorded')
>         finally:
>             continue
> 
>     match = probe_prog.match(line)
>     if match is not None:
if match:
>         try:
>             time = float(match.group(1)) * float(10^6)
>             name = str(match.group(2))
>             retval = int(match.group(3))
>             duration = int(match.group(4))
>         except:
>             print(f'Failed parsing line "{line.rstrip()}" as probe return')
>         else:
>             for d in filter(lambda x: x.name == name, probes):
Should use a dictionary here.
>                 d.addRun(time, time, duration, retval)
>                 break
>             else:
>                 probes.append(Probe(name, time, duration, retval))
>         finally:
>             continue
> 
>     if init is None:
if not init:
>         match = init_prog.match(line)
>         if match is not None:
if match:
>             try:
>                 time = float(match.group(1)) * float(10^6)
>                 name = str(match.group(2))
>             except:
>                 print(f'Failed parsing line "{line.rstrip()}" as probe return')
>             else:
>                 init = Init(name, time)
>             finally:
>                 continue
> 
> if args.markdown:
>     print('| Name | Type [^1] | Status [^2] | Duration (us) | Before userspace | Num deferred probes | Time spent in failed inits (us) |')
>     print('| -----| --------- | ----------- | ------------- | ---------------- | ------------------- | ------------------------------- |')
>     for d in sorted(filter(lambda k: k.duration > 0, drivers + probes), key=lambda k: k.duration, reverse=True):
>         etype = 'I' if d in drivers else 'P'
>         run_status = 'D' if d.deferred_probe_pending else 'R' if d.running else f'F({abs(d.retval)})' if d.failed else 'K'
>         before_init = 'Y' if d.last_start_time < init.last_start_time else 'N'
>         print(f'| {d.name} | {etype} | {run_status} | {d.duration} | {before_init} | {d.num_deferred_probes} | {d.wasted_time} |')
>     print('[^1]: I=Init function, P=device Probe')
>     print('[^2]: D=Deferred probe pending, R=still Running, F(N)=Failed (with errcode N), K=init oK')
> elif args.html:
>     print(
> '''
> <!DOCTYPE HTML>
> <html>
> <head>
>   <meta charset="utf-8" />
>   <title>Initcall analysis</title>
>   <meta name="viewport" content="width=device-width, initial-scale=1" />
>   <style>
>     table, th, td {
>       border: 1px solid;
>     }
>   </style>
> </head>
> <body>
> 
I'd like a version of this where it just emits the HTML for the body, and not the full
page.  I could use this in my birdcloud boot-time wiki.

Maybe add an option to select full HTML page vs HTML snippet?

> <table>
>   <tr>
>     <th>Name</th>
>     <th>Type <a href="#note1">[1]</a></th>
>     <th>Status <a href="#note2">[2]</a></th>
>     <th>Duration (us)</th>
>     <th>Before userspace</th>
>     <th>Num deferred probes</th>
>     <th>Time spent in failed inits (us)</th>
>   </tr>''')
> 
>     for d in sorted(filter(lambda k: k.duration > 0, drivers + probes), key=lambda k: k.duration, reverse=True):
>         etype = 'I' if d in drivers else 'P'
>         run_status = 'D' if d.deferred_probe_pending else 'R' if d.running else f'F({abs(d.retval)})' if d.failed else 'K'
>         before_init = 'Y' if d.last_start_time < init.last_start_time else 'N'
>         print(
> f'''
>   <tr>
>     <td>{d.name}</td>
>     <td>{etype}</td>
>     <td>{run_status}</td>
>     <td>{d.duration}</td>
>     <td>{before_init}</td>
>     <td>{d.num_deferred_probes}</td>
>     <td>{d.wasted_time}</td>
>   </tr>''')
> 
>     print(
> '''
> </table>
> <p id="note1">[1] I=Init function, P=device Probe</p>
> <p id="note2">[2] D=Deferred probe pending, R=still Running, F(N)=Failed (with errcode N), K=init oK</p>
> </body>
> </html>
> ''')
> 
> else:
>     print(f'{len(drivers)} drivers has been initialized, of which {len(list(filter(lambda d: d.last_start_time < init.last_start_time, drivers)))}
> before userspace')
>     print(f'{len(probes)} probes happened outside of the init of their driver')
> 
>     print(f'{len(list(filter(lambda d: d.deferred_probe_pending, drivers + probes)))} deferred probes pending')
> 
>     print('---')
> 
>     print('Top 10 init/probes durations:')
>     for d in sorted(filter(lambda k: k.duration > 0, drivers + probes), key=lambda k: k.duration, reverse=True)[0:10]:
>         print(f' * {d.name} -> {d.duration}us')
> 
> 
> 
> 

Just my 2 cents...
 -- Tim

^ permalink raw reply

* Re: [PATCH v2] printk: Improve memory usage logging during boot
From: John Ogness @ 2024-12-04 17:17 UTC (permalink / raw)
  To: Petr Mladek, Geert Uytterhoeven
  Cc: Isaac J. Manjarres, Steven Rostedt, Sergey Senozhatsky, surenb,
	kernel-team, linux-kernel, Linux Embedded
In-Reply-To: <Z07SnoKSwHQ_y2j2@pathway.suse.cz>

On 2024-12-03, Petr Mladek <pmladek@suse.com> wrote:
> On Sun 2024-12-01 12:40:13, Geert Uytterhoeven wrote:
>> Isn't this kernel log message bookkeeping becoming a bit too excessive?
>> E.g. on a small system:
>> 
>>     printk: log buffer data + meta data: 65536 + 204800 = 270336 bytes
>> 
>> Why is the meta data that big (> 3*log buffer)?
>> 
>>     #define PRB_AVGBITS 5    /* 32 character average length */
>> 
>>     unsigned int descs_count = log_buf_len >> PRB_AVGBITS;
>>     meta_data_size = descs_count * (sizeof(struct prb_desc) +
>> sizeof(struct printk_info));
>> 
>>     struct prb_data_blk_lpos {
>>             unsigned long   begin;
>>             unsigned long   next;
>>     };
>> 
>>     struct prb_desc {
>>             atomic_long_t                   state_var;
>>             struct prb_data_blk_lpos        text_blk_lpos;
>>     };
>>
>> i.e. 12 bytes on 32-bit, 24 bytes on 64-bit.
>
> I am afraid that we could not do much about the size of this part.
> All the variables are important parts of the lockless machinery.

The descriptor array is one source of wasted space. It ensures there are
enough descriptors so that the text_ring can be fully maximized for an
average record text length of 32. However, looking at
/sys/kernel/debug/printk/index/vmlinux and running some basic tests, it
seems the average text length is >=45 (usually around 55). That means at
least 30% of the descriptor space is not in use, which is roughtly 5% of
the total memory used by all ringbuffer components.

For example, for CONFIG_LOG_BUF_SHIFT=13, the amount of wasted desc_ring
space is about 1.8KiB. (The total memory usage of the ringbuffer is
36KiB.)

However, if we bump the expected average size to 64, there will not be
enough descriptors, leading to wasted text_ring buffer space. (The
expected size must be a power of 2 due to the ID wrapping algorithm.)

>>     #define PRINTK_INFO_SUBSYSTEM_LEN       16
>>     #define PRINTK_INFO_DEVICE_LEN          48
>> 
>>     struct dev_printk_info {
>>             char subsystem[PRINTK_INFO_SUBSYSTEM_LEN];
>>             char device[PRINTK_INFO_DEVICE_LEN];
>>     };
>
> This is probably the lowest hanging fruit.

Yes, the info array is also a source of wasted space. This was known at
the time when we discussed [0] introducing this array. Many records do
not even use the dev_printk_info fields. (A running system seems to use
them more often than a booting system.) However, the 64 bytes is
currently quite large. During some testing I typically saw 70% of the
dev_printk_info text space for valid descriptors unused.

I typically saw dev_printk_info averages of:

- 5 bytes for SUBSYSTEM
- 12 bytes for DEVICE

However, this problem is compounded by the descriptor array issue I
mentioned at the beginning. The reason is that there is a 1:1
relationship between descriptors and dev_printk_info structs. So if 30%
of the descriptors are invalid, then that adds an additional waste of
30% totally unused dev_printk_info structs.

For example, for CONFIG_LOG_BUF_SHIFT=13, that represents a total of
13KiB wasted space in the info array (36% of the total memory usage).

> It should be possible to write these dev_printk-specific metadata into
> the data buffer a more efficient way and only for records created by
> dev_printk().
>
> It would require adding "dict_len" into "struct printk_info"
> and reserving space for both "text_len" and "dict_len".
>
> We bundled it into the metadata because these are metadata.
> We wanted to keep the design as clean as possible. We focused
> mainly on the stability and maintainability of the code.
> And it was really challenging to get it all working.

I think keeping it bundled in the meta data is correct. But if those
text arrays could be allocated from a text ring, then the space could be
used efficiently.

I am not recommending that we add the dict_ring back. It was
considerably more complex. But perhaps we could use the text_ring for
these allocations as well. It may even have the benefit that the
"average text size" becomes >=64, which would also help with the first
wasted item I mentioned.

>>     struct printk_info {
>>             u64     seq;            /* sequence number */
>
> I do not recal the details. But I think that we need to
> explicitely store the 64-bit "seq" number in the metadata
> because of the lockless algoritm. It helps to solve
> problems with wrapping of the counter in
> "atomic_long_t state_var".

Yes. I could not figure out a way to safely update a @log_first_seq to
represent the first sequence available in the ringbuffer. (The
complexity involves both writers and readers seeing appropriate sequence
values.) If that could be done somehow, that would save another 2KiB
(for CONFIG_LOG_BUF_SHIFT=13).

In summary...

I think the only quick fix we could do immediately is reduce
PRINTK_INFO_DEVICE_LEN. On my various test machines, I never encountered
anything above 25. Perhaps reducing it from 48 to 32? That would
immediately reclaim 11% (4KiB for CONFIG_LOG_BUF_SHIFT=13) and it would
not require any changes to the various crash/dump tools.

In the longer term we may want to consider using the text ring for
additional device/subsystem string allocation. This would only require
changes to crash/dump tools that provide the device/subsystem
information.

In addition, if low-memory single CPU systems are am important target,
we might be better off implementing a
CONFIG_PRINTK_RINGBUFFER_TINY. Implementing a lockless ringbuffer for a
single CPU is trivial (in comparison) and would not require all the
management abstractions. If it used the same printk_ringbuffer API it
could be a simple drop-in replacement.

John Ogness

[0] https://lore.kernel.org/lkml/20200904124530.GB20558@alley

^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Stephan Mueller @ 2024-12-05  7:02 UTC (permalink / raw)
  To: Francesco Valla, Linux Embedded, Bird, Tim
In-Reply-To: <MW5PR13MB5632D5A1AE2D16F53FA2391FFD362@MW5PR13MB5632.namprd13.prod.outlook.com>

Am Dienstag, 3. Dezember 2024, 22:24:28 Mitteleuropäische Normalzeit schrieb 
Bird, Tim:

Hi Tim,

> > -----Original Message-----
> > From: Francesco Valla <francesco@valla.it>
> > Dear fellow boot time optimizers,
> > 
> > following the first boot time SIG meeting, which I lurked with much
> > pleasure
 (but didn't participate to, as I was a bit in awe among such
> > authorities), I'd like to introduce myself with code rather than a
> > presentation or resume. 
> > Here is a python script which can analyze a dmesg output with
> > initcall_debug
 option enabled and extract some useful information. It
> > can for example be used to analyze the output of the grab-boot-data.sh
> > tool that Tim presented on this list [1] just a few days ago.
> > 
> > Usage is very simple, as the output of dmesg can be piped directly to it:
> > 
> > 
> >     dmesg | analyze-initcall-debug.py
> > 
> > 
> > If no option is specified, it outputs a brief summary, like the following
> > one
 (obtained on my Beagleplay):
> > 
> > 1758 drivers has been initialized, of which 1758 before userspace
> > 119 probes happened outside of the init of their driver
> > 0 deferred probes pending
> > ---
> > Top 10 init/probes durations:
> > 
> >  * 30200000.dss -> 523002us
> >  * deferred_probe_initcall -> 487483us
> >  * fd00000.gpu -> 162859us
> >  * 8000f00.mdio -> 142521us
> >  * 44043000.system-controller -> 71390us
> >  * 2-004c -> 71178us
> >  * 40900000.crypto -> 59350us
> >  * 8000000.ethernet -> 58419us
> >  * 44043000.system-controller:clock-controller -> 56599us
> >  * jent_mod_init -> 52140us
> 
> 
> This is odd.  On my beagleplay board, jent_mod_init only took 32326us.
> (see
> https://birdcloud.org/boot-time-files/boot-data-timslab-bp1-241203-134136.t
> xt)
 
> I'm assuming that we have similar hardware, but possibly different configs,
> kernel cmdlines or kernel versions.  I'll take a look at this and see if I
> can figure out what the difference is between our systems, that causes the
> difference in the duration for this function.
> 
> This initcall (jent_mod_init) is on my list of initcalls to research to see
> if they
 can be improved or deferred.  I haven't seen any degradation in
> system behavior by deferring this initcall on my system, but that could be
> from lack of my system doing some related operation that needs the RNG
> data, or my own ignorance of the effects of this call.
> 
> Can someone comment on what items or operations might depend on
> jent_mod_init
 in early boot?  In particular, would we expect any
> cryptographic problems if the initcall was deferred to a few seconds after
> booting?
> 
> It can be configured as a module, which makes me think that maybe loading
> it sometime in very late boot (or even later) is OK.
> 
> jent_mod_init is defined in crypto/jitterentropy-kcapi.c, and controlled by
> 
 CONFIG_CRYPTO_JITTERENTROPY (See crypto/Kconfig)
> 
> It appears to do some random number generator seeding by measuring
> the jitter of a compressiong operation in the kernel. I assume some of the
> cryptography
 configs affect the duration of the operations.  The initcall
> duration results on my beagleplay seem to be relatively consistent (in the
> 32ms range consistently), and from bobs_lab, on machine sam1, the duration
> range is also consistent (at between 4800 and 5200 usecs).
> 
> I'd be interested to see if the range is consistent between runs on other
> machines
 
> Francesco - Can you submit another boot-data file (or 2 or 3) for
> francsecoslab-beagleplay, to
 provide some data on this?
> 
> Also, can anyone else who sees this initcall in their boot sequence run
> grab-boot-data.sh
 (from https://birdcloud.org/boot-time/Boot-time_Tools)
> and submit the data for one or more of their machines?
> 
> Stephan Mueller - what are you seeing for the cost of this operation on your
> machines?

I have never done the measurements on my system, but with patch 
95a798d20060d2b648dd604321e347c85edfd783 the so-called oversampling rate was 
changed which reduced the performance of the Jitter RNG (read: it takes a bit 
more to gather all data).

 
> This is one of the first initcalls I'm going to dive into to see if it can
> be improved or deferred.
 Maybe some of the RNG seeding can be held over
> from a previous boot, in order to eliminate the overhead on the early
> portion of a current boot.  Any feedback on that idea?
> 
> Thanks.
>    -- Tim
> 


Ciao
Stephan



^ permalink raw reply

* Re: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2024-12-05 13:05 UTC (permalink / raw)
  To: Linux Embedded, Bird, Tim
In-Reply-To: <MW5PR13MB563206FC92B1CACCD85DA33BFD372@MW5PR13MB5632.namprd13.prod.outlook.com>

Hi Tim,

On Wednesday, 4 December 2024 at 01:29:18 Bird, Tim <Tim.Bird@sony.com> wrote:
> 
> > -----Original Message-----
> > From: Francesco Valla <francesco@valla.it>
> > Dear fellow boot time optimizers,
> > 
> > following the first boot time SIG meeting, which I lurked with much pleasure
> > (but didn't participate to, as I was a bit in awe among such authorities), I'd
> > like to introduce myself with code rather than a presentation or resume.
> > 
> > Here is a python script which can analyze a dmesg output with initcall_debug
> > option enabled and extract some useful information. It can for example be used
> > to analyze the output of the grab-boot-data.sh tool that Tim presented on this
> > list [1] just a few days ago.
> > 
> > Usage is very simple, as the output of dmesg can be piped directly to it:
> > 
> >     dmesg | analyze-initcall-debug.py
> > 
> > If no option is specified, it outputs a brief summary, like the following one
> > (obtained on my Beagleplay):
> > 
> > 1758 drivers has been initialized, of which 1758 before userspace
> > 119 probes happened outside of the init of their driver
> > 0 deferred probes pending
> > ---
> > Top 10 init/probes durations:
> >  * 30200000.dss -> 523002us
> >  * deferred_probe_initcall -> 487483us
> >  * fd00000.gpu -> 162859us
> >  * 8000f00.mdio -> 142521us
> >  * 44043000.system-controller -> 71390us
> >  * 2-004c -> 71178us
> >  * 40900000.crypto -> 59350us
> >  * 8000000.ethernet -> 58419us
> >  * 44043000.system-controller:clock-controller -> 56599us
> >  * jent_mod_init -> 52140us
> > 
> > A more complete output, reporting all durations as well as statuses, impact of
> > probe deferral and so on, can be produced either in HTML or Markdown format,
> > using respectively the --html or --markdown parameters.
> > 
> > Note: it hasn't (yet?) been used in a "production" environment, as I never
> > found the motivation to write a script and always analyzed manually the
> > initcall_debug output.
> > 
> > If you have time to take a look, please tell me what you think and what
> > can be added.
> > 
> > Thank you!
> > 
> > Regards,
> > Francesco
> > 
> > [1] https://lore.kernel.org/linux-
> > embedded/MW5PR13MB5632F54DCBDA0C74370E531FFD5C2@MW5PR13MB5632.namprd13.prod.outlook.com/
> > 
> > --- analyze-initcall-debug.py script follows ---
> > 
> > #!/usr/bin/env python3
> > # -*- coding: utf-8 -*-
> > 
> > """
> > This script can be used to analyze a Linux kernel dmesg output when the
> > initcall_debug command line output is used. It extract various information,
> > such as init durations, impact of probe deferral and so on.
> > """
> > 
> > import argparse
> > import re
> > import sys
> > 
> > ERRCODE_PROBE_DEFER = 517
> > 
> > parser = argparse.ArgumentParser()
> > 
> > parser.add_argument('--dmesg', nargs='?', type=argparse.FileType('r'),
> >                     default=sys.stdin)
> > parser.add_argument('--html', action='store_true')
> > parser.add_argument('--markdown', action='store_true')
> > args = parser.parse_args()
> > 
> > class Run:
> >     def __init__(self, start_time:int, end_time:int = -1, duration:int = 0, retval:int = 0):
> >         self._start_time = start_time
> >         self._end_time = end_time
> >         self._duration = duration
> >         self._retval = retval
> >         self._ended = (end_time >= 0)
> > 
> >     @property
> >     def start_time(self) -> int:
> >         return self._start_time
> > 
> >     @property
> >     def end_time(self) -> int:
> >         return self._end_time
> > 
> >     @end_time.setter
> >     def end_time(self, time:int):
> >         self._end_time = time
> >         self._ended = True
> > 
> >     @property
> >     def duration(self) -> int:
> >         return self._duration
> > 
> >     @duration.setter
> >     def duration(self, time:int):
> >         self._duration = time
> > 
> >     @property
> >     def retval(self) -> int:
> >         return self._retval
> > 
> >     @retval.setter
> >     def retval(self, val:int):
> >         self._retval = val
> > 
> >     @property
> >     def running(self) -> bool:
> >         return (not self._ended or abs(self.retval) == ERRCODE_PROBE_DEFER)
> > 
> >     @property
> >     def failed(self) -> bool:
> >         return ((self.retval != 0) and not self.running)
> > 
> > 
> > class Entity:
> >     def __init__(self, name:str, start_time:int = 0, end_time:int = 0, duration:int = 0, retval:int = 0):
> >         self._name = name
> >         self._runs = [ Run(start_time, end_time, duration, retval) ]
> > 
> >     @property
> >     def name(self) -> str:
> >         return self._name
> > 
> >     @property
> >     def first_start_time(self) -> int:
> >         return self._runs[0].start_time
> > 
> >     @property
> >     def last_start_time(self) -> int:
> >         return self._runs[-1].start_time
> > 
> >     @property
> >     def first_end_time(self) -> int:
> >         return self._runs[0].end_time
> > 
> >     @property
> >     def last_end_time(self) -> int:
> >         return self._runs[-1].end_time
> > 
> >     @property
> >     def duration(self) -> int:
> >         return sum([ r.duration for r in self._runs ])
> > 
> >     @property
> >     def wasted_time(self) -> int:
> >         return sum([ r.duration for r in filter(lambda x: x.failed or abs(x.retval) == ERRCODE_PROBE_DEFER, self._runs) ])
> > 
> >     @property
> >     def retval(self) -> int:
> >         return self._runs[-1].retval
> > 
> >     @property
> >     def failed(self) -> bool:
> >         return self._runs[-1].failed
> > 
> >     @property
> >     def running(self) -> bool:
> >         return self._runs[-1].running
> > 
> >     @property
> >     def deferred_probe_pending(self) -> bool:
> >         return (abs(self._runs[-1].retval) == ERRCODE_PROBE_DEFER)
> > 
> >     @property
> >     def num_deferred_probes(self) -> int:
> >         return len(list(filter(lambda r: abs(r.retval) == ERRCODE_PROBE_DEFER, self._runs)))
> > 
> >     def addStart(self, start_time: int):
> >         self._runs.append( Run(start_time) )
> > 
> >     def addEnd(self, end_time:int = 0, duration:int = 0, retval:int = 0):
> >         self._runs[-1].end_time = end_time
> >         self._runs[-1].duration = duration
> >         self._runs[-1].retval = retval
> > 
> >     def addRun(self, start_time: int, end_time:int = 0, duration:int = 0, retval:int = 0):
> >         self._runs.append( Run(start_time, end_time, duration, retval) )
> > 
> > 
> > class Driver (Entity):
> >     def __init__(self, name: str, start_time: int):
> >         super().__init__(name, start_time)
> > 
> >     def addInit(self, init_time: int):
> >         self.addStart(init_time)
> > 
> >     def addReturn(self, return_time: int, duration: int, retval: int):
> >         self.addEnd(return_time, duration, retval)
> > 
> > 
> > class Probe (Entity):
> >     def __init__(self, name: str, return_time: int, duration: int, retval: int):
> >         super().__init__(name, return_time, return_time, duration, retval)
> > 
> > 
> > class Init (Entity):
> >     def __init__(self, name: str, start_time: int):
> >         super().__init__(name, start_time)
> > 
> > 
> > # Regex for 'calling' messages
> > # "[    0.466115] calling  pci_sysfs_init+0x0/0xa8 @ 1"
> > calling_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] calling  ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+) @ ([0-9]+)')
> > 
> > # Regex for 'initcall ... returned' messages
> > # "[    0.466115] initcall pci_sysfs_init+0x0/0xa8 returned 0 after 5 usecs"
> > returned_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] initcall ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+) returned ([\-0-9]+) after ([0-
> > 9]+) usecs')
> > 
> > # Regex for 'probe ... returned' messages
> > # "[    0.466115] probe of cpufreq-dt returned 517 after 140 usec"
> > probe_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] probe of ([0-9a-zA-Z_\-\.\:@]+) returned ([\-0-9]+) after ([0-9]+) usecs')
> > 
> > # Regex for
> > # "[    1.060329] Run /sbin/init as init process"
> > init_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Run ([/0-9a-zA-Z_]+) as init process')
> > 
> 
> I'm a little worried about the fragility of these regex lines.  If a dmesg line doesn't
> match, then it is silently ignored.  It might be worth checking for a simpler match first,
> then trying for the full regex match.  See below.
> 

You are right - in fact this version of the tool is not matching probes for drivers
built as modules. I'll post an updated version once I solved all the other concerns.

> > drivers = list()
> > probes = list()
> > init = None
> > 
> > # Extract data from dmesg
> > for line in args.dmesg:
> It might be nice to keep track of the line number here, for more detail
> when reporting parsing errors.
> 

Agreed.

> >     match = calling_prog.match(line)
> 
> if "] calling" in line:
>    match = calling_prog.match(line)
> 

Yes, that would be better.

> >     if match is not None:
> 
> if match:
> is better here, IMHO.
> 

Noted (for the other occurrences below, too).

> >         time = float(match.group(1)) * float(10^6)
> >         name = str(match.group(2))
> >         try:
> >             time = float(match.group(1)) * float(10^6)
> >             name = str(match.group(2))
> >         except Exception as e:
> >             print(f'Failed parsing line "{line.rstrip()}" as call')
> >             raise e
> >         else:
> >             for d in filter(lambda x: x.name == name, drivers):
> This is a bit strange.
> Why not use a dictionary for drivers here, with name as the key, instead
> of a list where you have to do a search?
> 

Force of habit, probably - I code 99% of the time in C. A dictionary would be more efficient.

> >                 d.addInit(time)
> >                 break
> >             else:
> >                 drivers.append(Driver(name, time))
> If the driver names are unique, then this can be:
>   drivers[name] = Driver(name, time)
> Otherwise:
>   drivers[name] = drivers.get(name, []).append(Drivers(name, time))
> 
> which is equivalent to:
>   if name in drivers:
>       drivers[name].append(Driver(name, time))
>  else:
>       drivers[name] = [Drivers(name, time)]
> 

Noted. I need to think about the .get().append() syntax, is not so intuitive to me.

> >         finally:
> >             continue
> > 
> >     match = returned_prog.match(line)
> >     if match is not None:
> if match:
> 
> >         try:
> >             time = float(match.group(1)) * float(10^6)
> >             name = str(match.group(2))
> >             retval = int(match.group(4))
> >             duration = int(match.group(5))
> >         except:
> >             print(f'Failed parsing line "{line.rstrip()}" as call return')
> >         else:
> >             for d in filter(lambda x: x.name == name, drivers):
> Same thing here - should use a dictionary with name as key.
> 

Yes -  see above.

> >                 d.addReturn(time, duration, retval)
> >                 break
> >             else:
> >                 print(f'Detected return for driver {name}, for which an init was never recorded')
> >         finally:
> >             continue
> > 
> >     match = probe_prog.match(line)
> >     if match is not None:
> if match:
> >         try:
> >             time = float(match.group(1)) * float(10^6)
> >             name = str(match.group(2))
> >             retval = int(match.group(3))
> >             duration = int(match.group(4))
> >         except:
> >             print(f'Failed parsing line "{line.rstrip()}" as probe return')
> >         else:
> >             for d in filter(lambda x: x.name == name, probes):
> Should use a dictionary here.
> >                 d.addRun(time, time, duration, retval)
> >                 break
> >             else:
> >                 probes.append(Probe(name, time, duration, retval))
> >         finally:
> >             continue
> > 
> >     if init is None:
> if not init:
> >         match = init_prog.match(line)
> >         if match is not None:
> if match:
> >             try:
> >                 time = float(match.group(1)) * float(10^6)
> >                 name = str(match.group(2))
> >             except:
> >                 print(f'Failed parsing line "{line.rstrip()}" as probe return')
> >             else:
> >                 init = Init(name, time)
> >             finally:
> >                 continue
> > 
> > if args.markdown:
> >     print('| Name | Type [^1] | Status [^2] | Duration (us) | Before userspace | Num deferred probes | Time spent in failed inits (us) |')
> >     print('| -----| --------- | ----------- | ------------- | ---------------- | ------------------- | ------------------------------- |')
> >     for d in sorted(filter(lambda k: k.duration > 0, drivers + probes), key=lambda k: k.duration, reverse=True):
> >         etype = 'I' if d in drivers else 'P'
> >         run_status = 'D' if d.deferred_probe_pending else 'R' if d.running else f'F({abs(d.retval)})' if d.failed else 'K'
> >         before_init = 'Y' if d.last_start_time < init.last_start_time else 'N'
> >         print(f'| {d.name} | {etype} | {run_status} | {d.duration} | {before_init} | {d.num_deferred_probes} | {d.wasted_time} |')
> >     print('[^1]: I=Init function, P=device Probe')
> >     print('[^2]: D=Deferred probe pending, R=still Running, F(N)=Failed (with errcode N), K=init oK')
> > elif args.html:
> >     print(
> > '''
> > <!DOCTYPE HTML>
> > <html>
> > <head>
> >   <meta charset="utf-8" />
> >   <title>Initcall analysis</title>
> >   <meta name="viewport" content="width=device-width, initial-scale=1" />
> >   <style>
> >     table, th, td {
> >       border: 1px solid;
> >     }
> >   </style>
> > </head>
> > <body>
> > 
> I'd like a version of this where it just emits the HTML for the body, and not the full
> page.  I could use this in my birdcloud boot-time wiki.
> 
> Maybe add an option to select full HTML page vs HTML snippet?
> 

I'll add an option to do just that.

> > <table>
> >   <tr>
> >     <th>Name</th>
> >     <th>Type <a href="#note1">[1]</a></th>
> >     <th>Status <a href="#note2">[2]</a></th>
> >     <th>Duration (us)</th>
> >     <th>Before userspace</th>
> >     <th>Num deferred probes</th>
> >     <th>Time spent in failed inits (us)</th>
> >   </tr>''')
> > 
> >     for d in sorted(filter(lambda k: k.duration > 0, drivers + probes), key=lambda k: k.duration, reverse=True):
> >         etype = 'I' if d in drivers else 'P'
> >         run_status = 'D' if d.deferred_probe_pending else 'R' if d.running else f'F({abs(d.retval)})' if d.failed else 'K'
> >         before_init = 'Y' if d.last_start_time < init.last_start_time else 'N'
> >         print(
> > f'''
> >   <tr>
> >     <td>{d.name}</td>
> >     <td>{etype}</td>
> >     <td>{run_status}</td>
> >     <td>{d.duration}</td>
> >     <td>{before_init}</td>
> >     <td>{d.num_deferred_probes}</td>
> >     <td>{d.wasted_time}</td>
> >   </tr>''')
> > 
> >     print(
> > '''
> > </table>
> > <p id="note1">[1] I=Init function, P=device Probe</p>
> > <p id="note2">[2] D=Deferred probe pending, R=still Running, F(N)=Failed (with errcode N), K=init oK</p>
> > </body>
> > </html>
> > ''')
> > 
> > else:
> >     print(f'{len(drivers)} drivers has been initialized, of which {len(list(filter(lambda d: d.last_start_time < init.last_start_time, drivers)))}
> > before userspace')
> >     print(f'{len(probes)} probes happened outside of the init of their driver')
> > 
> >     print(f'{len(list(filter(lambda d: d.deferred_probe_pending, drivers + probes)))} deferred probes pending')
> > 
> >     print('---')
> > 
> >     print('Top 10 init/probes durations:')
> >     for d in sorted(filter(lambda k: k.duration > 0, drivers + probes), key=lambda k: k.duration, reverse=True)[0:10]:
> >         print(f' * {d.name} -> {d.duration}us')
> > 
> > 
> > 
> > 
> 
> Just my 2 cents...
>  -- Tim
> 

Thank you!

Regards,

Francesco





^ permalink raw reply

* Re: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2024-12-05 12:57 UTC (permalink / raw)
  To: Linux Embedded, Bird, Tim
In-Reply-To: <MW5PR13MB563207AD3DB4415E5A1678A6FD362@MW5PR13MB5632.namprd13.prod.outlook.com>

Hi Tim,

thank you for taking the time to review the tool. My reply too will be splitted in two parts.

On Tuesday, 3 December 2024 at 21:33:06 Bird, Tim <Tim.Bird@sony.com> wrote:
> Hey Francesco,
> 
> I appreciate the new tool, and have some feedback below.
> 
> I'll separate my response and ideas into 2 categories:
>  - issues raised from your sample report
>  - feedback on the tool itself
> 
> and send them as separate emails.
> 
> > -----Original Message-----
> > From: Francesco Valla <francesco@valla.it>
> > Dear fellow boot time optimizers,
> > 
> > following the first boot time SIG meeting, which I lurked with much pleasure
> > (but didn't participate to, as I was a bit in awe among such authorities), I'd
> > like to introduce myself with code rather than a presentation or resume.
> > 
> > Here is a python script which can analyze a dmesg output with initcall_debug
> > option enabled and extract some useful information. It can for example be used
> > to analyze the output of the grab-boot-data.sh tool that Tim presented on this
> > list [1] just a few days ago.
> > 
> > Usage is very simple, as the output of dmesg can be piped directly to it:
> > 
> >     dmesg | analyze-initcall-debug.py
> > 
> > If no option is specified, it outputs a brief summary, like the following one
> > (obtained on my Beagleplay):
> > 
> > 1758 drivers has been initialized, of which 1758 before userspace
> > 119 probes happened outside of the init of their driver
> > 0 deferred probes pending
> > ---
> > Top 10 init/probes durations:
> >  * 30200000.dss -> 523002us
> 
> This call, and a lot of the others are missing function names.  Did you compile the kernel with
> CONFIG_KALLSYMS=y?
> 
> If that's the case, is there a way to use the System.map file for the kernel (used on
> the machine where the dmesg was obtained from) to map these addresses
> to their respective  function names?
> 

These are not in fact addresses, but rather device names. In my understanding, they are printed
when a probe happens outside of the initialization function for their driver. I still don't have an idea
on how to match probes with their original initcall, in order to present the user the complete picture.

> >  * deferred_probe_initcall -> 487483us
> 
> I feel like this function is interfering with the ability to correctly assign
> boot time delays to the correct initcall.
> 
> I would really like to see someone tackle this issue, by studying the deferred_probe_initcall
> function, to see if there's a way to instrument it so that the cost of each
> probe can be reported and then associated with corresponding initcall, when initcall_debug
> is used.
> 
> Does anyone want to look into this issue?

I can take a look, but probably toward the end of the year. Another similar area I'd like to investigate
is related to initcall time vs actual probe time: for drivers instantiated multiple time, I'd like to know
how much time each probe took rather than the overall time.


Regards,
Francesco




^ permalink raw reply

* Re: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Geert Uytterhoeven @ 2024-12-05 14:58 UTC (permalink / raw)
  To: Francesco Valla; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <2281836.vFx2qVVIhK@fedora>

Hi Francesco,

On Thu, Dec 5, 2024 at 3:47 PM Francesco Valla <francesco@valla.it> wrote:
> On Tuesday, 3 December 2024 at 21:33:06 Bird, Tim <Tim.Bird@sony.com> wrote:
> > > From: Francesco Valla <francesco@valla.it>
> > > Top 10 init/probes durations:
> > >  * 30200000.dss -> 523002us
> >
> > This call, and a lot of the others are missing function names.  Did you compile the kernel with
> > CONFIG_KALLSYMS=y?
> >
> > If that's the case, is there a way to use the System.map file for the kernel (used on
> > the machine where the dmesg was obtained from) to map these addresses
> > to their respective  function names?
>
> These are not in fact addresses, but rather device names. In my understanding, they are printed
> when a probe happens outside of the initialization function for their driver. I still don't have an idea
> on how to match probes with their original initcall, in order to present the user the complete picture.

30200000.dss corresponds to dss@30200000 in the DTS.

$ git grep -W dss@30200000 | grep compatible
arch/arm64/boot/dts/ti/k3-am62-main.dtsi- compatible = "ti,am625-dss";
arch/arm64/boot/dts/ti/k3-am62a-main.dtsi- compatible = "ti,am62a7-dss";

$ git grep -Ww ti,am625-dss -- drivers/ | grep of_device_id
drivers/gpu/drm/tidss/tidss_drv.c=static const struct of_device_id
tidss_of_table[] = {

$ git grep -Ww tidss_of_table | grep _driver
drivers/gpu/drm/tidss/tidss_drv.c=static struct platform_driver
tidss_platform_driver = {

$ git grep -Ww tidss_platform_driver | grep module
drivers/gpu/drm/tidss/tidss_drv.c:drm_module_platform_driver(tidss_platform_driver);

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2024-12-05 21:36 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <CAMuHMdUsLwvnZJ9i531coBkZpDzD_GuhODpHfqayrPAXT6PfOQ@mail.gmail.com>

Hi Geert,

On Thursday, 5 December 2024 at 15:58:09 Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> Hi Francesco,
> 
> On Thu, Dec 5, 2024 at 3:47 PM Francesco Valla <francesco@valla.it> wrote:
> > On Tuesday, 3 December 2024 at 21:33:06 Bird, Tim <Tim.Bird@sony.com> wrote:
> > > > From: Francesco Valla <francesco@valla.it>
> > > > Top 10 init/probes durations:
> > > >  * 30200000.dss -> 523002us
> > >
> > > This call, and a lot of the others are missing function names.  Did you compile the kernel with
> > > CONFIG_KALLSYMS=y?
> > >
> > > If that's the case, is there a way to use the System.map file for the kernel (used on
> > > the machine where the dmesg was obtained from) to map these addresses
> > > to their respective  function names?
> >
> > These are not in fact addresses, but rather device names. In my understanding, they are printed
> > when a probe happens outside of the initialization function for their driver. I still don't have an idea
> > on how to match probes with their original initcall, in order to present the user the complete picture.
> 
> 30200000.dss corresponds to dss@30200000 in the DTS.
> 

This is a simple example, but what about e.g. an I2C device (say 2-004c)? Some heuristic
would be needed to search for the correct I2C bus and then the 0x4c device; once found,
the compatible would then need to be searched through the entire kernel sources / git repo
to match it to the correct driver.

At that point, the tool would probably be too complex to be maintainable, while the added
value very little IMO (in my experience, boot time optimization is done by experienced
developers, which can match a probe with its driver quite easily, even if manually).


Thank you!


Regards,

Francesco



^ permalink raw reply

* [RFC PATCH] boot-time: instrument probes more
From: Bird, Tim @ 2024-12-09  3:10 UTC (permalink / raw)
  To: Francesco Valla, Linux Embedded
In-Reply-To: <2281836.vFx2qVVIhK@fedora>

> -----Original Message-----
> From: Francesco Valla <francesco@valla.it>
...
> On Tuesday, 3 December 2024 at 21:33:06 Bird, Tim <Tim.Bird@sony.com> wrote:
...
> > > -----Original Message-----
> > > From: Francesco Valla <francesco@valla.it>
...
> > >  * deferred_probe_initcall -> 487483us
> >
> > I feel like this function is interfering with the ability to correctly assign
> > boot time delays to the correct initcall.
> >
> > I would really like to see someone tackle this issue, by studying the deferred_probe_initcall
> > function, to see if there's a way to instrument it so that the cost of each
> > probe can be reported and then associated with corresponding initcall, when initcall_debug
> > is used.
> >
> > Does anyone want to look into this issue?
> 
> I can take a look, but probably toward the end of the year. Another similar area I'd like to investigate
> is related to initcall time vs actual probe time: for drivers instantiated multiple time, I'd like to know
> how much time each probe took rather than the overall time.

OK - here's an experimental patch to instrument deferred probing more.  It builds on
the already existing instrumentation for initcall_debug, but prints a bit more detail
about the probe function that is in effect during certain delays during boot.

Here is some output, from a kernel with this patch, filtered by grepping for "probe":
(this was on a beagleplay running a kernel with initcall_debug turned on)

$ ttc bp1 run "dmesg | grep probe"
[    0.003825] calling  init_kprobes+0x0/0x190 @ 1
[    0.004085] initcall init_kprobes+0x0/0x190 returned 0 after 4000 usecs
[    0.044089] calling  trace_events_eprobe_init_early+0x0/0x60 @ 1
[    0.044101] initcall trace_events_eprobe_init_early+0x0/0x60 returned 0 after 0 usecs
[    0.044112] calling  init_kprobe_trace_early+0x0/0x48 @ 1
[    0.044125] initcall init_kprobe_trace_early+0x0/0x48 returned 0 after 0 usecs
[    0.046498] platform probe dummy_regulator_probe is being called
[    0.046674] probe platform_probe in driver reg-dummy for device reg-dummy returned 0 after 0 usecs
[    0.049728] calling  release_early_probes+0x0/0x70 @ 1
[    0.049739] initcall release_early_probes+0x0/0x70 returned 0 after 0 usecs
[    0.057708] probe platform_probe in driver davinci_gpio for device 4201000.gpio returned -517 after 0 usecs
[    0.058357] platform probe sram_probe is being called
[    0.058463] probe platform_probe in driver sram for device 70000000.sram returned 0 after 0 usecs
[    0.061883] probe platform_probe in driver davinci_gpio for device 600000.gpio returned -517 after 0 usecs
[    0.062054] probe platform_probe in driver davinci_gpio for device 601000.gpio returned -517 after 0 usecs
[    0.070308] platform probe k3_chipinfo_probe is being called
[    0.070580] probe platform_probe in driver k3-chipinfo for device 43000014.chipid returned 0 after 0 usecs
[    0.071233] platform probe reg_fixed_voltage_probe is being called
[    0.071401] probe platform_probe in driver reg-fixed-voltage for device regulator-2 returned -517 after 0 usecs
[    0.071437] probe platform_probe in driver reg-fixed-voltage for device regulator-3 returned -517 after 0 usecs
[    0.071461] probe platform_probe in driver reg-fixed-voltage for device regulator-4 returned -517 after 0 usecs
[    0.071487] probe platform_probe in driver gpio-regulator for device regulator-5 returned -517 after 0 usecs
[    0.071663] probe platform_probe in driver reg-fixed-voltage for device regulator-1 returned 0 after 0 usecs
[    0.073156] probe platform_probe in driver omap_i2c for device 4900000.i2c returned -517 after 0 usecs
[    0.073228] probe platform_probe in driver omap_i2c for device 20000000.i2c returned -517 after 0 usecs
[    0.073250] probe platform_probe in driver omap_i2c for device 20010000.i2c returned -517 after 0 usecs
[    0.073272] probe platform_probe in driver omap_i2c for device 20020000.i2c returned -517 after 0 usecs
[    0.073293] probe platform_probe in driver omap_i2c for device 20030000.i2c returned -517 after 0 usecs
[    0.074080] platform probe psci_cpuidle_domain_probe is being called
[    0.074129] probe platform_probe in driver psci-cpuidle-domain for device firmware:psci returned 0 after 0 usecs
[    0.077465] calling  init_kprobe_trace+0x0/0x220 @ 1
[    0.842940] initcall init_kprobe_trace+0x0/0x220 returned 0 after 765459 usecs
[    0.843041] calling  init_uprobe_trace+0x0/0x98 @ 1
[    0.843068] initcall init_uprobe_trace+0x0/0x98 returned 0 after 17 usecs
[    0.855784] calling  arch_init_uprobes+0x0/0x48 @ 1
[    0.855796] initcall arch_init_uprobes+0x0/0x48 returned 0 after 1 usecs
[    0.893082] platform probe ti_sci_intr_irq_domain_probe is being called
[    0.893128] probe platform_probe in driver ti-sci-intr for device 4210000.interrupt-controller returned 517 after 111 usecs
[    0.893318] platform probe ti_sci_intr_irq_domain_probe is being called
[    0.893339] probe platform_probe in driver ti-sci-intr for device bus@f0000:interrupt-controller@a00000 returned 517 after 54 usecs
[    0.893557] platform probe ti_sci_inta_irq_domain_probe is being called
[    0.893575] probe platform_probe in driver ti-sci-inta for device 48000000.interrupt-controller returned 517 after 44 usecs
[    0.893712] platform probe simple_pm_bus_probe is being called
[    0.894140] probe platform_probe in driver simple-pm-bus for device bus@f0000 returned 0 after 449 usecs
[    0.894178] platform probe simple_pm_bus_probe is being called
[    0.894234] probe platform_probe in driver simple-pm-bus for device bus@f0000:bus@4000000 returned 0 after 78 usecs
[    0.894279] platform probe simple_pm_bus_probe is being called
[    0.894317] probe platform_probe in driver simple-pm-bus for device bus@f0000:bus@b00000 returned 0 after 60 usecs
[    0.894351] platform probe simple_pm_bus_probe is being called
[    0.894363] probe platform_probe in driver simple-pm-bus for device 43000000.syscon returned 19 after 32 usecs
[    0.894403] platform probe simple_pm_bus_probe is being called
[    0.894414] probe platform_probe in driver simple-pm-bus for device 100000.syscon returned 19 after 30 usecs
[    0.894453] platform probe simple_pm_bus_probe is being called
[    0.894487] probe platform_probe in driver simple-pm-bus for device bus@f0000:bus@48000000 returned 0 after 54 usecs
[    0.894574] platform probe simple_pm_bus_probe is being called
[    0.894601] probe platform_probe in driver simple-pm-bus for device fc00000.bus returned 0 after 47 usecs
[    0.895465] platform probe phy_gmii_sel_probe is being called
[    0.895835] probe platform_probe in driver phy-gmii-sel for device 104044.phy returned 0 after 395 usecs
[    0.896246] platform probe pcs_probe is being called
[    0.896588] probe platform_probe in driver pinctrl-single for device 4084000.pinctrl returned 0 after 369 usecs
[    0.896686] platform probe pcs_probe is being called
[    0.898159] probe platform_probe in driver pinctrl-single for device f4000.pinctrl returned 0 after 1501 usecs
[    0.900582] platform probe ti_syscon_gate_clk_probe is being called
[    0.900789] probe platform_probe in driver ti-syscon-gate-clk for device 104130.clock-controller returned 0 after 251 usecs
[    0.900816] probe platform_probe in driver ti-syscon-gate-clk for device 1082e0.clock-controller returned -517 after 10 usecs
[    0.900837] probe platform_probe in driver ti-syscon-gate-clk for device 1082e4.clock-controller returned -517 after 7 usecs
[    0.901084] platform probe udma_probe is being called
[    0.901191] probe platform_probe in driver ti-udma for device 485c0100.dma-controller returned 517 after 140 usecs
[    0.901287] platform probe udma_probe is being called
[    0.901346] probe platform_probe in driver ti-udma for device 485c0000.dma-controller returned 517 after 89 usecs
[    0.906403] probe serial_ctrl_probe in driver ctrl for device serial8250:0 returned 0 after 37 usecs
[    0.906481] probe serial_port_probe in driver port for device serial8250:0.0 returned 0 after 18 usecs
[    0.906812] probe serial_port_probe in driver port for device serial8250:0.1 returned 0 after 21 usecs
[    0.907078] probe serial_port_probe in driver port for device serial8250:0.2 returned 0 after 17 usecs
[    0.907328] probe serial_port_probe in driver port for device serial8250:0.3 returned 0 after 17 usecs
[    0.907580] probe serial_port_probe in driver port for device serial8250:0.4 returned 0 after 17 usecs
[    0.907848] probe serial_port_probe in driver port for device serial8250:0.5 returned 0 after 17 usecs
[    0.908102] probe serial_port_probe in driver port for device serial8250:0.6 returned 0 after 18 usecs
[    0.908353] probe serial_port_probe in driver port for device serial8250:0.7 returned 0 after 17 usecs
[    0.908604] probe serial_port_probe in driver port for device serial8250:0.8 returned 0 after 16 usecs
[    0.908852] probe serial_port_probe in driver port for device serial8250:0.9 returned 0 after 16 usecs
[    0.909110] probe serial_port_probe in driver port for device serial8250:0.10 returned 0 after 17 usecs
[    0.909420] probe serial_port_probe in driver port for device serial8250:0.11 returned 0 after 19 usecs
[    0.909771] platform probe serial8250_probe is being called
[    0.909791] probe platform_probe in driver serial8250 for device serial8250 returned 0 after 41 usecs
[    0.910194] probe platform_probe in driver omap8250 for device 2800000.serial returned 517 after 121 usecs
[    0.910228] probe platform_probe in driver omap8250 for device 2850000.serial returned -517 after 16 usecs
[    0.910251] probe platform_probe in driver omap8250 for device 2860000.serial returned -517 after 7 usecs
[    0.920029] calling  cfi_probe_init+0x0/0x40 @ 1
[    0.920039] initcall cfi_probe_init+0x0/0x40 returned 0 after 0 usecs
[    0.923593] probe platform_probe in driver mdio-gpio for device mdio returned -517 after 9 usecs
[    0.924519] probe platform_probe in driver am65-cpsw-nuss for device 8000000.ethernet returned -517 after 21 usecs
[    0.925689] probe platform_probe in driver dwc3-am62 for device f900000.dwc3-usb returned -517 after 16 usecs
[    0.925713] probe platform_probe in driver dwc3-am62 for device f910000.dwc3-usb returned -517 after 7 usecs
[    0.928710] platform probe ti_cpufreq_probe is being called
[    0.929101] platform probe dt_cpufreq_probe is being called
[    0.929161] probe platform_probe in driver cpufreq-dt for device cpufreq-dt returned 517 after 78 usecs
[    0.929190] probe platform_probe in driver ti-cpufreq for device ti-cpufreq returned 0 after 518 usecs
[    0.929796] platform probe psci_cpuidle_probe is being called
[    0.929822] probe platform_probe in driver psci-cpuidle for device psci-cpuidle returned 19 after 46 usecs
[    0.930416] probe platform_probe in driver sdhci-am654 for device fa10000.mmc returned -517 after 21 usecs
[    0.930429] probe platform_probe in driver sdhci-am654 for device fa00000.mmc returned -517 after 15 usecs
[    0.930449] probe platform_probe in driver sdhci-am654 for device fa20000.mmc returned -517 after 19 usecs
[    0.930922] platform probe gpio_led_probe is being called
[    0.930973] probe platform_probe in driver leds-gpio for device leds returned 517 after 97 usecs
[    0.931496] probe platform_probe in driver ti-sci for device 44043000.system-controller returned -517 after 4 usecs
[    0.932190] probe platform_probe in driver omap_timer for device 2400000.timer returned -517 after 13 usecs
[    0.932217] probe platform_probe in driver omap_timer for device 2410000.timer returned -517 after 12 usecs
[    0.932238] probe platform_probe in driver omap_timer for device 2420000.timer returned -517 after 7 usecs
[    0.932258] probe platform_probe in driver omap_timer for device 2430000.timer returned -517 after 7 usecs
[    0.932277] probe platform_probe in driver omap_timer for device 2440000.timer returned -517 after 7 usecs
[    0.932297] probe platform_probe in driver omap_timer for device 2450000.timer returned -517 after 7 usecs
[    0.932317] probe platform_probe in driver omap_timer for device 2460000.timer returned -517 after 7 usecs
[    0.932337] probe platform_probe in driver omap_timer for device 2470000.timer returned -517 after 7 usecs
[    0.933540] platform probe omap_mbox_probe is being called
[    0.933594] probe platform_probe in driver omap-mailbox for device 29000000.mailbox returned 19 after 104 usecs
[    0.933746] platform probe ti_msgmgr_probe is being called
[    0.934097] probe platform_probe in driver ti-msgmgr for device 4d000000.mailbox returned 0 after 380 usecs
[    0.934692] platform probe armv8_pmu_device_probe is being called
[    0.935257] probe platform_probe in driver armv8-pmu for device pmu returned 0 after 593 usecs
[    0.935845] platform probe optee_probe is being called
[    0.952849] probe platform_probe in driver optee for device firmware:optee returned 0 after 17036 usecs
[    0.954457] platform probe snd_soc_dummy_probe is being called
[    0.954498] probe platform_probe in driver snd-soc-dummy for device snd-soc-dummy returned 0 after 73 usecs
[    0.963947] calling  debugfs_kprobe_init+0x0/0xc0 @ 1
[    0.963974] initcall debugfs_kprobe_init+0x0/0xc0 returned 0 after 17 usecs
[    0.973856] calling  deferred_probe_initcall+0x0/0xc8 @ 1
[    0.973989] probe platform_probe in driver davinci_gpio for device 4201000.gpio returned -517 after 24 usecs
[    0.974036] probe platform_probe in driver davinci_gpio for device 600000.gpio returned -517 after 8 usecs
[    0.974071] probe platform_probe in driver davinci_gpio for device 601000.gpio returned -517 after 7 usecs
[    0.974627] platform probe reg_fixed_voltage_probe is being called
[    0.974730] probe platform_probe in driver reg-fixed-voltage for device regulator-3 returned -517 after 6 usecs
[    0.974873] probe platform_probe in driver reg-fixed-voltage for device regulator-4 returned -517 after 4 usecs
[    0.975017] probe platform_probe in driver omap_i2c for device 4900000.i2c returned -517 after 23 usecs
[    0.975031] probe platform_probe in driver gpio-regulator for device regulator-5 returned -517 after 34 usecs
[    0.975059] probe platform_probe in driver omap_i2c for device 20000000.i2c returned -517 after 7 usecs
[    0.975075] probe platform_probe in driver reg-fixed-voltage for device regulator-2 returned 0 after 535 usecs
[    0.975093] probe platform_probe in driver omap_i2c for device 20010000.i2c returned -517 after 7 usecs
[    0.975124] probe platform_probe in driver omap_i2c for device 20020000.i2c returned -517 after 7 usecs
[    0.975156] probe platform_probe in driver omap_i2c for device 20030000.i2c returned -517 after 7 usecs
[    0.975238] platform probe ti_sci_intr_irq_domain_probe is being called
[    0.975272] probe platform_probe in driver ti-sci-intr for device 4210000.interrupt-controller returned 517 after 88 usecs
[    0.975471] platform probe ti_sci_intr_irq_domain_probe is being called
[    0.975488] probe platform_probe in driver ti-sci-intr for device bus@f0000:interrupt-controller@a00000 returned 517 after 44 usecs
[    0.975649] platform probe ti_sci_inta_irq_domain_probe is being called
[    0.975667] probe platform_probe in driver ti-sci-inta for device 48000000.interrupt-controller returned 517 after 46 usecs
[    0.975826] probe platform_probe in driver ti-syscon-gate-clk for device 1082e0.clock-controller returned -517 after 8 usecs
[    0.975885] probe platform_probe in driver ti-syscon-gate-clk for device 1082e4.clock-controller returned -517 after 8 usecs
[    0.975967] platform probe udma_probe is being called
[    0.976095] probe platform_probe in driver ti-udma for device 485c0100.dma-controller returned 517 after 157 usecs
[    0.976258] platform probe udma_probe is being called
[    0.976315] probe platform_probe in driver ti-udma for device 485c0000.dma-controller returned 517 after 85 usecs
[    0.976558] probe platform_probe in driver omap8250 for device 2800000.serial returned 517 after 93 usecs
[    0.976723] probe platform_probe in driver omap8250 for device 2850000.serial returned -517 after 11 usecs
[    0.976804] probe platform_probe in driver omap8250 for device 2860000.serial returned -517 after 7 usecs
[    0.976888] probe platform_probe in driver mdio-gpio for device mdio returned -517 after 3 usecs
[    0.976986] probe platform_probe in driver am65-cpsw-nuss for device 8000000.ethernet returned -517 after 8 usecs
[    0.977082] probe platform_probe in driver dwc3-am62 for device f900000.dwc3-usb returned -517 after 8 usecs
[    0.977178] probe platform_probe in driver dwc3-am62 for device f910000.dwc3-usb returned -517 after 7 usecs
[    0.977308] platform probe dt_cpufreq_probe is being called
[    0.977371] probe platform_probe in driver cpufreq-dt for device cpufreq-dt returned 517 after 93 usecs
[    0.977707] probe platform_probe in driver sdhci-am654 for device fa10000.mmc returned -517 after 13 usecs
[    0.977869] probe platform_probe in driver sdhci-am654 for device fa00000.mmc returned -517 after 16 usecs
[    0.978019] probe platform_probe in driver sdhci-am654 for device fa20000.mmc returned -517 after 9 usecs
[    0.978054] platform probe gpio_led_probe is being called
[    0.978102] probe platform_probe in driver leds-gpio for device leds returned 517 after 95 usecs
[    0.978264] platform probe ti_sci_probe is being called
[    0.981447] platform probe ti_sci_pm_domain_probe is being called
[    0.982522] probe platform_probe in driver ti_sci_pm_domains for device 44043000.system-controller:power-controller returned 0 after 1140 usecs
[    0.984639] platform probe ti_sci_clk_probe is being called
[    1.036888] probe platform_probe in driver ti-sci-clk for device 44043000.system-controller:clock-controller returned 0 after 52299 usecs
[    1.037414] platform probe ti_sci_reset_probe is being called
[    1.037459] probe platform_probe in driver ti-sci-reset for device 44043000.system-controller:reset-controller returned 0 after 101 usecs
[    1.037504] probe platform_probe in driver ti-sci for device 44043000.system-controller returned 0 after 59269 usecs
[    1.037910] platform probe omap_dm_timer_probe is being called
[    1.038244] probe platform_probe in driver omap_timer for device 2400000.timer returned 0 after 474 usecs
[    1.038564] platform probe omap_dm_timer_probe is being called
[    1.038788] probe platform_probe in driver omap_timer for device 2410000.timer returned 0 after 327 usecs
[    1.039185] platform probe omap_dm_timer_probe is being called
[    1.039449] probe platform_probe in driver omap_timer for device 2420000.timer returned 0 after 393 usecs
[    1.039752] platform probe omap_dm_timer_probe is being called
[    1.039948] probe platform_probe in driver omap_timer for device 2430000.timer returned 0 after 300 usecs
[    1.040211] platform probe omap_dm_timer_probe is being called
[    1.040394] probe platform_probe in driver omap_timer for device 2440000.timer returned 0 after 281 usecs
[    1.040647] platform probe omap_dm_timer_probe is being called
[    1.040840] probe platform_probe in driver omap_timer for device 2450000.timer returned 0 after 287 usecs
[    1.041096] platform probe omap_dm_timer_probe is being called
[    1.041348] probe platform_probe in driver omap_timer for device 2460000.timer returned 0 after 351 usecs
[    1.041623] platform probe omap_dm_timer_probe is being called
[    1.041812] probe platform_probe in driver omap_timer for device 2470000.timer returned 0 after 290 usecs
[    1.041845] probe platform_probe in driver davinci_gpio for device 4201000.gpio returned -517 after 7 usecs
[    1.041879] probe platform_probe in driver davinci_gpio for device 600000.gpio returned -517 after 3 usecs
[    1.041908] probe platform_probe in driver davinci_gpio for device 601000.gpio returned -517 after 3 usecs
[    1.042164] probe platform_probe in driver reg-fixed-voltage for device regulator-3 returned -517 after 5 usecs
[    1.042293] probe platform_probe in driver reg-fixed-voltage for device regulator-4 returned -517 after 3 usecs
[    1.042448] platform probe omap_i2c_probe is being called
[    1.043411] probe platform_probe in driver omap_i2c for device 4900000.i2c returned 0 after 1124 usecs
[    1.043762] probe platform_probe in driver gpio-regulator for device regulator-5 returned -517 after 16 usecs
[    1.043940] platform probe omap_i2c_probe is being called
[    1.070994] platform probe tps65219_regulator_probe is being called
[    1.070994] platform probe tps65219_gpio_probe is being called
[    1.071562] probe platform_probe in driver tps65219-gpio for device tps65219-gpio.1.auto returned 0 after 608 usecs
[    1.072220] probe i2c_device_probe in driver tps65219 for device 0-0030 returned 0 after 26681 usecs
[    1.072780] probe platform_probe in driver omap_i2c for device 20000000.i2c returned 0 after 29033 usecs
[    1.073092] platform probe omap_i2c_probe is being called
[    1.074288] probe platform_probe in driver omap_i2c for device 20010000.i2c returned 0 after 1442 usecs
[    1.074580] platform probe omap_i2c_probe is being called
[    1.076436] probe platform_probe in driver omap_i2c for device 20020000.i2c returned 0 after 2067 usecs
[    1.076737] platform probe omap_i2c_probe is being called
[    1.077779] probe platform_probe in driver omap_i2c for device 20030000.i2c returned 0 after 1273 usecs
[    1.077917] platform probe ti_sci_intr_irq_domain_probe is being called
[    1.078074] probe platform_probe in driver ti-sci-intr for device 4210000.interrupt-controller returned 0 after 214 usecs
[    1.078171] platform probe ti_sci_intr_irq_domain_probe is being called
[    1.078272] probe platform_probe in driver ti-sci-intr for device bus@f0000:interrupt-controller@a00000 returned 0 after 132 usecs
[    1.078332] platform probe ti_sci_inta_irq_domain_probe is being called
[    1.078346] probe platform_probe in driver tps65219-pmic for device tps65219-regulator.0.auto returned 0 after 7393 usecs
[    1.078574] probe platform_probe in driver ti-sci-inta for device 48000000.interrupt-controller returned 0 after 273 usecs
[    1.078749] platform probe ti_syscon_gate_clk_probe is being called
[    1.079164] probe platform_probe in driver ti-syscon-gate-clk for device 1082e0.clock-controller returned 0 after 473 usecs
[    1.079287] platform probe ti_syscon_gate_clk_probe is being called
[    1.079527] probe platform_probe in driver ti-syscon-gate-clk for device 1082e4.clock-controller returned 0 after 279 usecs
[    1.079628] platform probe udma_probe is being called
[    1.084883] probe platform_probe in driver ti-udma for device 485c0100.dma-controller returned 0 after 5285 usecs
[    1.085124] platform probe udma_probe is being called
[    1.092231] probe platform_probe in driver ti-udma for device 485c0000.dma-controller returned 0 after 7165 usecs
[    1.092699] platform probe omap8250_probe is being called
[    1.093664] probe serial_ctrl_probe in driver ctrl for device 2800000.serial:0 returned 0 after 38 usecs
[    1.093742] probe serial_port_probe in driver port for device 2800000.serial:0.0 returned 0 after 18 usecs
[    1.158223] probe platform_probe in driver omap8250 for device 2800000.serial returned 0 after 65754 usecs
[    1.158640] platform probe omap8250_probe is being called
[    1.159478] probe serial_ctrl_probe in driver ctrl for device 2850000.serial:0 returned 0 after 32 usecs
[    1.159564] probe serial_port_probe in driver port for device 2850000.serial:0.0 returned 0 after 18 usecs
[    1.159933] probe platform_probe in driver omap8250 for device 2850000.serial returned 0 after 1485 usecs
[    1.160268] platform probe omap8250_probe is being called
[    1.161013] probe serial_ctrl_probe in driver ctrl for device 2860000.serial:0 returned 0 after 30 usecs
[    1.161091] probe serial_port_probe in driver port for device 2860000.serial:0.0 returned 0 after 18 usecs
[    1.161705] probe platform_probe in driver omap8250 for device 2860000.serial returned 0 after 1598 usecs
[    1.162091] probe platform_probe in driver mdio-gpio for device mdio returned -517 after 8 usecs
[    1.162693] platform probe am65_cpsw_nuss_probe is being called
[    1.180111] probe platform_probe in driver am65-cpsw-nuss for device 8000000.ethernet returned 0 after 17869 usecs
[    1.180627] platform probe dwc3_ti_probe is being called
[    1.195673] platform probe dwc3_probe is being called
[    1.198203] probe platform_probe in driver dwc3 for device 31000000.usb returned 0 after 2569 usecs
[    1.198449] probe platform_probe in driver dwc3-am62 for device f900000.dwc3-usb returned 0 after 18007 usecs
[    1.198933] platform probe dwc3_ti_probe is being called
[    1.214312] platform probe dwc3_probe is being called
[    1.214861] platform probe xhci_generic_plat_probe is being called
[    1.216389] probe usb_probe_interface in driver hub for device 1-0:1.0 returned 0 after 237 usecs
[    1.216467] probe usb_probe_device in driver usb for device usb1 returned 0 after 436 usecs
[    1.216536] probe platform_probe in driver xhci-hcd for device xhci-hcd.3.auto returned 0 after 1703 usecs
[    1.216576] probe platform_probe in driver dwc3 for device 31100000.usb returned 0 after 2380 usecs
[    1.216939] probe platform_probe in driver dwc3-am62 for device f910000.dwc3-usb returned 0 after 18209 usecs
[    1.217124] platform probe dt_cpufreq_probe is being called
[    1.218721] probe platform_probe in driver cpufreq-dt for device cpufreq-dt returned 0 after 1641 usecs
[    1.219429] probe platform_probe in driver sdhci-am654 for device fa00000.mmc returned -517 after 8 usecs
[    1.219587] probe platform_probe in driver sdhci-am654 for device fa20000.mmc returned -517 after 4 usecs
[    1.219603] platform probe gpio_led_probe is being called
[    1.219603] platform probe sdhci_am654_probe is being called
[    1.219678] probe platform_probe in driver leds-gpio for device leds returned 517 after 110 usecs
[    1.219861] platform probe davinci_gpio_probe is being called
[    1.221502] probe platform_probe in driver davinci_gpio for device 4201000.gpio returned 0 after 1760 usecs
[    1.221830] platform probe davinci_gpio_probe is being called
[    1.225318] probe platform_probe in driver davinci_gpio for device 600000.gpio returned 0 after 3732 usecs
[    1.225607] platform probe davinci_gpio_probe is being called
[    1.227855] probe platform_probe in driver davinci_gpio for device 601000.gpio returned 0 after 2461 usecs
[    1.228478] platform probe reg_fixed_voltage_probe is being called
[    1.228613] platform probe reg_fixed_voltage_probe is being called
[    1.228800] platform probe mdio_gpio_probe is being called
[    1.229055] probe platform_probe in driver reg-fixed-voltage for device regulator-4 returned 0 after 531 usecs
[    1.229055] probe platform_probe in driver reg-fixed-voltage for device regulator-3 returned 0 after 722 usecs
[    1.229110] platform probe gpio_regulator_probe is being called
[    1.229760] probe platform_probe in driver gpio-regulator for device regulator-5 returned 0 after 741 usecs
[    1.231825] probe phy_probe in driver RTL8211F-VD Gigabit Ethernet for device gpio-0:00 returned 0 after 1503 usecs
[    1.257881] probe platform_probe in driver sdhci-am654 for device fa10000.mmc returned 0 after 38624 usecs
[    1.310422] probe mmc_bus_probe in driver mmcblk for device mmc0:0001 returned 0 after 4604 usecs
[    1.366103] probe phy_probe in driver TI DP83TD510E for device gpio-0:01 returned 0 after 68640 usecs
[    1.366296] probe platform_probe in driver mdio-gpio for device mdio returned 0 after 137582 usecs
[    1.367130] platform probe gpio_led_probe is being called
[    1.367262] platform probe sdhci_am654_probe is being called
[    1.367305] platform probe sdhci_am654_probe is being called
[    1.367917] probe platform_probe in driver leds-gpio for device leds returned 0 after 862 usecs
[    1.367997] initcall deferred_probe_initcall+0x0/0xc8 returned 0 after 394128 usecs
[    1.370955] platform probe gpio_keys_probe is being called
[    1.371580] probe platform_probe in driver gpio-keys for device gpio-keys returned 0 after 777 usecs
[    1.406004] probe platform_probe in driver sdhci-am654 for device fa20000.mmc returned 0 after 38948 usecs
[    1.408093] probe platform_probe in driver sdhci-am654 for device fa00000.mmc returned 0 after 41196 usecs
[    1.424172] probe mmc_bus_probe in driver mmcblk for device mmc2:0001 returned 19 after 71 usecs
[    1.624661] probe mmc_bus_probe in driver mmcblk for device mmc1:0001 returned 0 after 3742 usecs

----
Below is the patch that adds the extra instrumentation.  My quick question is, does this look useful?
Can it be used to find the more problematic (ie long-duration) probe functions or drivers initializations?

Full boot data from a run with this instrumentation is available at:
https://birdcloud.org/boot-time-files/boot-data-timslab-bp1-241208-191139.txt

Please provide any feedback you have.
 -- Tim

----
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 0c3725c3eefa..5e19079b0a2b 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -28,6 +28,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/pinctrl/devinfo.h>
 #include <linux/slab.h>
+#include <linux/kallsyms.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -732,8 +733,16 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 static int really_probe_debug(struct device *dev, struct device_driver *drv)
 {
 	ktime_t calltime, rettime;
+	char fn_name[KSYM_SYMBOL_LEN];
+	unsigned long addr = 0;
 	int ret;
 
+	if (dev->bus->probe)
+		addr = (unsigned long)dev->bus->probe;
+	else if (drv->probe)
+		addr = (unsigned long)drv->probe;
+	sprint_symbol_no_offset(fn_name, addr);
+
 	calltime = ktime_get();
 	ret = really_probe(dev, drv);
 	rettime = ktime_get();
@@ -742,8 +751,8 @@ static int really_probe_debug(struct device *dev, struct device_driver *drv)
 	 * CONFIG_DYNAMIC_DEBUG and we want a simple 'initcall_debug' on the
 	 * kernel commandline to print this all the time at the debug level.
 	 */
-	printk(KERN_DEBUG "probe of %s returned %d after %lld usecs\n",
-		 dev_name(dev), ret, ktime_us_delta(rettime, calltime));
+	printk(KERN_DEBUG "probe %s in driver %s for device %s returned %d after %lld usecs\n",
+		 fn_name, drv->name, dev_name(dev), ret, ktime_us_delta(rettime, calltime));
 	return ret;
 }
 
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 76bfcba25003..ae15969c483a 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -32,6 +32,7 @@
 #include <linux/types.h>
 #include <linux/iommu.h>
 #include <linux/dma-map-ops.h>
+#include <linux/kallsyms.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -1381,6 +1382,9 @@ static int platform_probe(struct device *_dev)
 	struct platform_driver *drv = to_platform_driver(_dev->driver);
 	struct platform_device *dev = to_platform_device(_dev);
 	int ret;
+	char fn_name[KSYM_SYMBOL_LEN];
+	unsigned long addr = 0;
+
 
 	/*
 	 * A driver registered using platform_driver_probe() cannot be bound
@@ -1401,6 +1405,13 @@ static int platform_probe(struct device *_dev)
 		goto out;
 
 	if (drv->probe) {
+		if (initcall_debug) {
+			addr = (unsigned long)drv->probe;
+			sprint_symbol_no_offset(fn_name, addr);
+
+			printk(KERN_DEBUG "platform probe %s is being called\n", fn_name);
+		}
+
 		ret = drv->probe(dev);
 		if (ret)
 			dev_pm_domain_detach(_dev, true);




^ permalink raw reply related

* Dec 17 - Next Boot-Time SIG meeting
From: Bird, Tim @ 2024-12-17  4:48 UTC (permalink / raw)
  To: linux-embedded@vger.kernel.org; +Cc: khasim@ti.com

Hey Linux Boot-Time SIG interested parties,

Here is the information for the next Linux Boot-Time SIG conference call.

The meeting will be held via the Jitsi online meeting platform.
To Join the meeting via web, click on:
https://meet.jit.si/LinuxBootTimeSIG

----
Our next meeting is Tuesday, December 17, at 9:00 am Mountain Time.
See this link for other time zones:
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2024&month=12&day=17&hour=16&min=0&sec=0&p1=220&p2=137&p3=195&p4=771

(That makes it 8:00 am Pacific, 16:00 UTC, 17:00 CET, and 21:30 IST)

I'm planning on 1 hour for this meeting.

The agenda for the meeting (and where we'll keep the minutes) is here:
https://docs.google.com/document/d/1XAufoTT6VVJOTMzKMoz8SyOss-JA9H4J1_yVXQq5mN0/edit?usp=sharing

The agenda for the Dec. 17 meeting is:
--------
Topics:
  - Round-table of completed or ongoing work
    - Francesco Valla's analyze-initcall-debug.py script
    - Work on the Boot Time wiki (section of elinux wiki) https://elinux.org/Boot_Time
    - Status of the Boot Time Data wiki (on birdcloud.org) https://birdcloud.org/boot-time/Boot_Data
       - Tim Bird's grab-boot-data.sh
       - Boot Region processor
    - What are people working on?
       - In particular, are there any patches in flight?
       - Status of RedHat boot time work?
       - Status of Texas Instruments boot time work?
       - Anyone else?
  - Review of project ideas
    - See https://elinux.org/Boot_Time_Project_Ideas
  - Review of action items
    - Write a white paper or presentation with a mission statement
    - Mission Proposal:

The Linux Boot Time Special Interest Group exists to make it easier
for embedded Linux developers to meet requirements of quickly
booting and resuming Linux systems.  To support this, the SIG
develops documentation, instrumentation, tests, tools and technology
(changes to the Linux kernel), that enable developers to minimize Linux
boot time.  The goal is not just to optimize Linux for a single particular
use case, but to make information and technology available that
will enable the ongoing ability to configure and tune the boot time for
Linux-based systems, to meet the needs of a variety of use cases where
quick booting is a required or desirable feature.

  - Engaging more developers
    - Not many people have run my data gathering tool - how can I get data for more systems?
  - Next meeting - Jan 28th?
    - proposed periodicity- 1 month (tentatively the 4th Tuesday of every month?)
--------

The meeting is open to anyone.

Please prepare for the meeting by reading the mission statement (above), and being
reading to comment on it.  Also, if you have done any work on boot time in the last
year, or are actively working on it, please be prepared to comment on it during the meeting.

I look forward to talking with you.
 -- Tim

P.S. As a reminder, the wiki for the SIG is at: https://elinux.org/Boot_Time


^ permalink raw reply

* Re: Dec 17 - Next Boot-Time SIG meeting
From: Francesco Valla @ 2024-12-17 12:47 UTC (permalink / raw)
  To: linux-embedded@vger.kernel.org, Bird, Tim; +Cc: khasim@ti.com
In-Reply-To: <MW5PR13MB56326570980A91F6A77472EDFD042@MW5PR13MB5632.namprd13.prod.outlook.com>

Hi,

On Tuesday, 17 December 2024 at 05:48:31 Bird, Tim <Tim.Bird@sony.com> wrote:
> Hey Linux Boot-Time SIG interested parties,
> 
> Here is the information for the next Linux Boot-Time SIG conference call.

while very honored to be on the agenda, I unfortunately won't be able to
participate to the meeting today due to end-of-year activities at my workplace
that cannot be  postponed.

I hope I'll have more time the next year to contribute to the SIG, for which I
confirm my interest. I already booked the slot for Jan 28th

I wish you all a joyful Christmas.

Regards,

Francesco




^ permalink raw reply

* [boot-time] [SCRIPT v2] analyze-initcall-debug.py
From: Francesco Valla @ 2024-12-17 12:57 UTC (permalink / raw)
  To: Linux Embedded

Hello,

this is the second version of the analyze-initcall-debug.py script, which can
be used to analyze the kernel output when booting with initcall_debug
to extract some human-readable data from it.

The first version ca be found at:
https://lore.kernel.org/linux-embedded/1964175.7Z3S40VBb9@fedora.fritz.box/

Regards,

Francesco

---

Differences v1 -> v2:

- added a list of failed driver probes in the plain text output
- added a preliminary comparison on read lines to avoid false negatives in
  regex-based matches
- added the --body-only option to output only the raw HTML body
- use of dictionaries instead of filter() constructs

------ the analyze-initcall-debug.py script follows ------

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
This script can be used to analyze a Linux kernel dmesg output when the
initcall_debug command line output is used. It extract various information,
such as init durations, impact of probe deferral and so on.
"""

import argparse
import re
import sys

ERRCODE_PROBE_DEFER = 517

parser = argparse.ArgumentParser(description='Analyze a Linux kernel dmesg with the initcall_debug option enabled')

parser.add_argument('--dmesg', nargs='?', type=argparse.FileType('r'),
                    default=sys.stdin, help='The dmesg file to analyze (default: stdin)')
format_group = parser.add_mutually_exclusive_group()
format_group.add_argument('--html', action='store_true',
                          help='Output analysis result as HTML table')
format_group.add_argument('--markdown', action='store_true',
                          help='Output analysis result as Markdown table')
parser.add_argument('--body-only', action='store_true',
                    help='Do not add header and footer to HTML output')
args = parser.parse_args()

class Run:
    def __init__(self, start_time:int, end_time:int = -1, duration:int = 0, retval:int = 0):
        self._start_time = start_time
        self._end_time = end_time
        self._duration = duration
        self._retval = retval
        self._ended = (end_time >= 0)

    @property
    def start_time(self) -> int:
        return self._start_time

    @property
    def end_time(self) -> int:
        return self._end_time

    @end_time.setter
    def end_time(self, time:int):
        self._end_time = time
        self._ended = True

    @property
    def duration(self) -> int:
        return self._duration

    @duration.setter
    def duration(self, time:int):
        self._duration = time

    @property
    def retval(self) -> int:
        return self._retval

    @retval.setter
    def retval(self, val:int):
        self._retval = val

    @property
    def running(self) -> bool:
        return (not self._ended or abs(self.retval) == ERRCODE_PROBE_DEFER)

    @property
    def failed(self) -> bool:
        return ((self.retval != 0) and not self.running)


class Entity:
    def __init__(self, name:str, start_time:int = 0, end_time:int = 0, duration:int = 0, retval:int = 0):
        self._name = name
        self._runs = [ Run(start_time, end_time, duration, retval) ]

    @property
    def name(self) -> str:
        return self._name

    @property
    def first_start_time(self) -> int:
        return self._runs[0].start_time

    @property
    def last_start_time(self) -> int:
        return self._runs[-1].start_time

    @property
    def first_end_time(self) -> int:
        return self._runs[0].end_time

    @property
    def last_end_time(self) -> int:
        return self._runs[-1].end_time

    @property
    def duration(self) -> int:
        return sum([ r.duration for r in self._runs ])

    @property
    def wasted_time(self) -> int:
        return sum([ r.duration for r in filter(lambda x: x.failed or abs(x.retval) == ERRCODE_PROBE_DEFER, self._runs) ])

    @property
    def retval(self) -> int:
        return self._runs[-1].retval

    @property
    def failed(self) -> bool:
        return self._runs[-1].failed

    @property
    def running(self) -> bool:
        return self._runs[-1].running

    @property
    def deferred_probe_pending(self) -> bool:
        return (abs(self._runs[-1].retval) == ERRCODE_PROBE_DEFER)

    @property
    def num_deferred_probes(self) -> int:
        return len(list(filter(lambda r: abs(r.retval) == ERRCODE_PROBE_DEFER, self._runs)))

    def addStart(self, start_time: int):
        self._runs.append( Run(start_time) )

    def addEnd(self, end_time:int = 0, duration:int = 0, retval:int = 0):
        self._runs[-1].end_time = end_time
        self._runs[-1].duration = duration
        self._runs[-1].retval = retval

    def addRun(self, start_time: int, end_time:int = 0, duration:int = 0, retval:int = 0):
        self._runs.append( Run(start_time, end_time, duration, retval) )


class Driver (Entity):
    def __init__(self, name: str, start_time: int):
        super().__init__(name, start_time)

    def addInit(self, init_time: int):
        self.addStart(init_time)

    def addReturn(self, return_time: int, duration: int, retval: int):
        self.addEnd(return_time, duration, retval)


class Probe (Entity):
    def __init__(self, name: str, return_time: int, duration: int, retval: int):
        super().__init__(name, return_time, return_time, duration, retval)


class Init (Entity):
    def __init__(self, name: str, start_time: int):
        super().__init__(name, start_time)


# Regex for 'calling' messages
# "[    0.466115] calling  pci_sysfs_init+0x0/0xa8 @ 1"
calling_sentinel = 'calling  '
calling_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] calling  ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+)( \[[a-zA-Z0-9\-_]+\])? @ ([0-9]+)')

# Regex for 'initcall ... returned' messages
# "[    0.466115] initcall pci_sysfs_init+0x0/0xa8 returned 0 after 5 usecs"
returned_sentinel = 'initcall '
returned_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] initcall ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+)( \[[a-zA-Z0-9\-_]+\])? returned ([\-0-9]+) after ([0-9]+) usecs')

# Regex for 'probe ... returned' messages
# "[    0.466115] probe of cpufreq-dt returned 517 after 140 usec"
probe_sentinel = 'probe of '
probe_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] probe of ([0-9a-zA-Z_\-\.\:@]+) returned ([\-0-9]+) after ([0-9]+) usecs')

# Regex for
# "[    1.060329] Run /sbin/init as init process"
init_sentinel = 'as init process'
init_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Run ([/0-9a-zA-Z_]+) as init process')

drivers = {}
probes = {}
init = None

# Extract data from dmesg
lineno = 0
for line in args.dmesg:
    lineno += 1
    if calling_sentinel in line:
        match = calling_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
            except Exception as e:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as call', file=sys.stderr)
                raise e
            else:
                if name not in drivers.keys():
                    drivers[name] = Driver(name, time)
                else:
                    drivers[name].addInit(time)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as call', file=sys.stderr)

    if returned_sentinel in line:
        match = returned_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
                retval = int(match.group(5))
                duration = int(match.group(6))
            except:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as call return', file=sys.stderr)
            else:
                if name not in drivers.keys():
                    print(f'Detected return for driver {name}, for which an init was never recorded', file=sys.stderr)
                else:
                    drivers[name].addReturn(time, duration, retval)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as call return', file=sys.stderr)

    if probe_sentinel in line:
        match = probe_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
                retval = int(match.group(3))
                duration = int(match.group(4))
            except:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as probe return', file=sys.stderr)
            else:
                if name not in probes.keys():
                    probes[name] = Probe(name, time, duration, retval)
                else:
                    probes[name].addRun(time, time, duration, retval)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as probe return', file=sys.stderr)

    if not init and init_sentinel in line:
        match = init_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
            except:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as init', file=sys.stderr)
            else:
                init = Init(name, time)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as init', file=sys.stderr)

# Print Markdown format
if args.markdown:
    print('| Name | Type [^1] | Status [^2] | Duration (us) | Before userspace | Num deferred probes | Time spent in failed inits (us) |')
    print('| -----| --------- | ----------- | ------------- | ---------------- | ------------------- | ------------------------------- |')
    for d in sorted(filter(lambda k: k.duration > 0, list(drivers.values()) + list(probes.values())), key=lambda k: k.duration, reverse=True):
        etype = 'I' if d in drivers else 'P'
        run_status = 'D' if d.deferred_probe_pending else 'R' if d.running else f'F(-{abs(d.retval)})' if d.failed else 'K'
        before_init = 'Y' if d.last_start_time < init.last_start_time else 'N'
        print(f'| {d.name} | {etype} | {run_status} | {d.duration} | {before_init} | {d.num_deferred_probes} | {d.wasted_time} |')
    print('[^1]: I=Init function, P=device Probe')
    print('[^2]: D=Deferred probe pending, R=still Running, F(N)=Failed (with errcode N), K=init oK')

# Print HTML format
elif args.html:
    if not args.body_only:
        print(
'''
<!DOCTYPE HTML>
<html>
<head>
  <meta charset="utf-8" />
  <title>Initcall analysis</title>
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <style>
    table, th, td {
      border: 1px solid;
    }
  </style>
</head>
<body>''')

    print(
'''
  <table>
    <tr>
      <th>Name</th>
      <th>Type <a href="#note1">[1]</a></th>
      <th>Status <a href="#note2">[2]</a></th>
      <th>Duration (us)</th>
      <th>Before userspace</th>
      <th>Num deferred probes</th>
      <th>Time spent in failed inits (us)</th>
    </tr>''')

    for d in sorted(filter(lambda k: k.duration > 0, list(drivers.values()) + list(probes.values())), key=lambda k: k.duration, reverse=True):
        etype = 'I' if d in drivers else 'P'
        run_status = 'D' if d.deferred_probe_pending else 'R' if d.running else f'F({abs(d.retval)})' if d.failed else 'K'
        before_init = 'Y' if d.last_start_time < init.last_start_time else 'N'
        print(
f'''
    <tr>
      <td>{d.name}</td>
      <td>{etype}</td>
      <td>{run_status}</td>
      <td>{d.duration}</td>
      <td>{before_init}</td>
      <td>{d.num_deferred_probes}</td>
      <td>{d.wasted_time}</td>
    </tr>''')

    print(
'''
  </table>
  <p id="note1">[1] I=Init function, P=device Probe</p>
  <p id="note2">[2] D=Deferred probe pending, R=still Running, F(N)=Failed (with errcode N), K=init oK</p>
''')

    if not args.body_only:
        print(
'''
</body>
</html>
''')

# Print plain text
else:
    num_before_userspace = len(list(filter(lambda d: d.last_start_time <= init.last_start_time, drivers.values())))
    num_after_userspace = len(list(filter(lambda d: d.last_start_time > init.last_start_time, drivers.values())))
    num_deferred_probe_pending = len(list(filter(lambda d: d.deferred_probe_pending, list(drivers.values()) + list(probes.values()))))
    num_failed = len(list(filter(lambda d: d.failed, list(drivers.values()) + list(probes.values()))))

    print('Summary:')
    print(f'  {len(drivers)} drivers have been initialized, of which {num_before_userspace} before userspace and {num_after_userspace} after')
    print(f'  {len(probes)} probes happened outside of the initcall of their driver')
    print(f'  {num_deferred_probe_pending} deferred probes are pending')
    print(f'  {num_failed} initcalls/probes failed')

    print('\n---\n')

    print('Top 10 initcall/probe durations:')
    for d in sorted(filter(lambda k: k.duration > 0, list(drivers.values()) + list(probes.values())), key=lambda k: k.duration, reverse=True)[0:10]:
        print(f' * {d.name} -> {d.duration}us')

    print('\n---\n')

    print('Failed initcalls/probes:')
    for d in filter(lambda k: k.failed, list(drivers.values()) + list(probes.values())):
        print(f' * {d.name} -> ret = -{abs(d.retval)}')



^ permalink raw reply

* Re: [RFC PATCH] boot-time: instrument probes more
From: Francesco Valla @ 2024-12-21 14:47 UTC (permalink / raw)
  To: Linux Embedded, Bird, Tim
In-Reply-To: <MW5PR13MB563277AF5972FD2B56026CF9FD3C2@MW5PR13MB5632.namprd13.prod.outlook.com>

Hello,

On Monday, 9 December 2024 at 04:10:04 Bird, Tim <Tim.Bird@sony.com> wrote:
> ----
> Below is the patch that adds the extra instrumentation.  My quick question is, does this look useful?
> Can it be used to find the more problematic (ie long-duration) probe functions or drivers initializations?
> 

Yes, it looks useful to me. While an experienced developer _might_ be able to match
device names and drivers at a quick glance, that's not the case for everyone, in
particular on certain SoCs which are modelled at devicetree level with an exceptionally
high number of "generic" device nodes.

> ----
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 0c3725c3eefa..5e19079b0a2b 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -28,6 +28,7 @@
>  #include <linux/pm_runtime.h>
>  #include <linux/pinctrl/devinfo.h>
>  #include <linux/slab.h>
> +#include <linux/kallsyms.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -732,8 +733,16 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>  static int really_probe_debug(struct device *dev, struct device_driver *drv)
>  {
>  	ktime_t calltime, rettime;
> +	char fn_name[KSYM_SYMBOL_LEN];
> +	unsigned long addr = 0;
>  	int ret;
>  
> +	if (dev->bus->probe)
> +		addr = (unsigned long)dev->bus->probe;
> +	else if (drv->probe)
> +		addr = (unsigned long)drv->probe;
> +	sprint_symbol_no_offset(fn_name, addr);
> +
>  	calltime = ktime_get();
>  	ret = really_probe(dev, drv);
>  	rettime = ktime_get();
> @@ -742,8 +751,8 @@ static int really_probe_debug(struct device *dev, struct device_driver *drv)
>  	 * CONFIG_DYNAMIC_DEBUG and we want a simple 'initcall_debug' on the
>  	 * kernel commandline to print this all the time at the debug level.
>  	 */
> -	printk(KERN_DEBUG "probe of %s returned %d after %lld usecs\n",
> -		 dev_name(dev), ret, ktime_us_delta(rettime, calltime));
> +	printk(KERN_DEBUG "probe %s in driver %s for device %s returned %d after %lld usecs\n",
> +		 fn_name, drv->name, dev_name(dev), ret, ktime_us_delta(rettime, calltime));
>  	return ret;
>  }
>  
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 76bfcba25003..ae15969c483a 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -32,6 +32,7 @@
>  #include <linux/types.h>
>  #include <linux/iommu.h>
>  #include <linux/dma-map-ops.h>
> +#include <linux/kallsyms.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -1381,6 +1382,9 @@ static int platform_probe(struct device *_dev)
>  	struct platform_driver *drv = to_platform_driver(_dev->driver);
>  	struct platform_device *dev = to_platform_device(_dev);
>  	int ret;
> +	char fn_name[KSYM_SYMBOL_LEN];
> +	unsigned long addr = 0;
> +
>  
>  	/*
>  	 * A driver registered using platform_driver_probe() cannot be bound
> @@ -1401,6 +1405,13 @@ static int platform_probe(struct device *_dev)
>  		goto out;
>  
>  	if (drv->probe) {
> +		if (initcall_debug) {
> +			addr = (unsigned long)drv->probe;
> +			sprint_symbol_no_offset(fn_name, addr);
> +
> +			printk(KERN_DEBUG "platform probe %s is being called\n", fn_name);
> +		}
> +
>  		ret = drv->probe(dev);
>  		if (ret)
>  			dev_pm_domain_detach(_dev, true);

Does this really belong here? Why not add a print inside really_probe_debug(), just
before the call to really_probe()? That approach would catch all probe types (not just
platform_probe) while containing the initcall_debug instrumentation inside dd.c

Regards,
Francesco




^ permalink raw reply

* [boot-time] [SCRIPT v3] analyze-initcall-debug.py
From: Francesco Valla @ 2024-12-26 15:25 UTC (permalink / raw)
  To: Linux Embedded

Hello,

this is the third version of the analyze-initcall-debug.py script, which can
be used to analyze the kernel output when booting with initcall_debug
to extract some human-readable data from it.

This version brings several enhancements and new features, the most notable
ones being the split between initcalls and probes (since they partially overlap)
and the SVG pie charts for both of these categories. A complete changelog can
be found below.

Wishing the best for the new year,

Francesco

---

v2: https://lore.kernel.org/linux-embedded/1955396.7Z3S40VBb9@fedora/
v1: https://lore.kernel.org/linux-embedded/1964175.7Z3S40VBb9@fedora.fritz.box/

Differences v2 -> v3:

- initcalls and probes have been split in different tables
- added identification section (kernel version, machine, cmdline)
- added SVG pie charts for initcalls and probes times to the HTML output
- CSS styling added to HTML output
- added the --before-init option to restrict analysis to the events that
  happen before the init is started
- Markdown output has been dropped

Differences v1 -> v2:

- added a list of failed driver probes in the plain text output
- added a preliminary comparison on read lines to avoid false negatives in
  regex-based matches
- added the --body-only option to output only the raw HTML body
- use of dictionaries instead of filter() constructs

------ the analyze-initcall-debug.py script follows ------

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
This script can be used to analyze a Linux kernel dmesg output when the
initcall_debug command line output is used. It extracts various information,
such as initcall durations, impact of probe deferrals and so on.
"""

import argparse
import random
import re
import sys

ERRCODE_PROBE_DEFER = 517

parser = argparse.ArgumentParser(description='Analyze a Linux kernel dmesg with the initcall_debug option enabled')

parser.add_argument('--dmesg', nargs='?', type=argparse.FileType('r'),
                    default=sys.stdin, help='The dmesg file to analyze (default: stdin)')
format_group = parser.add_mutually_exclusive_group()
format_group.add_argument('--html', action='store_true',
                          help='Output analysis result as HTML table')
parser.add_argument('--body-only', action='store_true',
                    help='Do not add header and footer to HTML output')
parser.add_argument('--before-init', action='store_true',
                    help='Add to analysis only initcalls/probes happening before init')
args = parser.parse_args()

class Run:
    def __init__(self, start_time:int, end_time:int = -1, duration:int = 0, retval:int = 0):
        self._start_time = start_time
        self._end_time = end_time
        self._duration = duration
        self._retval = retval
        self._ended = (end_time >= 0)

    @property
    def start_time(self) -> int:
        return self._start_time

    @property
    def end_time(self) -> int:
        return self._end_time

    @end_time.setter
    def end_time(self, time:int):
        self._end_time = time
        self._ended = True

    @property
    def duration(self) -> int:
        return self._duration

    @duration.setter
    def duration(self, time:int):
        self._duration = time

    @property
    def retval(self) -> int:
        return self._retval

    @retval.setter
    def retval(self, val:int):
        self._retval = val

    @property
    def running(self) -> bool:
        return (not self._ended or abs(self.retval) == ERRCODE_PROBE_DEFER)

    @property
    def failed(self) -> bool:
        return ((self.retval != 0) and not self.running)


class Entity:
    def __init__(self, name:str, start_time:int = 0, end_time:int = 0, duration:int = 0, retval:int = 0):
        self._name = name
        self._color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
        self._runs = [ Run(start_time, end_time, duration, retval) ]

    @property
    def name(self) -> str:
        return self._name

    @property
    def color(self) -> tuple[int, int, int]:
        return self._color

    @property
    def first_start_time(self) -> int:
        return self._runs[0].start_time

    @property
    def last_start_time(self) -> int:
        return self._runs[-1].start_time

    @property
    def first_end_time(self) -> int:
        return self._runs[0].end_time

    @property
    def last_end_time(self) -> int:
        return self._runs[-1].end_time

    @property
    def duration(self) -> int:
        return sum([ r.duration for r in self._runs ])

    @property
    def wasted_time(self) -> int:
        return sum([ r.duration for r in filter(lambda x: x.failed or abs(x.retval) == ERRCODE_PROBE_DEFER, self._runs) ])

    @property
    def retval(self) -> int:
        return self._runs[-1].retval

    @property
    def failed(self) -> bool:
        return self._runs[-1].failed

    @property
    def running(self) -> bool:
        return self._runs[-1].running

    def addStart(self, start_time: int):
        self._runs.append( Run(start_time) )

    def addEnd(self, end_time:int = 0, duration:int = 0, retval:int = 0):
        self._runs[-1].end_time = end_time
        self._runs[-1].duration = duration
        self._runs[-1].retval = retval

    def addRun(self, start_time: int, end_time:int = 0, duration:int = 0, retval:int = 0):
        self._runs.append( Run(start_time, end_time, duration, retval) )


class Initcall (Entity):
    def __init__(self, name: str, start_time: int, module: str = None):
        super().__init__(name, start_time)
        self._module = module

    @property
    def module(self) -> str:
        return self._module

    def addInit(self, init_time: int):
        self.addStart(init_time)

    def addReturn(self, return_time: int, duration: int, retval: int):
        self.addEnd(return_time, duration, retval)


class Probe (Entity):
    def __init__(self, name: str, return_time: int, duration: int, retval: int):
        super().__init__(name, return_time, return_time, duration, retval)

    @property
    def deferred_probe_pending(self) -> bool:
        return (abs(self._runs[-1].retval) == ERRCODE_PROBE_DEFER)

    @property
    def num_deferred_probes(self) -> int:
        return len(list(filter(lambda r: abs(r.retval) == ERRCODE_PROBE_DEFER, self._runs)))


class Init (Entity):
    def __init__(self, name: str, start_time: int):
        super().__init__(name, start_time)

# Regex for Linux version
# "[    0.000000] Linux version 6.12.0 (oe-user@oe-host) (aarch64-poky-linux-gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.42.0.20240723) #1 SMP PREEMPT Sun Nov 17 22:15:08 UTC 2024"
version_sentinel = 'Linux version '
version_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Linux version (.+)')

# Regex for machine model
# "[    0.000000] Machine model: BeagleBoard.org BeaglePlay"
machine_sentinel = 'Machine model: '
machine_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Machine model: (.+)')

# Regex for cmdline
# "[    0.000000] Kernel command line: LABEL=Boot root=PARTUUID=076c4a2a-02 rootfstype=ext4 rootwait log_buf_len=10M initcall_debug quiet"
cmdline_sentinel = 'Kernel command line: '
cmdline_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Kernel command line: (.+)')

# Regex for 'calling' messages
# "[    0.466115] calling  pci_sysfs_init+0x0/0xa8 @ 1"
calling_sentinel = 'calling  '
calling_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] calling  ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+)( \[[a-zA-Z0-9\-_]+\])? @ ([0-9]+)')

# Regex for 'initcall ... returned' messages
# "[    0.466115] initcall pci_sysfs_init+0x0/0xa8 returned 0 after 5 usecs"
returned_sentinel = 'initcall '
returned_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] initcall ([0-9a-zA-Z_]+)\+(0x[0-9a-fA-F]+\/0x[0-9a-fA-F]+)( \[[a-zA-Z0-9\-_]+\])? returned ([\-0-9]+) after ([0-9]+) usecs')

# Regex for 'probe ... returned' messages
# "[    0.466115] probe of cpufreq-dt returned 517 after 140 usec"
probe_sentinel = 'probe of '
probe_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] probe of ([0-9a-zA-Z_\-\.\:@]+) returned ([\-0-9]+) after ([0-9]+) usecs')

# Regex for
# "[    1.060329] Run /sbin/init as init process"
init_sentinel = 'as init process'
init_prog = re.compile(r'\[([0-9\s]+\.[0-9]+)\] Run ([/0-9a-zA-Z_]+) as init process')

initcalls = {}
probes = {}
init = None
version = 'Unknown'
machine = 'Unknown'
cmdline = 'Unknown'

# Extract data from dmesg
lineno = 0
for line in args.dmesg:
    lineno += 1

    if version_sentinel in line:
        match = version_prog.match(line)
        if match:
            version = str(match.group(2))
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as version', file=sys.stderr)

    if machine_sentinel in line:
        match = machine_prog.match(line)
        if match:
            machine = str(match.group(2))
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as machine', file=sys.stderr)

    if cmdline_sentinel in line:
        match = cmdline_prog.match(line)
        if match:
            cmdline = str(match.group(2))
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as cmdline', file=sys.stderr)

    if calling_sentinel in line:
        match = calling_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
                module = str(match.group(4) or '').replace('[', '').replace(']', '')
            except Exception as e:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as call', file=sys.stderr)
                raise e
            else:
                if name not in initcalls.keys():
                    initcalls[name] = Initcall(name, time, module)
                else:
                    initcalls[name].addInit(time)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as call', file=sys.stderr)

    if returned_sentinel in line:
        match = returned_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
                retval = int(match.group(5))
                duration = int(match.group(6))
            except:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as call return', file=sys.stderr)
            else:
                if name not in initcalls.keys():
                    print(f'Detected return for initcall {name}, for which a call was never recorded', file=sys.stderr)
                else:
                    initcalls[name].addReturn(time, duration, retval)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as call return', file=sys.stderr)

    if probe_sentinel in line:
        match = probe_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
                retval = int(match.group(3))
                duration = int(match.group(4))
            except:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as probe return', file=sys.stderr)
            else:
                if name not in probes.keys():
                    probes[name] = Probe(name, time, duration, retval)
                else:
                    probes[name].addRun(time, time, duration, retval)
            finally:
                continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as probe return', file=sys.stderr)

    if not init and init_sentinel in line:
        match = init_prog.match(line)
        if match:
            try:
                time = int(float(match.group(1)) * 1000000.0)
                name = str(match.group(2))
            except:
                print(f'Failed parsing line {lineno}:"{line.rstrip()}" as init', file=sys.stderr)
            else:
                init = Init(name, time)
            finally:
                if args.before_init:
                    break
                else:
                    continue
        else:
            print(f'Failed matching line {lineno}:"{line.rstrip()}" as init', file=sys.stderr)

if len(initcalls) == 0:
    print(f'No initcalls parsed - check your kernel configuration and command line', file=sys.stderr)
    sys.exit(1)

# Print HTML format
if args.html:
    if not args.body_only:
        print(
'''
<!DOCTYPE HTML>
<html>
<head>
    <meta charset="utf-8" />
    <title>Boot time analysis</title>
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style>
        .aid-title {
            font-weight: bold;
            font-size: 2em;
            text-align: center;
            padding: 0.5em;
        }

        .aid-piechart-container {
            width: 30%;
            height: 30%;
            margin: 0 auto;
            padding: 1em;
        }

        table, th, td {
            border: 1px solid;
            border-collapse: collapse;
        }

        table {
            text-align: left;
            overflow: hidden;
            width: 80%;
            margin: 0 auto;
            display: table;
        }

        table th {
            font-weight: bold;
            font-size: 1em;
            text-align: center;
        }

        table td, table th {
            padding: 0.4%;
        }

        .aid-status-ok {
            background-color: #77FF77;
        }

        .aid-status-failed {
            background-color: #FF7777;
        }

        .aid-status-deferred {
            background-color: #FFCC77;
        }

        .aid-status-deferred {
            background-color: #CCCCCC;
        }
    </style>
</head>
<body>''')

    # Identification
    print(
f'''
    <div class="aid-title">Identification</div>
    <table>
        <tr>
            <td>Linux version</td>
            <td>{version}</td>
        </tr>
        <tr>
            <td>Machine</td>
            <td>{machine}</td>
        </tr>
        <tr>
            <td>Cmdline</td>
            <td>{cmdline}</td>
        </tr>
    </table>
''')

    # Initcalls
    initcalls_total_time = sum( [ k.duration for k in initcalls.values() ] )

    print(
'''
    <div class="aid-title">Initcalls</div>
''')


    ## Print initcalls pie chart
    print(
'''
    <div class="aid-piechart-container"><svg viewBox="0 0 100 100">
''')

    startPoint = 0
    visible_limit = max([k.duration for k in list(initcalls.values())]) / 100
    for d in sorted(initcalls.values(), key=lambda k: k.duration, reverse=True):
        if d.duration < visible_limit:
            title = "ALL OTHER INITCALLS"
            length = initcalls_total_time - startPoint
        else:
            title = d.name
            length = d.duration

        startPoint = startPoint + length
        print(
f'''
        <a href="#aid-initcall-{d.name}">
            <circle r="25" cx="50" cy="50" fill="none" stroke="#{d.color[0]:02X}{d.color[1]:02X}{d.color[2]:02X}"
                    stroke-width="50" stroke-dasharray="{length} {initcalls_total_time - length}"
                    stroke-dashoffset="{startPoint}" pathLength="{initcalls_total_time}">
                <title>{title}</title>
            </circle>
        </a>
''')
        if d.duration < visible_limit:
            break

    print(
'''
    </svg></div>
''')

    print(
'''
    <table>
        <tr>
            <th>&nbsp;</th>
            <th>Name</th>
            <th>Status</th>
            <th>Duration (us)</th>
            <th>Time in failed calls (us)</th>
            <th>Fraction of total time (%)</th>
            <th>Module</th>
        </tr>''')

    for d in sorted(initcalls.values(), key=lambda k: k.duration, reverse=True):
        run_status = 'RUNNING' if d.running else 'FAILED' if d.failed else 'OK'
        print(
f'''
        <tr id="aid-initcall-{d.name}">
            <td style="background-color: #{d.color[0]:02X}{d.color[1]:02X}{d.color[2]:02X};"></td>
            <td>{d.name}</td>
            <td class="aid-status-{run_status.lower()}">{run_status}{f' ({abs(d.retval)})' if d.failed else ''}</td>
            <td>{d.duration}</td>
            <td>{d.wasted_time}</td>
            <td>{(d.duration * 100 / initcalls_total_time):0.3f}</td>
            <td>{d.module}</td>
        </tr>''')

    print(
'''
    </table>
''')

    # Probes
    probes_total_time = sum([k.duration for k in probes.values()])

    print(
'''
    <div class="aid-title">Probes</div>
''')

    ## Print probes pie chart
    print(
'''
    <div class="aid-piechart-container"><svg viewBox="0 0 100 100">
''')

    startPoint = 0
    visible_limit = max([k.duration for k in list(initcalls.values())]) / 100
    for d in sorted(probes.values(), key=lambda k: k.duration, reverse=True):
        if d.duration < visible_limit:
            title = "ALL OTHER PROBES"
            length = probes_total_time - startPoint
        else:
            title = d.name
            length = d.duration

        startPoint = startPoint + length
        print(
f'''
        <a href="#aid-initcall-{d.name}">
            <circle r="25" cx="50" cy="50" fill="none" stroke="#{d.color[0]:02X}{d.color[1]:02X}{d.color[2]:02X}"
                    stroke-width="50" stroke-dasharray="{length} {probes_total_time - length}"
                    stroke-dashoffset="{startPoint}" pathLength="{initcalls_total_time}">
                <title>{title}</title>
            </circle>
        </a>
''')
        if d.duration < visible_limit:
            break

    print(
'''
    </svg></div>
''')

    ## Print probes table
    print(
'''
    <table>
        <tr>
            <th>&nbsp;</th>
            <th>Name</th>
            <th>Status</th>
            <th>Duration (us)</th>
            <th>Num deferred probes</th>
            <th>Time in failed probes (us)</th>
            <th>Fraction of total time (%)</th>
            <th>After init</th>
        </tr>
''')

    for d in sorted(probes.values(), key=lambda k: k.duration, reverse=True):
        run_status = 'DEFERRED' if d.deferred_probe_pending else 'RUNNING' if d.running else 'FAILED' if d.failed else 'OK'
        after_init = 'YES' if d.last_start_time > init.last_start_time else 'NO'
        print(
f'''
        <tr id="aid-probe-{d.name}">
            <td style="background-color: #{d.color[0]:02X}{d.color[1]:02X}{d.color[2]:02X};"></td>
            <td>{d.name}</td>
            <td class="aid-status-{run_status.lower()}">{run_status}{f' ({abs(d.retval)})' if d.failed else ''}</td>
            <td>{d.duration}</td>
            <td>{d.num_deferred_probes}</td>
            <td>{d.wasted_time}</td>
            <td>{(d.duration * 100 / probes_total_time):0.3f}</td>
            <td>{after_init}</td>
        </tr>''')

    print(
'''
    </table>
''')

    if not args.body_only:
        print(
'''
</body>
</html>
''')

# Print plain text
else:
    num_before_userspace = len(list(filter(lambda d: d.last_start_time <= init.last_start_time, initcalls.values())))
    num_after_userspace = len(list(filter(lambda d: d.last_start_time > init.last_start_time, initcalls.values())))
    num_deferred_probe_pending = len(list(filter(lambda d: d.deferred_probe_pending, list(probes.values()))))
    num_failed = len(list(filter(lambda d: d.failed, list(initcalls.values()) + list(probes.values()))))

    print(f'Linux version: {version}')
    print(f'Machine: {machine}')
    print(f'Command line: {cmdline}')
    print('Summary:')
    print(f'  {len(initcalls)} initcalls have been executed, of which {num_before_userspace} before userspace and {num_after_userspace} after')
    print(f'  {num_deferred_probe_pending} deferred probes are pending')
    print(f'  {num_failed} initcalls/probes failed')

    print('\n---\n')

    print('Top 10 initcall durations:')
    for d in sorted(initcalls.values(), key=lambda k: k.duration, reverse=True)[0:10]:
        print(f' * {d.name} -> {d.duration}us')

    print('\n---\n')

    print('Top 10 probe durations:')
    for d in sorted(probes.values(), key=lambda k: k.duration, reverse=True)[0:10]:
        print(f' * {d.name} -> {d.duration}us')

    print('\n---\n')

    print('Failed initcalls/probes:')
    for d in filter(lambda k: k.failed, list(initcalls.values()) + list(probes.values())):
        print(f' * {d.name} -> ret = -{abs(d.retval)}')




^ permalink raw reply

* [PATCH] init/main.c: print initcall level when initcall_debug is used
From: Francesco Valla @ 2024-12-29 16:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-embedded, Steven Rostedt, Tim Bird

When initcall_debug is specified on the command line, the start and
return point for each initcall is printed. However, no information on
the initcall level is reported.

Add to the initcall_debug infrastructure an additional print that
informs when a new initcall level is entered. This is particularly
useful when debugging dependency chains and/or working on boot time
reduction.

Signed-off-by: Francesco Valla <francesco@valla.it>
---
 init/main.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/init/main.c b/init/main.c
index 00fac1170294..160c93332536 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1223,6 +1223,12 @@ trace_initcall_finish_cb(void *data, initcall_t fn, int ret)
 		 fn, ret, (unsigned long long)ktime_us_delta(rettime, *calltime));
 }
 
+static __init_or_module void
+trace_initcall_level_cb(void *data, const char *level)
+{
+	printk(KERN_DEBUG "entering %s initcall level\n", level);
+}
+
 static ktime_t initcall_calltime;
 
 #ifdef TRACEPOINTS_ENABLED
@@ -1234,10 +1240,12 @@ static void __init initcall_debug_enable(void)
 					    &initcall_calltime);
 	ret |= register_trace_initcall_finish(trace_initcall_finish_cb,
 					      &initcall_calltime);
+	ret |= register_trace_initcall_level(trace_initcall_level_cb, NULL);
 	WARN(ret, "Failed to register initcall tracepoints\n");
 }
 # define do_trace_initcall_start	trace_initcall_start
 # define do_trace_initcall_finish	trace_initcall_finish
+# define do_trace_initcall_level	trace_initcall_level
 #else
 static inline void do_trace_initcall_start(initcall_t fn)
 {
@@ -1251,6 +1259,12 @@ static inline void do_trace_initcall_finish(initcall_t fn, int ret)
 		return;
 	trace_initcall_finish_cb(&initcall_calltime, fn, ret);
 }
+static inline void do_trace_initcall_level(const char *level)
+{
+	if (!initcall_debug)
+		return;
+	trace_initcall_level_cb(NULL, level);
+}
 #endif /* !TRACEPOINTS_ENABLED */
 
 int __init_or_module do_one_initcall(initcall_t fn)
@@ -1323,7 +1337,7 @@ static void __init do_initcall_level(int level, char *command_line)
 		   level, level,
 		   NULL, ignore_unknown_bootoption);
 
-	trace_initcall_level(initcall_level_names[level]);
+	do_trace_initcall_level(initcall_level_names[level]);
 	for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
 		do_one_initcall(initcall_from_entry(fn));
 }
@@ -1367,7 +1381,7 @@ static void __init do_pre_smp_initcalls(void)
 {
 	initcall_entry_t *fn;
 
-	trace_initcall_level("early");
+	do_trace_initcall_level("early");
 	for (fn = __initcall_start; fn < __initcall0_start; fn++)
 		do_one_initcall(initcall_from_entry(fn));
 }
-- 
2.47.1


^ permalink raw reply related

* Re: [RFC PATCH] boot-time: instrument probes more
From: Francesco Valla @ 2024-12-29 17:45 UTC (permalink / raw)
  To: Linux Embedded, Bird, Tim
In-Reply-To: <26696081.1r3eYUQgxm@fedora>

Hello Tim,

On Saturday, 21 December 2024 15:47:54 CET you wrote:
> Hello,
> 
> On Monday, 9 December 2024 at 04:10:04 Bird, Tim <Tim.Bird@sony.com> wrote:
> > ----
> > Below is the patch that adds the extra instrumentation.  My quick question is, does this look useful?
> > Can it be used to find the more problematic (ie long-duration) probe functions or drivers initializations?
> > 
> 
> Yes, it looks useful to me. While an experienced developer _might_ be able to match
> device names and drivers at a quick glance, that's not the case for everyone, in
> particular on certain SoCs which are modelled at devicetree level with an exceptionally
> high number of "generic" device nodes.
> 
> > ----
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 0c3725c3eefa..5e19079b0a2b 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -28,6 +28,7 @@
> >  #include <linux/pm_runtime.h>
> >  #include <linux/pinctrl/devinfo.h>
> >  #include <linux/slab.h>
> > +#include <linux/kallsyms.h>
> >  
> >  #include "base.h"
> >  #include "power/power.h"
> > @@ -732,8 +733,16 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> >  static int really_probe_debug(struct device *dev, struct device_driver *drv)
> >  {
> >  	ktime_t calltime, rettime;
> > +	char fn_name[KSYM_SYMBOL_LEN];
> > +	unsigned long addr = 0;
> >  	int ret;
> >  
> > +	if (dev->bus->probe)
> > +		addr = (unsigned long)dev->bus->probe;
> > +	else if (drv->probe)
> > +		addr = (unsigned long)drv->probe;
> > +	sprint_symbol_no_offset(fn_name, addr);
> > +
> >  	calltime = ktime_get();
> >  	ret = really_probe(dev, drv);
> >  	rettime = ktime_get();
> > @@ -742,8 +751,8 @@ static int really_probe_debug(struct device *dev, struct device_driver *drv)
> >  	 * CONFIG_DYNAMIC_DEBUG and we want a simple 'initcall_debug' on the
> >  	 * kernel commandline to print this all the time at the debug level.
> >  	 */
> > -	printk(KERN_DEBUG "probe of %s returned %d after %lld usecs\n",
> > -		 dev_name(dev), ret, ktime_us_delta(rettime, calltime));
> > +	printk(KERN_DEBUG "probe %s in driver %s for device %s returned %d after %lld usecs\n",
> > +		 fn_name, drv->name, dev_name(dev), ret, ktime_us_delta(rettime, calltime));
> >  	return ret;
> >  }
> >  
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 76bfcba25003..ae15969c483a 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -32,6 +32,7 @@
> >  #include <linux/types.h>
> >  #include <linux/iommu.h>
> >  #include <linux/dma-map-ops.h>
> > +#include <linux/kallsyms.h>
> >  
> >  #include "base.h"
> >  #include "power/power.h"
> > @@ -1381,6 +1382,9 @@ static int platform_probe(struct device *_dev)
> >  	struct platform_driver *drv = to_platform_driver(_dev->driver);
> >  	struct platform_device *dev = to_platform_device(_dev);
> >  	int ret;
> > +	char fn_name[KSYM_SYMBOL_LEN];
> > +	unsigned long addr = 0;
> > +
> >  
> >  	/*
> >  	 * A driver registered using platform_driver_probe() cannot be bound
> > @@ -1401,6 +1405,13 @@ static int platform_probe(struct device *_dev)
> >  		goto out;
> >  
> >  	if (drv->probe) {
> > +		if (initcall_debug) {
> > +			addr = (unsigned long)drv->probe;
> > +			sprint_symbol_no_offset(fn_name, addr);
> > +
> > +			printk(KERN_DEBUG "platform probe %s is being called\n", fn_name);
> > +		}
> > +
> >  		ret = drv->probe(dev);
> >  		if (ret)
> >  			dev_pm_domain_detach(_dev, true);
> 
> Does this really belong here? Why not add a print inside really_probe_debug(), just
> before the call to really_probe()? That approach would catch all probe types (not just
> platform_probe) while containing the initcall_debug instrumentation inside dd.c
> 

I had some spare time and I tried to write a patch that maintains the format
of the "return" message (because of backward compatibility) while adding the
details you inserted just before the probe in dd.c.

I then used it on the Beagleplay and I can confirm this helps a lot.

Thank you!


Regards,

Francesco

---
 drivers/base/dd.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index f0e4b4aba885..e33ae82064ed 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -28,6 +28,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/pinctrl/devinfo.h>
 #include <linux/slab.h>
+#include <linux/kallsyms.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -732,16 +733,28 @@ static int really_probe(struct device *dev, const struct device_driver *drv)
 static int really_probe_debug(struct device *dev, const struct device_driver *drv)
 {
 	ktime_t calltime, rettime;
+	char fn_name[KSYM_SYMBOL_LEN];
+	unsigned long addr = 0;
 	int ret;
 
-	calltime = ktime_get();
-	ret = really_probe(dev, drv);
-	rettime = ktime_get();
+	if (dev->bus->probe)
+		addr = (unsigned long)dev->bus->probe;
+	else if (drv->probe)
+		addr = (unsigned long)drv->probe;
+	sprint_symbol_no_offset(fn_name, addr);
+
 	/*
-	 * Don't change this to pr_debug() because that requires
+	 * Don't change this or the one below to pr_debug() because that requires
 	 * CONFIG_DYNAMIC_DEBUG and we want a simple 'initcall_debug' on the
 	 * kernel commandline to print this all the time at the debug level.
 	 */
+	printk(KERN_DEBUG "calling probe %s in driver %s for device %s\n",
+	       fn_name, drv->name, dev_name(dev));
+
+	calltime = ktime_get();
+	ret = really_probe(dev, drv);
+	rettime = ktime_get();
+
 	printk(KERN_DEBUG "probe of %s returned %d after %lld usecs\n",
 		 dev_name(dev), ret, ktime_us_delta(rettime, calltime));
 	return ret;
-- 
2.47.1





^ permalink raw reply related

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2024-12-30 22:22 UTC (permalink / raw)
  To: Linux Embedded, Bird, Tim; +Cc: smueller@chronox.de
In-Reply-To: <MW5PR13MB5632D5A1AE2D16F53FA2391FFD362@MW5PR13MB5632.namprd13.prod.outlook.com>

On Tuesday, 3 December 2024 at 22:24:28 Bird, Tim <Tim.Bird@sony.com> wrote:
> > -----Original Message-----
> > From: Francesco Valla <francesco@valla.it>
> > Dear fellow boot time optimizers,
> > 
> > following the first boot time SIG meeting, which I lurked with much pleasure
> > (but didn't participate to, as I was a bit in awe among such authorities), I'd
> > like to introduce myself with code rather than a presentation or resume.
> > 
> > Here is a python script which can analyze a dmesg output with initcall_debug
> > option enabled and extract some useful information. It can for example be used
> > to analyze the output of the grab-boot-data.sh tool that Tim presented on this
> > list [1] just a few days ago.
> > 
> > Usage is very simple, as the output of dmesg can be piped directly to it:
> > 
> >     dmesg | analyze-initcall-debug.py
> > 
> > If no option is specified, it outputs a brief summary, like the following one
> > (obtained on my Beagleplay):
> > 
> > 1758 drivers has been initialized, of which 1758 before userspace
> > 119 probes happened outside of the init of their driver
> > 0 deferred probes pending
> > ---
> > Top 10 init/probes durations:
> >  * 30200000.dss -> 523002us
> >  * deferred_probe_initcall -> 487483us
> >  * fd00000.gpu -> 162859us
> >  * 8000f00.mdio -> 142521us
> >  * 44043000.system-controller -> 71390us
> >  * 2-004c -> 71178us
> >  * 40900000.crypto -> 59350us
> >  * 8000000.ethernet -> 58419us
> >  * 44043000.system-controller:clock-controller -> 56599us
> >  * jent_mod_init -> 52140us
> 
> This is odd.  On my beagleplay board, jent_mod_init only took 32326us.
> (see https://birdcloud.org/boot-time-files/boot-data-timslab-bp1-241203-134136.txt)
> 
> I'm assuming that we have similar hardware, but possibly different configs,
> kernel cmdlines or kernel versions.  I'll take a look at this and see if I can
> figure out what the difference is between our systems, that causes the
> difference in the duration for this function.
> 
> This initcall (jent_mod_init) is on my list of initcalls to research to see if they
> can be improved or deferred.  I haven't seen any degradation in system
> behavior by deferring this initcall on my system, but that could be from
> lack of my system doing some related operation that needs the RNG data, or my
> own ignorance of the effects of this call.
> 
> Can someone comment on what items or operations might depend on jent_mod_init
> in early boot?  In particular, would we expect any cryptographic problems if
> the initcall was deferred to a few seconds after booting?
> 
> It can be configured as a module, which makes me think that maybe loading
> it sometime in very late boot (or even later) is OK.
> 
> jent_mod_init is defined in crypto/jitterentropy-kcapi.c, and controlled by 
> CONFIG_CRYPTO_JITTERENTROPY (See crypto/Kconfig)
> 
> It appears to do some random number generator seeding by measuring
> the jitter of a compressiong operation in the kernel. I assume some of the cryptography
> configs affect the duration of the operations.  The initcall duration results on
> my beagleplay seem to be relatively consistent (in the 32ms range consistently),
> and from bobs_lab, on machine sam1, the duration range is also consistent
> (at between 4800 and 5200 usecs).
> 
> I'd be interested to see if the range is consistent between runs on other machines
> 
> Francesco - Can you submit another boot-data file (or 2 or 3) for francsecoslab-beagleplay, to
> provide some data on this?
> 

I actually lost this request and only found it this morning when reviewing past
emails. My apologies.

Anyway: I just uploaded another boot-data file (of a kernel with some
optimizations), but can confirm that the time spent inside jent_mod_init across
multiple runs appears consistent - or should I say fixed? - at 52ms.

Francesco





^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2024-12-30 22:35 UTC (permalink / raw)
  To: Stephan Mueller, Shankari; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <CAORPcfUfgNnQb6m0baW9qEGnrGYsnbpvwUUmF5Y9Byh9_iMAZw@mail.gmail.com>

Hi Shankari,

On Sunday, 29 December 2024 at 19:52:29 Shankari <beingcap11@gmail.com> wrote:
> Hi
> 
> I wanted to provide an update on my recent contributions to the BeaglePlay
> project. I have recently started contributing and have been analyzing the
> boot time of the init process. Below is the output from the system log:
> 
> debian@BeaglePlay:~$ dmesg | grep "init process"
> [    1.480490] Run /init as init process
> 
> Moving forward, I plan to explore ways to modify the command line and
> further investigate the data used for SIG analysis. This will help me gain
> a deeper understanding of the boot process and its performance
> characteristics.

I started the same journey around a week ago, and it's proving a valuable
experience.

After some trimming, I reached a (reported) init startup time of around 0.6s:

root@beagleplay-ti:~# dmesg|grep /sbin
[    0.630046] Run /sbin/init as init process

The result above was obtained on a slightly modified 6.12.4 kernel with a
custom configuration (I am still thinking about the modifications, as they are
a bit unorthodox, but I can share them as well as the defconfig if you are
interested).
> 
> Please let me know if you have any suggestions or areas where I could focus
> my efforts.

In general, I'd suggest to use a recent kernel, as I observed a consistent
reduction of boot time between e.g. 6.6.x and 6.12.x.

> 
> Best regards,
> 
> Shankari
>

Regards,

Francesco






^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Stephan Müller @ 2024-12-31  9:55 UTC (permalink / raw)
  To: Shankari, Francesco Valla; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <22372644.4csPzL39Zc@fedora.fritz.box>

Am Montag, 30. Dezember 2024, 23:35:03 Mitteleuropäische Normalzeit schrieb 
Francesco Valla:

Hi Francesco,

> Hi Shankari,
> 
> On Sunday, 29 December 2024 at 19:52:29 Shankari <beingcap11@gmail.com> 
wrote:
> > Hi
> > 
> > I wanted to provide an update on my recent contributions to the BeaglePlay
> > project. I have recently started contributing and have been analyzing the
> > boot time of the init process. Below is the output from the system log:
> > 
> > debian@BeaglePlay:~$ dmesg | grep "init process"
> > [    1.480490] Run /init as init process
> > 
> > Moving forward, I plan to explore ways to modify the command line and
> > further investigate the data used for SIG analysis. This will help me gain
> > a deeper understanding of the boot process and its performance
> > characteristics.
> 
> I started the same journey around a week ago, and it's proving a valuable
> experience.
> 
> After some trimming, I reached a (reported) init startup time of around
> 0.6s:
> 
> root@beagleplay-ti:~# dmesg|grep /sbin
> [    0.630046] Run /sbin/init as init process
> 
> The result above was obtained on a slightly modified 6.12.4 kernel with a
> custom configuration (I am still thinking about the modifications, as they
> are a bit unorthodox, but I can share them as well as the defconfig if you
> are interested).
> 
> > Please let me know if you have any suggestions or areas where I could
> > focus
> > my efforts.
> 
> In general, I'd suggest to use a recent kernel, as I observed a consistent
> reduction of boot time between e.g. 6.6.x and 6.12.x.

I am now partially in this discussion - what exactly are you trying to 
achieve, or rather is there anything wrong with the current system? I am 
trying to see whether there is something on my side to work on.
> 
> > Best regards,
> > 
> > Shankari
> 
> Regards,
> 
> Francesco


Ciao
Stephan



^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2024-12-31 14:42 UTC (permalink / raw)
  To: Shankari, Stephan Müller; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <26792960.1r3eYUQgxm@positron.chronox.de>

Hi Stephan,

On Tuesday, 31 December 2024 at 10:55:06 Stephan Müller <smueller@chronox.de> wrote:
> Am Montag, 30. Dezember 2024, 23:35:03 Mitteleuropäische Normalzeit schrieb 
> Francesco Valla:
> 
> Hi Francesco,
> 
> > Hi Shankari,
> > 
> > On Sunday, 29 December 2024 at 19:52:29 Shankari <beingcap11@gmail.com> 
> wrote:
> > > Hi
> > > 
> > > I wanted to provide an update on my recent contributions to the BeaglePlay
> > > project. I have recently started contributing and have been analyzing the
> > > boot time of the init process. Below is the output from the system log:
> > > 
> > > debian@BeaglePlay:~$ dmesg | grep "init process"
> > > [    1.480490] Run /init as init process
> > > 
> > > Moving forward, I plan to explore ways to modify the command line and
> > > further investigate the data used for SIG analysis. This will help me gain
> > > a deeper understanding of the boot process and its performance
> > > characteristics.
> > 
> > I started the same journey around a week ago, and it's proving a valuable
> > experience.
> > 
> > After some trimming, I reached a (reported) init startup time of around
> > 0.6s:
> > 
> > root@beagleplay-ti:~# dmesg|grep /sbin
> > [    0.630046] Run /sbin/init as init process
> > 
> > The result above was obtained on a slightly modified 6.12.4 kernel with a
> > custom configuration (I am still thinking about the modifications, as they
> > are a bit unorthodox, but I can share them as well as the defconfig if you
> > are interested).
> > 
> > > Please let me know if you have any suggestions or areas where I could
> > > focus
> > > my efforts.
> > 
> > In general, I'd suggest to use a recent kernel, as I observed a consistent
> > reduction of boot time between e.g. 6.6.x and 6.12.x.
> 
> I am now partially in this discussion - what exactly are you trying to 
> achieve, or rather is there anything wrong with the current system? I am 
> trying to see whether there is something on my side to work on.
>

The ultimate goal here is reducing the boot time to the minimum possible
without removing functionalities. The purpose is educational, but on the long
term it will probably be beneficial to my work too (as I work in the automotive
industry, where each millisecond counts).

There isn't anything inherently wrong, at the moment I am trying to determine
what is real initialization time and what instead is an actual waste of time
(e.g.: excessive probe deferral). The aforementioned jent_mod_init (on which
Tim I think is working on some level) is one of the longest single inits, but
to be honest I don't know much about all the entropy stuff.

Wishing you the best for the new year,

Francesco





^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Stephan Mueller @ 2025-01-02  9:06 UTC (permalink / raw)
  To: Shankari, Francesco Valla; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <2285043.vFx2qVVIhK@fedora.fritz.box>

Am Dienstag, 31. Dezember 2024, 15:42:16 CET schrieb Francesco Valla:

Hi Francesco,

> The ultimate goal here is reducing the boot time to the minimum possible
> without removing functionalities. The purpose is educational, but on the
> long term it will probably be beneficial to my work too (as I work in the
> automotive industry, where each millisecond counts).
> 
> There isn't anything inherently wrong, at the moment I am trying to
> determine what is real initialization time and what instead is an actual
> waste of time (e.g.: excessive probe deferral). The aforementioned
> jent_mod_init (on which Tim I think is working on some level) is one of the
> longest single inits, but to be honest I don't know much about all the
> entropy stuff.
> 
> Wishing you the best for the new year,

Same to you!

Thanks for your answer. With that, if you desire it, I can try to help you 
lowering the boot time at least in light of the Jitter RNG. This, however, 
requires a bit of analysis which you need to help me with as I do not have a 
board you discuss. But the Jitter RNG has some turning knobs that can be used.

Ciao
Stephan



^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2025-01-02 10:33 UTC (permalink / raw)
  To: Shankari, Stephan Mueller; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <3346275.44csPzL39Z@tauon>

Hi Stephan,

On Thursday, 2 January 2025 at 10:06:25 Stephan Mueller <smueller@chronox.de> wrote:
> Am Dienstag, 31. Dezember 2024, 15:42:16 CET schrieb Francesco Valla:
> 
> Hi Francesco,
> 
> > The ultimate goal here is reducing the boot time to the minimum possible
> > without removing functionalities. The purpose is educational, but on the
> > long term it will probably be beneficial to my work too (as I work in the
> > automotive industry, where each millisecond counts).
> > 
> > There isn't anything inherently wrong, at the moment I am trying to
> > determine what is real initialization time and what instead is an actual
> > waste of time (e.g.: excessive probe deferral). The aforementioned
> > jent_mod_init (on which Tim I think is working on some level) is one of the
> > longest single inits, but to be honest I don't know much about all the
> > entropy stuff.
> > 
> > Wishing you the best for the new year,
> 
> Same to you!
> 
> Thanks for your answer. With that, if you desire it, I can try to help you 
> lowering the boot time at least in light of the Jitter RNG. This, however, 
> requires a bit of analysis which you need to help me with as I do not have a 
> board you discuss. But the Jitter RNG has some turning knobs that can be used.
> 
> Ciao
> Stephan
> 

That would be wonderful! Whenever you have the time, please let me know what
analysis you need.

Regards,
Francesco





^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Stephan Mueller @ 2025-01-02 12:56 UTC (permalink / raw)
  To: Shankari, Francesco Valla; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <2736774.vuYhMxLoTh@fedora.fritz.box>

Am Donnerstag, 2. Januar 2025, 11:33:08 CET schrieb Francesco Valla:

Hi Francesco,

> That would be wonderful! Whenever you have the time, please let me know what
> analysis you need.
> 

Ok, some background: the Jitter RNG technically has 2 noise sources which are 
sampled concurrently:

1. variations of the execution of CPU instructions

2. variations of memory access times

For (1) the Jitter RNG has a fixed set of instructions it performs the 
execution time measurements for: the SHA-3 conditioning operation 
(specifically the Keccak sponge function). For that, it performs a given set 
of Keccak operations.

For (2) the Jitter RNG allocates a fixed set of memory and simply reads / 
writes data there and measures this timing.


For (1), the more instructions are samples, the higher the entropy gathered. 
This means more time is required to sample that entropy. I.e. when you 
increase the number of measured Keccak operations, you get more entropy.

For (2), when the memory is large enough to "spill over" into the next type of 
memory (from L1 to L2 to L3 to RAM), the higher the entropy gathered.


So, for (2), to get more entropy, is invariant from the execution time. But 
for (1), the entropy rate depends on the execution time.


Thus, what you want is to try to reduce the time spend for (1).


The key now is that the overall entropy rate the Jitter RNG requires for 
functioning must be such that when gathering 256 bits of data from it, it 
contains 256 bits of entropy.


Now, there are 2 "knobs" to turn via Kconfig:

- the oversampling rate (OSR): given that the individual number of rounds for 
(1) and the number of accesses for (2) are kept the same, the OSR causes the 
Jitter RNG to multiply the round counts. For example, the baseline with OSR == 
1 is that for gathering 256 bits of entropy, 256 times both noise sources are 
sampled. For an OSR of, say, 3, to get 256 bits of entropy, 3 * 256 = 768 
times both noise sources are sampled. This value was changed from 1 to 3 for 
6.11 because there were reports on some CPUs that the Jitter RNG did not 
produce sufficient entropy - most CPUs, however, can perfectly live with OSR 
== 1.

- the amount of memory for (2) can be increased. The default is 2kBytes which 
usually means that the L1 can fully handle the accesses.


======


Now given the description, what can you do? I would apply the following steps:

1. measure whether the timer your system has is really a high-res timer (the 
higher the resolution, the higher the detected variations and thus the 
entropy)

2. assuming that test 1 shows a high res timer, reduce the OSR back to 1 
(CRYPTO_JITTERENTROPY_OSR) and measure the entropy rate - 

3. if test 2 shows insufficient entropy, increase the amount of memory 
(CRYPTO_JITTERENTROPY_MEMSIZE_*) and measure the entropy rate



The tool for measuring the entropy rate is given in [1] - check the README as 
you need to enable a kernel config option to get an interface into the Jitter 
RNG from user space. As you may not have the analysis tool, you may give the 
data to me and I can analyze it.


More details on tuning the Jitter RNG is given in [2] - it discusses to the 
user space variant, but applies to kernel as well.

[1] https://github.com/smuellerDD/jitterentropy-library/tree/master/tests/raw-entropy/recording_runtime_kernelspace

[2] https://github.com/smuellerDD/jitterentropy-library/tree/master/tests/raw-entropy#approach-to-solve-insufficient-entropy

Ciao
Stephan



^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Francesco Valla @ 2025-01-03 17:23 UTC (permalink / raw)
  To: Shankari, Stephan Mueller; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <2972828.e9J7NaK4W3@tauon>

Hi Stephan,

On Thursday, 2 January 2025 at 13:56:59 Stephan Mueller <smueller@chronox.de> wrote:
> Am Donnerstag, 2. Januar 2025, 11:33:08 CET schrieb Francesco Valla:
> 
> Hi Francesco,
> 
> > That would be wonderful! Whenever you have the time, please let me know what
> > analysis you need.
> > 
> 
> Ok, some background: the Jitter RNG technically has 2 noise sources which are 
> sampled concurrently:
> 
> 1. variations of the execution of CPU instructions
> 
> 2. variations of memory access times
> 
> For (1) the Jitter RNG has a fixed set of instructions it performs the 
> execution time measurements for: the SHA-3 conditioning operation 
> (specifically the Keccak sponge function). For that, it performs a given set 
> of Keccak operations.
> 
> For (2) the Jitter RNG allocates a fixed set of memory and simply reads / 
> writes data there and measures this timing.
> 
> 
> For (1), the more instructions are samples, the higher the entropy gathered. 
> This means more time is required to sample that entropy. I.e. when you 
> increase the number of measured Keccak operations, you get more entropy.
> 
> For (2), when the memory is large enough to "spill over" into the next type of 
> memory (from L1 to L2 to L3 to RAM), the higher the entropy gathered.
> 
> 
> So, for (2), to get more entropy, is invariant from the execution time. But 
> for (1), the entropy rate depends on the execution time.
> 
> 
> Thus, what you want is to try to reduce the time spend for (1).
> 
> 
> The key now is that the overall entropy rate the Jitter RNG requires for 
> functioning must be such that when gathering 256 bits of data from it, it 
> contains 256 bits of entropy.
> 
> 
> Now, there are 2 "knobs" to turn via Kconfig:
> 
> - the oversampling rate (OSR): given that the individual number of rounds for 
> (1) and the number of accesses for (2) are kept the same, the OSR causes the 
> Jitter RNG to multiply the round counts. For example, the baseline with OSR == 
> 1 is that for gathering 256 bits of entropy, 256 times both noise sources are 
> sampled. For an OSR of, say, 3, to get 256 bits of entropy, 3 * 256 = 768 
> times both noise sources are sampled. This value was changed from 1 to 3 for 
> 6.11 because there were reports on some CPUs that the Jitter RNG did not 
> produce sufficient entropy - most CPUs, however, can perfectly live with OSR 
> == 1.
> 
> - the amount of memory for (2) can be increased. The default is 2kBytes which 
> usually means that the L1 can fully handle the accesses.
> 
> 

This was a very instructive read. I didn't (yet?) have the time to dive deep
into all the crypto "backbone", but I think I understood in broad terms.

Thank you!

> ======
> 
> 
> Now given the description, what can you do? I would apply the following steps:
> 
> 1. measure whether the timer your system has is really a high-res timer (the 
> higher the resolution, the higher the detected variations and thus the 
> entropy)
> 

Resolution reported by clock_getres() is 1ns. Is this sufficient?

> 2. assuming that test 1 shows a high res timer, reduce the OSR back to 1 
> (CRYPTO_JITTERENTROPY_OSR) and measure the entropy rate - 
> 

Turned out my system already had the OSR set to 1, since CONFIG_CRYPTO_FIPS was
set to N.

> 3. if test 2 shows insufficient entropy, increase the amount of memory 
> (CRYPTO_JITTERENTROPY_MEMSIZE_*) and measure the entropy rate
> 
> 
> 
> The tool for measuring the entropy rate is given in [1] - check the README as 
> you need to enable a kernel config option to get an interface into the Jitter 
> RNG from user space. As you may not have the analysis tool, you may give the 
> data to me and I can analyze it.
> 

Here are the results (with default parameters for processdata.sh:

Opening file: '../results-analysis-runtime/raw_noise.0Fbitout.data' (SHA-256 hash 079f14467f309991a0bcaac2b667283ff1f19eba0ddf1fbd6b34e55a6d94ed87)
Loaded 1000000 samples of 16 distinct 4-bit-wide symbols
Number of Binary Symbols: 4000000

Running non-IID tests...

Running Most Common Value Estimate...
Bitstring MCV Estimate: mode = 2027350, p-hat = 0.50683750000000005, p_u = 0.50748139719173813
        Most Common Value Estimate (bit string) = 0.978573 / 1 bit(s)
Literal MCV Estimate: mode = 68652, p-hat = 0.068652000000000005, p_u = 0.069303328207377918
        Most Common Value Estimate = 3.850932 / 4 bit(s)

Running Entropic Statistic Estimates (bit strings only)...
Bitstring Collision Estimate: X-bar = 2.4981538638330933, sigma-hat = 0.49999674790330145, p = 0.53784137679559285
        Collision Test Estimate (bit string) = 0.894747 / 1 bit(s)
Bitstring Markov Estimate: P_0 = 0.50683750000000005, P_1 = 0.49316249999999995, P_0,0 = 0.50927960145016893, P_0,1 = 0.49072039854983107, P_1,0 = 0.50432742976576173, P_1,1 = 0.49567257023423827, p_max = 3.0783392443248932e-38
        Markov Test Estimate (bit string) = 0.973524 / 1 bit(s)
Bitstring Compression Estimate: X-bar = 5.2062336468660613, sigma-hat = 1.0200139851715606, p = 0.037082666137132736
        Compression Test Estimate (bit string) = 0.792185 / 1 bit(s)

Running Tuple Estimates...
Bitstring t-Tuple Estimate: t = 18, p-hat_max = 0.5303003556530639666278, p_u = 0.5309431295223588315809
Bitstring LRS Estimate: u = 19, v = 43, p-hat = 0.50389349457067132, p_u = 0.50453743245287026
        T-Tuple Test Estimate (bit string) = 0.913371 / 1 bit(s)
Literal t-Tuple Estimate: t = 4, p-hat_max = 0.0814448250695374631388, p_u = 0.08214935860193944897723
Literal LRS Estimate: u = 5, v = 9, p-hat = 0.063061514307713893, p_u = 0.063687630641525933
        T-Tuple Test Estimate = 3.605607 / 4 bit(s)
        LRS Test Estimate (bit string) = 0.986967 / 1 bit(s)
        LRS Test Estimate = 3.972843 / 4 bit(s)

Running Predictor Estimates...
Bitstring MultiMCW Prediction Estimate: N = 3999937, Pglobal' = 0.50734663518092882 (C = 2026779) Plocal can't affect result (r = 25)
        Multi Most Common in Window (MultiMCW) Prediction Test Estimate (bit string) = 0.978956 / 1 bit(s)
Literal MultiMCW Prediction Estimate: N = 999937, Pglobal' = 0.071847840262787413 (C = 71181) Plocal can't affect result (r = 6)
        Multi Most Common in Window (MultiMCW) Prediction Test Estimate = 3.798911 / 4 bit(s)
Bitstring Lag Prediction Estimate: N = 3999999, Pglobal' = 0.50760302185063 (C = 2027836) Plocal can't affect result (r = 23)
        Lag Prediction Test Estimate (bit string) = 0.978227 / 1 bit(s)
Literal Lag Prediction Estimate: N = 999999, Pglobal' = 0.07177509499573978 (C = 71113) Plocal can't affect result (r = 6)
        Lag Prediction Test Estimate = 3.800373 / 4 bit(s)
Bitstring MultiMMC Prediction Estimate: N = 3999998, Pglobal' = 0.50910236964447519 (C = 2033833) Plocal can't affect result (r = 23)
        Multi Markov Model with Counting (MultiMMC) Prediction Test Estimate (bit string) = 0.973972 / 1 bit(s)
Literal MultiMMC Prediction Estimate: N = 999998, Pglobal' = 0.06979561605633619 (C = 69142) Plocal can't affect result (r = 6)
        Multi Markov Model with Counting (MultiMMC) Prediction Test Estimate = 3.840720 / 4 bit(s)
Bitstring LZ78Y Prediction Estimate: N = 3999983, Pglobal' = 0.50745980290740611 (C = 2027255) Plocal can't affect result (r = 25)
        LZ78Y Prediction Test Estimate (bit string) = 0.978635 / 1 bit(s)
Literal LZ78Y Prediction Estimate: N = 999983, Pglobal' = 0.068264919012970812 (C = 67617) Plocal can't affect result (r = 5)
        LZ78Y Prediction Test Estimate = 3.872712 / 4 bit(s)

H_original: 3.605607
H_bitstring: 0.792185
min(H_original, 4 X H_bitstring): 3.168741

---

Opening file: '../results-analysis-runtime/raw_noise.FFbitout.data' (SHA-256 hash 670cc4f908a3fa430112387c7490280b801fbc3e0d9b2eb5725a7382f1726353)
Loaded 1000000 samples of 256 distinct 8-bit-wide symbols
Number of Binary Symbols: 8000000

Running non-IID tests...

Running Most Common Value Estimate...
Bitstring MCV Estimate: mode = 4293572, p-hat = 0.53669650000000002, p_u = 0.53715061859478275
        Most Common Value Estimate (bit string) = 0.896601 / 1 bit(s)
Literal MCV Estimate: mode = 25394, p-hat = 0.025394, p_u = 0.025799226085884197
        Most Common Value Estimate = 5.276528 / 8 bit(s)

Running Entropic Statistic Estimates (bit strings only)...
Bitstring Collision Estimate: X-bar = 2.4368621131861672, sigma-hat = 0.4959976634888219, p = 0.67866590835207607
        Collision Test Estimate (bit string) = 0.559227 / 1 bit(s)
Bitstring Markov Estimate: P_0 = 0.53669650000000002, P_1 = 0.46330349999999998, P_0,0 = 0.58912975024059222, P_0,1 = 0.41087024975940778, P_1,0 = 0.47595730335441655, P_1,1 = 0.52404269664558345, p_max = 3.5198162547162731e-30
        Markov Test Estimate (bit string) = 0.764393 / 1 bit(s)
Bitstring Compression Estimate: X-bar = 5.1060913378184907, sigma-hat = 1.0602301402207321, p = 0.089700342434675784
        Compression Test Estimate (bit string) = 0.579790 / 1 bit(s)

Running Tuple Estimates...
Bitstring t-Tuple Estimate: t = 24, p-hat_max = 0.6077535669643823323526, p_u = 0.608198213970942755998
Bitstring LRS Estimate: u = 25, v = 55, p-hat = 0.57377327555229816, p_u = 0.57422363845508301
        T-Tuple Test Estimate (bit string) = 0.717387 / 1 bit(s)
Literal t-Tuple Estimate: t = 2, p-hat_max = 0.02539400000000000000072, p_u = 0.02579922608588419654918
Literal LRS Estimate: u = 3, v = 6, p-hat = 0.014678019585973216, p_u = 0.014987790126594754
        T-Tuple Test Estimate = 5.276528 / 8 bit(s)
        LRS Test Estimate (bit string) = 0.800315 / 1 bit(s)
        LRS Test Estimate = 6.060069 / 8 bit(s)

Running Predictor Estimates...
Bitstring MultiMCW Prediction Estimate: N = 7999937, Pglobal' = 0.53749557635492473 (C = 4296298) Plocal can't affect result (r = 26)
        Multi Most Common in Window (MultiMCW) Prediction Test Estimate (bit string) = 0.895675 / 1 bit(s)
Literal MultiMCW Prediction Estimate: N = 999937, Pglobal' = 0.031150330241116859 (C = 30704) Plocal can't affect result (r = 4)
        Multi Most Common in Window (MultiMCW) Prediction Test Estimate = 5.004609 / 8 bit(s)
Bitstring Lag Prediction Estimate: N = 7999999, Pglobal' = 0.55941498930539801 (C = 4471702) Plocal can't affect result (r = 30)
        Lag Prediction Test Estimate (bit string) = 0.838009 / 1 bit(s)
Literal Lag Prediction Estimate: N = 999999, Pglobal' = 0.024834737018808978 (C = 24437) Plocal can't affect result (r = 4)
        Lag Prediction Test Estimate = 5.331497 / 8 bit(s)
Bitstring MultiMMC Prediction Estimate: N = 7999998, Pglobal' = 0.56384793754105667 (C = 4507169) Plocal can't affect result (r = 30)
        Multi Markov Model with Counting (MultiMMC) Prediction Test Estimate (bit string) = 0.826622 / 1 bit(s)
Literal MultiMMC Prediction Estimate: N = 999998, Pglobal' = 0.026034084289744124 (C = 25627) Plocal can't affect result (r = 4)
        Multi Markov Model with Counting (MultiMMC) Prediction Test Estimate = 5.263455 / 8 bit(s)
Bitstring LZ78Y Prediction Estimate: N = 7999983, Pglobal' = 0.55942260687547074 (C = 4471754) Plocal can't affect result (r = 30)
        LZ78Y Prediction Test Estimate (bit string) = 0.837990 / 1 bit(s)
Literal LZ78Y Prediction Estimate: N = 999983, Pglobal' = 0.026039513473967817 (C = 25632) Plocal can't affect result (r = 4)
        LZ78Y Prediction Test Estimate = 5.263154 / 8 bit(s)

H_original: 5.004609
H_bitstring: 0.559227
min(H_original, 8 X H_bitstring): 4.473812

---

> 
> More details on tuning the Jitter RNG is given in [2] - it discusses to the 
> user space variant, but applies to kernel as well.
> 
> [1] https://github.com/smuellerDD/jitterentropy-library/tree/master/tests/raw-entropy/recording_runtime_kernelspace
> 
> [2] https://github.com/smuellerDD/jitterentropy-library/tree/master/tests/raw-entropy#approach-to-solve-insufficient-entropy
> 
> Ciao
> Stephan
> 

Thank you again!

Francesco





^ permalink raw reply

* Re: [boot-time] jent_mod_init on beagleplay (was RE: [boot-time] [RFC] analyze-initcall-debug.py - a tool to analyze the initcall debug output
From: Stephan Müller @ 2025-01-04 15:45 UTC (permalink / raw)
  To: Shankari, Francesco Valla; +Cc: Linux Embedded, Bird, Tim
In-Reply-To: <3506889.LZWGnKmheA@fedora.fritz.box>

Am Freitag, 3. Januar 2025, 18:23:19 Mitteleuropäische Normalzeit schrieb 
Francesco Valla:

Hi Francesco,

> > ======
> > 
> > 
> > Now given the description, what can you do? I would apply the following
> > steps:
> > 
> > 1. measure whether the timer your system has is really a high-res timer
> > (the higher the resolution, the higher the detected variations and thus
> > the entropy)
> 
> Resolution reported by clock_getres() is 1ns. Is this sufficient?
> 
It should, but it is in relationship to the complexity of the CPU itself.

> > 2. assuming that test 1 shows a high res timer, reduce the OSR back to 1
> > (CRYPTO_JITTERENTROPY_OSR) and measure the entropy rate -
> 
> Turned out my system already had the OSR set to 1, since CONFIG_CRYPTO_FIPS
> was set to N.
> 
> > 3. if test 2 shows insufficient entropy, increase the amount of memory
> > (CRYPTO_JITTERENTROPY_MEMSIZE_*) and measure the entropy rate
> > 
> > 
> > 
> > The tool for measuring the entropy rate is given in [1] - check the README
> > as you need to enable a kernel config option to get an interface into the
> > Jitter RNG from user space. As you may not have the analysis tool, you
> > may give the data to me and I can analyze it.
> 
> Here are the results (with default parameters for processdata.sh:
> 
...

> min(H_original, 4 X H_bitstring): 3.168741

That last value, the min-value, is the key: it must be larger than 1/OSR - 
when it gets close to 1/OSR, the health tests start to flag errors once in a 
while (with a probability rate of around 2**-30 time stamps). So, you have OSR 
set to 1 which is already the lowest value supported by the Jitter RNG. Thus, 
there is unfortunately not much more you can do to increase the performance 
during boot time.

I thought OSR was set to 3 in your environment.

> min(H_original, 8 X H_bitstring): 4.473812

Ciao
Stephan



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox