RE: [boot-time]

linux-embedded.vger.kernel.org archive mirror

* RE: [boot-time]
       [not found] <CAORPcfVRobA+u5q7aPboC=3iY8dibDUB0920Z=Z0VgpQEupKJw@mail.gmail.com>
@ 2025-01-08 18:33 ` Bird, Tim
  2025-01-08 20:39   ` [boot-time] Marko Hoyer
                     ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Bird, Tim @ 2025-01-08 18:33 UTC (permalink / raw)
  To: Shankari, linux-embedded@vger.kernel.org

> -----Original Message-----
> From: Shankari <beingcap11@gmail.com>
> Hi
> 
> I wanted to provide an update on my recent contributions to the boot-time reduction project. I have recently started contributing and
> am working with the beagleplay. I have been analyzing the boot time of the init process. Below is the output from the system log:
> 
> debian@BeaglePlay:~$ dmesg | grep "init process"
> [    1.480490] Run /init as init process
> 
> Moving forward, I plan to explore ways to modify the command line and further investigate the data used for SIG analysis. This will
> help me gain a deeper understanding of the boot process and its performance characteristics.
> 
> Please let me know if you have any suggestions or areas where I could focus my efforts.

Hi Shankari,

It sounds like you are off to a good start.  I have something that needs to be done that I think
you can help with, and that matches where I believe you currently are in your ability
to evaluate the kernel.

In general, there's a lot of information on the elinux wiki which is stale, which needs to be
updated or archived, or maybe even just removed.

This section of the Boot Time page has a lot of material in this category:
https://elinux.org/Boot_Time#kernel_speedups

Can you validate the information on these 2 pages:
 * https://elinux.org/Disable_Console
 * https://elinux.org/Preset_LPJ

This would consist of reading through the material, and testing the
described techniques on your machine.  This will involve booting the
machine 2 ways, with a particular kernel command line option and without
it, and then reporting back the final boot time for both.  You can use
the timestamp for the "init process" string as your final boot time, for the
purposes of this exercise.
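
A minimal sketch of how that comparison could be scripted, assuming each boot's
dmesg output has been saved as text. The helper names (`init_timestamp`,
`saved_seconds`) are hypothetical, and the two sample log lines are rebuilt
from timestamps reported in this thread rather than copied from real logs.

```python
import re

# Matches a dmesg line such as "[    1.480490] Run /init as init process".
INIT_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+Run \S+ as init process", re.MULTILINE
)


def init_timestamp(dmesg_text: str) -> float:
    """Return the kernel timestamp (seconds) of the first 'init process' line."""
    match = INIT_LINE.search(dmesg_text)
    if match is None:
        raise ValueError("no 'Run ... as init process' line found")
    return float(match.group(1))


def saved_seconds(without_option: str, with_option: str) -> float:
    """Boot time saved by the option; a negative result means it got slower."""
    return init_timestamp(without_option) - init_timestamp(with_option)


# Sample lines rebuilt from numbers in this thread (not verbatim log captures).
boot_without_quiet = "[    3.011701] Run /sbin/init as init process\n"
boot_with_quiet = "[    1.714458] Run /sbin/init as init process\n"

print(f"saved: {saved_seconds(boot_without_quiet, boot_with_quiet):.3f} s")
```

Reporting both raw timestamps alongside the difference keeps the result easy
to compare with numbers from other boards.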

Helping me to update the elinux wiki material on boot time would be
an immense help, and is one of my main goals for the boot time SIG in 2025.

Don't hesitate to ask questions if you have any.

BTW - you can just report your findings to me and linux-embedded list, but
alternatively (and even better) would be if you could also update the wiki
pages themselves with your information based on recent kernels and hardware.
To do this, you will need an elinux wiki account, which you can create online
by going to this page: https://elinux.org/Special:CreateAccount

Anyone else reading this who wants to also participate in this project to
update the elinux wiki boot time information, please contact me.

Thanks.
  -- Tim


* Re: [boot-time]
  2025-01-08 18:33 ` [boot-time] Bird, Tim
@ 2025-01-08 20:39   ` Marko Hoyer
  2025-01-08 21:19     ` [boot-time] Bird, Tim
  2025-01-08 23:00   ` [boot-time] Rob Landley
  2025-01-10 22:46   ` [boot-time] Marko Hoyer
  2 siblings, 1 reply; 21+ messages in thread
From: Marko Hoyer @ 2025-01-08 20:39 UTC (permalink / raw)
  To: Bird, Tim, Shankari, linux-embedded@vger.kernel.org


On 08.01.25 at 19:33, Bird, Tim wrote:
>> -----Original Message-----
>> From: Shankari <beingcap11@gmail.com>
>> Hi
>>
>> I wanted to provide an update on my recent contributions to the boot-time reduction project. I have recently started contributing and
>> am working with the beagleplay. I have been analyzing the boot time of the init process. Below is the output from the system log:
>>
>> debian@BeaglePlay:~$ dmesg | grep "init process"
>> [    1.480490] Run /init as init process
>>
>> Moving forward, I plan to explore ways to modify the command line and further investigate the data used for SIG analysis. This will
>> help me gain a deeper understanding of the boot process and its performance characteristics.
>>
>> Please let me know if you have any suggestions or areas where I could focus my efforts.
> Hi Shankari,
>
> It sounds like you are off to a good start.  I have something that needs to be done, that I think
> you can help with, and that matches where I believe you are in your status with being able
> to evaluate the kernel.
>
> In general, there's a lot of information on the elinux wiki which is stale, which needs to be
> updated or archived, or maybe even just removed.
>
> This section of the Boot Time page has a lot of material in this category:
> https://elinux.org/Boot_Time#kernel_speedups
>
> Can you validate the information on these 2 pages:
>   * https://elinux.org/Disable_Console
>   * https://elinux.org/Preset_LPJ
>
> This would consist of reading through the material, and testing the
> described techniques on your machine.  This will involve booting the
> machine 2 ways, with a particular kernel command line option and without
> it, and then reporting back the final boot time for both.  You can use
> the timestamp for the "init process" string as your final boot time, for the
> purposes of this exercise.
>
> Helping me to update the elinux wiki material on boot time would be
> an immense help, and is one of my main goals for the boot time SIG in 2025.
>
> Don't hesitate to ask questions if you have any.
>
> BTW - you can just report your findings to me and linux-embedded list, but
> alternatively (and even better) would be if you could also update the wiki
> pages themselves with your information based on recent kernels and hardware.
> To do this, you will need an elinux wiki account, which you can make online on
> elinux wiki.org by going to this page: https://elinux.org/Special:CreateAccount
>
> Anyone else reading this who wants to also participate in this project to
> update the elinux wiki boot time information, please contact me.
> Thanks.
>    -- Tim
>
Hi Tim, all,

first time I'm posting here so hopefully everything is fine w/ my mail 
format / attachment and so on ... If not, please give me some feedback 
and guidance.


On the "disable console" topic: I have some numbers for an RPi 
Zero W; find dmesg dumps and systemd-analyze plots attached.


Environment:

- RPi Zero W, kernel 5.15.24, systemd 247.3, customized debian

- onboard UART used


Cases:

- #1 quiet: cmdline w/ quiet, no kernel or userspace output up to the 
serial login console

- #2 normal: cmdline w/o quiet, serial console @115200 baud

- #3 normal_baud9600: cmdline w/o quiet, serial console @9600 baud


Main outcomes:

- kernel timestamps "Run /sbin/init as init process"

#1: "1.714458", #2: "3.011701", #3: "16.108101"

Interpretation:

* an enabled serial console has a significant impact on kernel boot time

* reducing the baud rate to 9600 induced some side effect, not sure what it is ...

-  systemd startup

* systemd drops 2 log lines per started unit to the console

* seems serial output is not implemented asynchronously (see the steps of 
units in the sd plot: ~10ms per unit w/ baud 115200, ~80ms per unit w/ baud 9600)


Side notes:

* I remember similar behavior w/ imx.6 SoCs

* Maybe this issue is not seen on other SoCs (maybe w/ another hw 
implementation of the UART)

* Maybe this issue is only seen on single-core machines (I can double 
check w/ a Pi 3 or Orange Pi Zero)

Hope this helps.

Regards,

Marko






[-- Attachment #2: fastboot_disable-console_rpi-zero-w.tgz --]
[-- Type: application/x-compressed-tar, Size: 25145 bytes --]


* RE: [boot-time]
  2025-01-08 20:39   ` [boot-time] Marko Hoyer
@ 2025-01-08 21:19     ` Bird, Tim
  2025-01-08 23:26       ` [boot-time] Rob Landley
  2025-01-09 12:43       ` [boot-time] Marko Hoyer
  0 siblings, 2 replies; 21+ messages in thread
From: Bird, Tim @ 2025-01-08 21:19 UTC (permalink / raw)
  To: Marko Hoyer, Shankari, linux-embedded@vger.kernel.org



> -----Original Message-----
> From: Marko Hoyer <mhoyer.oss-devel@freenet.de>
> On 08.01.25 at 19:33, Bird, Tim wrote:
> >> -----Original Message-----
> >> From: Shankari <beingcap11@gmail.com>
> >> Hi
> >>
> >> I wanted to provide an update on my recent contributions to the boot-time reduction project. I have recently started contributing
> and
> >> am working with the beagleplay. I have been analyzing the boot time of the init process. Below is the output from the system log:
> >>
> >> debian@BeaglePlay:~$ dmesg | grep "init process"
> >> [    1.480490] Run /init as init process
> >>
> >> Moving forward, I plan to explore ways to modify the command line and further investigate the data used for SIG analysis. This will
> >> help me gain a deeper understanding of the boot process and its performance characteristics.
> >>
> >> Please let me know if you have any suggestions or areas where I could focus my efforts.
> > Hi Shankari,
> >
> > It sounds like you are off to a good start.  I have something that needs to be done, that I think
> > you can help with, and that matches where I believe you are in your status with being able
> > to evaluate the kernel.
> >
> > In general, there's a lot of information on the elinux wiki which is stale, which needs to be
> > updated or archived, or maybe even just removed.
> >
> > This section of the Boot Time page has a lot of material in this category:
> > https://elinux.org/Boot_Time#kernel_speedups
> >
> > Can you validate the information on these 2 pages:
> >   * https://elinux.org/Disable_Console
> >   * https://elinux.org/Preset_LPJ
> >
> > This would consist of reading through the material, and testing the
> > described techniques on your machine.  This will involve booting the
> > machine 2 ways, with a particular kernel command line option and without
> > it, and then reporting back the final boot time for both.  You can use
> > the timestamp for the "init process" string as your final boot time, for the
> > purposes of this exercise.
> >
> > Helping me to update the elinux wiki material on boot time would be
> > an immense help, and is one of my main goals for the boot time SIG in 2025.
> >
> > Don't hesitate to ask questions if you have any.
> >
> > BTW - you can just report your findings to me and linux-embedded list, but
> > alternatively (and even better) would be if you could also update the wiki
> > pages themselves with your information based on recent kernels and hardware.
> > To do this, you will need an elinux wiki account, which you can make online on
> > elinux wiki.org by going to this page: https://elinux.org/Special:CreateAccount
> >
> > Anyone else reading this who wants to also participate in this project to
> > update the elinux wiki boot time information, please contact me.
> > Thanks.
> >    -- Tim
> >
> Hi Tim, all,
> 
> first time I'm posting here so hopefully everything is fine w/ my mail
> format / attachment and so on ... If not, please give me some feedback
> and guidance.

Marko,

Thanks for this great data!

In general, I don't see a lot of attachments on kernel mailing lists.
They don't bother me, and we aren't CC'ing LKML (that's a separate
issue we should discuss - developers outside of embedded might
want to see this data).  I'll check later and see what lore does with this,
but if no one complains, I don't see a problem with it.  If someone
does complain, I can provide file hosting either on the elinux wiki
or the boot-time wiki, and we can link attachments like you've
provided on this message from one of those places (to avoid
putting attachments on kernel mailing lists).
 
> 
> To the "disable console" topic: I have some numbers in place for an RPI
> Zero W, find dmesg dumps and systemd-analyze plots attached.
> 
> 
> Environment:
> 
> - RPi Zero W, kernel 5.15.24, systemd 247.3, customized debian
> 
> - onboard UART used
> 
> 
> Cases:
> 
> - #1 quiet: cmdline w/ quiet, no kernel or userspace output up to the
> serial login console
> 
> - #2 normal: cmdline w/o quiet, serial console @115200 baud
> 
> - #3 normal_baud9600: cmdline w/o quiet, serial console @9600 baud
> 
> 
> Main outcomes:
> 
> - kernel timestamps "Run /sbin/init as init process"
> 
> #1: "1.714458", #2: "3.011701", #3: "16.108101"

Wow from 1.7 seconds to 16.1 seconds.  That's a pretty huge
difference.  I guess this particular technique is still
very relevant!

> 
> Interpretation:
> 
> * enabled serial console has significant impact in kernel boot time
> 
> * reducing baud to 9600 induced some side effect, not sure what it is ...

Did you see any other weird behavior besides the huge slowdown?
I'll take a look at the amount of characters in your dmesg output and
see if it can be linearly correlated to the baud rate, or if it seems something
else is going on.
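
One rough way to run that check using only the three timestamps quoted above
is sketched here. It assumes 8n1 framing (10 bits on the wire per character)
and that all of the extra boot time is spent waiting on the UART; both are
assumptions, not something the data confirms.

```python
# Timestamps of "Run /sbin/init as init process" reported in this thread.
T_QUIET = 1.714458    # case #1: quiet
T_115200 = 3.011701   # case #2: serial console at 115200 baud
T_9600 = 16.108101    # case #3: serial console at 9600 baud

# Assumption: 8n1 framing = 1 start + 8 data + 1 stop bit per character.
BITS_PER_CHAR = 10


def implied_chars(extra_seconds: float, baud: int) -> float:
    """Characters that would occupy extra_seconds of wire time at this baud."""
    return extra_seconds * baud / BITS_PER_CHAR


extra_fast = T_115200 - T_QUIET   # time added by the 115200 baud console
extra_slow = T_9600 - T_QUIET     # time added by the 9600 baud console

slowdown_ratio = extra_slow / extra_fast
baud_ratio = 115200 / 9600
chars_fast = implied_chars(extra_fast, 115200)
chars_slow = implied_chars(extra_slow, 9600)

print(f"slowdown ratio {slowdown_ratio:.2f} vs baud ratio {baud_ratio:.0f}")
print(f"implied characters: {chars_fast:.0f} (115200), {chars_slow:.0f} (9600)")
```

With these numbers the slowdown ratio comes out near 11 against a baud ratio
of 12, and both cases imply roughly 14k to 15k characters, so the slowdown
looks close to linear in the baud rate. Counting the characters in the actual
dmesg dumps would be the real test.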

> -  systemd startup
> 
> * systemd drops 2 log lines per started unit to the console
> 
> * seems serial output is not implemented asynchronously (see steps of
> units in sd plot, ~10ms per unit w/ baud 115200, ~80ms per unit w/ baud9600

I'm not sure what you're referring to here.  Is the 'unit' you are talking about
the graphing grid size, or are you referring to systemd units?  The grid size
seems to be 10ms per minor grid line in each plot.

> 
> Side notes:
> 
> * I remember similar behavior w/ imx.6 SoCs
> 
> * Maybe this issues is not seen on other SoCs (maybe w/ another hw
> implementation of the UART)
> 
> * Maybe this issues is only seen in single core machines (I can double
> check w/ a PI3 or orange pi zero once)
> 
> Hope this helps.

It helps a lot.  Thanks for this data!

I think the 'Disable Console' technique will continue to stay as one of the first
things we recommend for developers working on their board's boot time.

To other developers - I'd like to see more data like this on other systems
as well.  So please keep submitting your data to this list.

Thanks,
 -- Tim



* Re: [boot-time]
  2025-01-08 21:19     ` [boot-time] Bird, Tim
@ 2025-01-08 23:26       ` Rob Landley
  2025-01-09 13:02         ` [boot-time] Marko Hoyer
  2025-01-09 12:43       ` [boot-time] Marko Hoyer
  1 sibling, 1 reply; 21+ messages in thread
From: Rob Landley @ 2025-01-08 23:26 UTC (permalink / raw)
  To: Bird, Tim, Marko Hoyer, Shankari, linux-embedded@vger.kernel.org

On 1/8/25 15:19, Bird, Tim wrote:
>> Cases:
>>
>> - #1 quiet: cmdline w/ quiet, no kernel or userspace output up to the
>> serial login console
>>
>> - #2 normal: cmdline w/o quiet, serial console @115200 baud
>>
>> - #3 normal_baud9600: cmdline w/o quiet, serial console @9600 baud
>>
>>
>> Main outcomes:
>>
>> - kernel timestamps "Run /sbin/init as init process"
>>
>> #1: "1.714458", #2: "3.011701", #3: "16.108101"
> 
> Wow from 1.7 seconds to 16.1 seconds.  That's a pretty huge
> difference.  I guess this particular technique is still
> very relevant!

CONFIG_EARLY_PRINTK output is emitted before interrupts are enabled 
(last I checked they didn't kick in until RIGHT before PID 1 gets forked 
off), so the early output drivers spin waiting for the next character to 
go into the buffer (the memory mapped register ones look something like 
"while (MASK&*status); *output = *data++;" in a for loop) and the 
printk() call can't return until all the data has been queued to the 
serial hardware, so you spend a lot of time blocked in printk().

With 9600 baud 8n1 output, each character takes 10 bits on the wire 
(start + 8 data + stop), so 9600/10 = 960 characters per second, or 
approximately a 1ms wait between each character, blocking in printk when 
the hardware FIFO buffer fills up, so 16k of output data takes about 17 
seconds to write if the rest of the boot is doing NOTHING. Even a 1k 
hardware FIFO is only 1 second of output, and that's assuming all 1k is 
outgoing rather than split between in/out.
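
The arithmetic above can be sketched as a small function. This is a simplified
model under stated assumptions: 10 bits on the wire per 8n1 character, a
transmit FIFO that starts empty, and a writer that does nothing but feed the
UART. The name `blocked_seconds` is made up for illustration.

```python
def blocked_seconds(nbytes: int, baud: int, fifo_bytes: int = 0,
                    bits_per_char: int = 10) -> float:
    """Estimate how long a polled writer stays blocked pushing nbytes out.

    The first fifo_bytes are absorbed by the hardware FIFO without waiting;
    every further character has to wait for one character time on the wire.
    """
    chars_per_second = baud / bits_per_char
    return max(0, nbytes - fifo_bytes) / chars_per_second


log_size = 16 * 1024  # the "16k of output" figure from the paragraph above

for baud in (9600, 115200):
    no_fifo = blocked_seconds(log_size, baud)
    with_fifo = blocked_seconds(log_size, baud, fifo_bytes=1024)
    print(f"{baud:>6} baud: {no_fifo:6.2f} s blocked, "
          f"{with_fifo:6.2f} s with a 1k FIFO")
```

In this model the FIFO only removes a fixed amount from the total, which is
why a bigger FIFO helps far less than a faster line or less output.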

Your options are:

1) disable early printk so it all goes into a malloced buffer until 
interrupts are enabled and it can be asynchronously flushed (meaning if 
something DOES go wrong in early boot you can't see it)
2) set your UART baud rate as fast as possible
3) have your default boot use the "quiet" option (similar to disabling 
EARLY_PRINTK but at least you have the option to yank quiet from your 
bootloader args without rebuilding the kernel.)

Faster UART speeds mean shorter serial cables (although there's also 3 
volt vs 5 volt, wire thickness/capacitance, and some other stuff, Jeff 
Dionne walked me through the math last year but I don't have my notes in 
front of me). Modern hardware can do up to 4 megabits/second but outside 
"this serial chip immediately talks to a USB chip and then it's 
transported as USB with the funky noise-cancelling signaling over VERY 
twisted pair to actually leave the board", I wouldn't trust that over 
any real length of cable.

Alas 
https://tldp.org/HOWTO/Remote-Serial-Console-HOWTO/serial-distance.html 
is from the dawn of time and only goes up to 56k over wires made from 
recycled drainpipes. 
https://novatel.com/support/known-solutions/maximum-cable-length-vs-data-rate 
says 115200 is 2.5 meters. It LOOKS like it scales linearly with twice 
the speed being half the cable, so a megabit would be about 1 foot of 
serial cable before the bits get all mushy.

Rob


* Re: [boot-time]
  2025-01-08 23:26       ` [boot-time] Rob Landley
@ 2025-01-09 13:02         ` Marko Hoyer
  2025-01-09 21:10           ` [boot-time] Rob Landley
  0 siblings, 1 reply; 21+ messages in thread
From: Marko Hoyer @ 2025-01-09 13:02 UTC (permalink / raw)
  To: Rob Landley, Bird, Tim, Marko Hoyer, Shankari,
	linux-embedded@vger.kernel.org

On 09.01.25 at 00:26, Rob Landley wrote:
> On 1/8/25 15:19, Bird, Tim wrote:
>>> Cases:
>>>
>>> - #1 quiet: cmdline w/ quiet, no kernel or userspace output up to the
>>> serial login console
>>>
>>> - #2 normal: cmdline w/o quiet, serial console @115200 baud
>>>
>>> - #3 normal_baud9600: cmdline w/o quiet, serial console @9600 baud
>>>
>>>
>>> Main outcomes:
>>>
>>> - kernel timestamps "Run /sbin/init as init process"
>>>
>>> #1: "1.714458", #2: "3.011701", #3: "16.108101"
>>
>> Wow from 1.7 seconds to 16.1 seconds.  That's a pretty huge
>> difference.  I guess this particular technique is still
>> very relevant!
>
> CONFIG_EARLY_PRINTK output is emitted before interrupts are enabled 
> (last I checked they didn't kick in until RIGHT before PID 1 gets 
> forked off), so the early output drivers spin waiting for the next 
> character to go into the buffer (the memory mapped register ones look 
> something like "while (MASK&*status); *output = *data++;" in a for 
> loop) and the printk() call can't return until all the data has been 
> queued to the serial hardware, so you spend a lot of time blocked in 
> printk().

Hi Rob,

thx for the explanation, helps further!

* This implementation would explain the observed behavior.

* What I don't understand yet: logs from systemd delay systemd the 
same way as seen in the kernel. It looks like the issue persists even 
after PID 1 is started.  As said, it can be something specific to 
single-core SoCs or even just to the RPi Zero W. I'll check ...

>
> With 9600 baud 8n1 output, 9600/9 = 1066 characters per second, or 
> approximately a 1ms wait between each character, blocking in printk 
> when the hardware FIFO buffer fills up, so 16k of output data takes 16 
> seconds to write if the rest of the boot is doing NOTHING. Even a 1k 
> hardware FIFO is only 1 second of output, and that's assuming all 1k 
> is outgoing rather than split between in/out.
>
> Your options are:
>
> 1) disable early printk so it all goes into a malloced buffer until 
> interrupts are enabled and it can be asynchronously flushed (meaning 
> if something DOES go wrong in early boot you can't see it)
> 2) set your FIFO speed as fast as possible
> 3) have your default boot use the "quiet" option (similar to disabling 
> EARLY_PRINTK but at least you have the option to yank quiet from your 
> bootloader args without rebuilding the kernel.)
>
> Faster UART speeds mean shorter serial cables (although there's also 3 
> volt vs 5 volt, wire thickness/capacitance, and some other stuff, Jeff 
> Dionne walked me through the math last year but I don't have my notes 
> in front of me). Modern hardware can do up to 4 megabits/second but 
> outside "this serial chip immediately talks to a USB chip and then 
> it's transported as USB with the funky noise-cancelling signaling over 
> VERY twisted pair to actually leave the board"), I wouldn't trust that 
> over any real length of cable.
>
> Alas 
> https://tldp.org/HOWTO/Remote-Serial-Console-HOWTO/serial-distance.html 
> is from the dawn of time and only goes up to 56k over wires made from 
> recycled drainpipes. 
> https://novatel.com/support/known-solutions/maximum-cable-length-vs-data-rate 
> says 115200 is 2.5 meters. It LOOKS like it scales linearly with twice 
> the speed being half the cable, so a megabit would be about 1 foot of 
> serial cable before the bits get all mushy.
>
As said in another mail: I do not know a valid (production) use case in 
which kernel logs need to be dumped to a serial console. I regard this 
mechanism only as useful for development purposes (in which fast boot is 
probably not so relevant). Please correct me if I'm wrong, would be 
happy to learn about such use cases.

Based on that I think option 3) is the best option for most cases.


> Rob


Marko



* Re: [boot-time]
  2025-01-09 13:02         ` [boot-time] Marko Hoyer
@ 2025-01-09 21:10           ` Rob Landley
  2025-01-09 21:35             ` [boot-time] Marko Hoyer
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Landley @ 2025-01-09 21:10 UTC (permalink / raw)
  To: Marko Hoyer, Bird, Tim, Marko Hoyer, Shankari,
	linux-embedded@vger.kernel.org

On 1/9/25 07:02, Marko Hoyer wrote:
> On 09.01.25 at 00:26, Rob Landley wrote:
>> CONFIG_EARLY_PRINTK output is emitted before interrupts are enabled 
>> (last I checked they didn't kick in until RIGHT before PID 1 gets 
>> forked off), so the early output drivers spin waiting for the next 
>> character to go into the buffer (the memory mapped register ones look 
>> something like "while (MASK&*status); *output = *data++;" in a for 
>> loop) and the printk() call can't return until all the data has been 
>> queued to the serial hardware, so you spend a lot of time blocked in 
>> printk().
> 
> Hi Rob,
> 
> thx for the explanation, helps further!
> 
> * This implementation would explain the observed behavior.
> 
> * What I'm not understanding yet: logs from systemd delay systemd the 
> same way as seen in the kernel. Looks like the issue is not solved even 
> when PID 1 is started.  As said, It can be something specific to single 
> core SoCs or even just to RPI Zero W. I'll check ...

Buffering or not in the char device is a driver choice. If your serial 
hardware has a small FIFO buffer and the driver doesn't do its own layer 
of output buffering (with a tasklet or something to copy the data to the 
hardware), then the write() syscall will block waiting for the data to 
go out. (Writes to filesystems stopped doing this back around 2.0 or 
something, when they rewrote the vfs to be based on the page cache and 
dentry cache, meaning ALL filesystem writes go through that now unless 
you say O_DIRECT to _ask_ for it to block, which isn't even always 
honored. But for some reason the TTY layer drives people insane, and 
char devices have been given a wide berth...)

There's a similar issue with some xterms where "make -j16 build" 
spamming lots of output to a display terminal can run significantly 
slower than "make -j16 build | cat" because the linux pipe 
infrastructure inserts a pipe buffer (ulimit -p says 8, but that's in 
512-byte blocks, i.e. the 4k atomic write size; the pipe capacity itself 
has defaulted to 64k since 2.6.11) so the writes from each cc instance 
go into the pipe buffer and 
return immediately when it's not full, whereas writes to a terminal 
device block until the terminal has finished updating (which includes 
scrolling the screen).
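
The pipe behaviour described above can be observed directly. The sketch below
assumes a POSIX system and fills a pipe that nobody reads from: writes succeed
immediately until the buffer is full, and the total accepted is the pipe's
capacity. The function name `pipe_capacity` is made up, and the number printed
depends on the kernel and its configuration.

```python
import os


def pipe_capacity() -> int:
    """Count how many bytes a pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode turns "would block" into BlockingIOError
        # instead of hanging this process forever.
        os.set_blocking(write_fd, False)
        chunk = b"x" * 1024
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                return total  # buffer is full; nothing has been read
    finally:
        os.close(read_fd)
        os.close(write_fd)


print(f"pipe accepted {pipe_capacity()} bytes before blocking")
```

A writer whose output fits in that buffer never waits on the reader, which is
the effect the "| cat" trick relies on.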

If I recall (many years ago), the kde terminal implementation included a 
buffer of its own (immediately returned before updating), and the gnome 
one didn't (blocked until x11 display update completed), so foreground 
builds were faster under kde.

And the gnome guys' answer was to spray everything down with 3D 
acceleration so the GPU was scrolling the screen for you, because of 
course it was.

Anyway, serialized latency has _always_ killed throughput, because it's 
a cost you can't get BACK. The kernel guys used to know this:

https://yarchive.net/comp/linux/raid0.html

Hence the old high school math problem: if you have 2 hours to go 100 
miles and you travel the first 40 miles at 20 miles per hour, how fast 
do you have to go the rest to make it on time? Answer: you'd have to 
instantaneously teleport because you spent 2 hours going 40 miles and 
your time is up with 60 miles left to go. Optimizing the wrong part 
DOESN'T HELP.

> As said in another mail: I do not know a valid (production) use case in 
> which kernel logs need to be dumped to a serial console. I regard this 
> mechanism only as useful for development purposes (in which fast boot is 
> probably not so relevant). Please correct me if I'm wrong, would be 
> happy to learn about such use cases.
> 
> Based on that I think option 3) is the best options for most cases.

You can adjust the loglevel so they still go into dmesg but don't go out 
to the console, which theoretically shouldn't be THAT slow? (At least 
cpu limited rather than wait-for-hardware.)
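
A toy model of that split is sketched below: every message lands in the ring
buffer that dmesg reads, but only messages whose level is numerically lower
than the console loglevel are also written to the console. The two constants
are the usual kernel defaults (both are build-time configurable, so treat them
as assumptions), and the sample messages are invented for illustration.

```python
# Assumption: common defaults; a given kernel build may use other values.
CONSOLE_LOGLEVEL_DEFAULT = 7   # console sees everything below KERN_DEBUG
CONSOLE_LOGLEVEL_QUIET = 4     # what "quiet" typically selects

# Hypothetical boot messages as (level, text); lower level = more severe.
BOOT_MESSAGES = [
    (6, "Booting Linux"),
    (6, "Calibrating delay loop"),
    (4, "clk: disabling unused clocks"),
    (3, "mmc0: error -110"),
]


def route(messages, console_loglevel):
    """Return (ring_buffer, console) lists for the given console loglevel."""
    ring_buffer = [text for _, text in messages]
    console = [text for level, text in messages if level < console_loglevel]
    return ring_buffer, console


for name, level in (("default", CONSOLE_LOGLEVEL_DEFAULT),
                    ("quiet", CONSOLE_LOGLEVEL_QUIET)):
    ring, console = route(BOOT_MESSAGES, level)
    console_bytes = sum(len(text) + 1 for text in console)
    print(f"{name}: {len(ring)} in dmesg, {len(console)} on console "
          f"({console_bytes} bytes over serial)")
```

On a real system the threshold is chosen with the `quiet` or `loglevel=`
kernel command line parameters; only the console byte count feeds into the
serial wait discussed earlier in the thread.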

But keep in mind that a lot of kernel devs seem actively trying to 
sabotage embedded development, and the dmesg metadata is now multiple 
times the size of the actual payload.

https://lkml.iu.edu/hypermail/linux/kernel/2412.0/00045.html

So trimming the dmesg buffer size is probably ALSO a good idea on modern 
embedded systems, because WHAT THE FSCK? Also, I dunno how much cpu time 
all that metadata fapping takes. (Before systemd, this was just text 
going into a ring buffer. But systemd couldn't cope with that because 
https://www.theregister.com/2014/04/05/torvalds_sievers_dust_up/ and 
https://en.wikipedia.org/wiki/Systemd#History and of course 
https://www.linux.com/news/fun-photo-greg-kroah-hartman-crowned-systemd-hackfest/ 
thus the data structures in the kernel had to become far more complex 
after about 30 years of NOT being like that, because systemd.)

Rob


* Re: [boot-time]
  2025-01-09 21:10           ` [boot-time] Rob Landley
@ 2025-01-09 21:35             ` Marko Hoyer
  2025-01-09 22:31               ` [boot-time] Rob Landley
  0 siblings, 1 reply; 21+ messages in thread
From: Marko Hoyer @ 2025-01-09 21:35 UTC (permalink / raw)
  To: Rob Landley, Bird, Tim, Shankari, linux-embedded@vger.kernel.org

On 09.01.25 at 22:10, Rob Landley wrote:
> On 1/9/25 07:02, Marko Hoyer wrote:
>> On 09.01.25 at 00:26, Rob Landley wrote:
>>> CONFIG_EARLY_PRINTK output is emitted before interrupts are enabled 
>>> (last I checked they didn't kick in until RIGHT before PID 1 gets 
>>> forked off), so the early output drivers spin waiting for the next 
>>> character to go into the buffer (the memory mapped register ones 
>>> look something like "while (MASK&*status); *output = *data++;" in a 
>>> for loop) and the printk() call can't return until all the data has 
>>> been queued to the serial hardware, so you spend a lot of time 
>>> blocked in printk().
>>
>> Hi Rob,
>>
>> thx for the explanation, helps further!
>>
>> * This implementation would explain the observed behavior.
>>
>> * What I'm not understanding yet: logs from systemd delay systemd the 
>> same way as seen in the kernel. Looks like the issue is not solved 
>> even when PID 1 is started.  As said, It can be something specific to 
>> single core SoCs or even just to RPI Zero W. I'll check ...
>
> Buffering or not in the char device is a driver choice. If your serial 
> hardware has a small FIFO buffer and the driver doesn't do its own 
> layer of output buffering (with a tasklet or something to copy the 
> data to the hardware), then the write() syscall will block waiting for 
> the data to go out. (Writes to filesystems stopped doing this back 
> around 2.0 or something, when they rewrote the vfs to be based on the 
> page cache and deentry cache, meaning ALL filesystem writes go through 
> that now unless you say O_DIRECT to _ask_ for it to block, which isn't 
> even always honored. But for some reason the TTY layer drives people 
> insane, and char devices have been given a wide berth...)

Yeah, looks like this is the case for the RPi Zero W. I guess there is 
probably no buffer at all in the RPi serial driver / hw since every log 
line from systemd delays systemd for ~10ms (~80ms in the baud9600 case).

Btw: I can confirm the same for an RPi 3 w/ four cores. The difference is 
that something seems to go on in the kernel in parallel to log writing to 
serial, but at a certain point the kernel is waiting again for several 
seconds, probably for the serial device to finish transmission. systemd's 
delay is pretty much the same as in the single-core case.

>
> There's a similar issue with some xterms where "make -j16 build" 
> spamming lots of output to a display terminal can run significantly 
> slower than "make -j16 build | cat" because the linux pipe 
> infrastructure inserts a pipe buffer (ulimit -p says 8 but I think 
> that's _pages_ so 32k? Except in 2.6.11 it was 64k? Eh, not looking it 
> up...) so the writes from each cc instance go into the pipe buffer and 
> return immediately when it's not full, whereas writes to a terminal 
> device block until the terminal has finished updating (which includes 
> scrolling the screen).
>
> If I recall (many years ago), the kde terminal implementation included 
> a buffer of its own (immediately returned before updating), and the 
> gnome one didn't (blocked until x11 display update completed), so 
> foreground builds were faster under kde.
>
> And the gnome guys' answer was to spray everything down with 3D 
> acceleration so the GPU was scrolling the screen for you, because of 
> course it was.
>
> Anyway, serialized latency has _always_ killed throughput, because 
> it's a cost you can't get BACK. The kernel guys used to know this:
>
> https://yarchive.net/comp/linux/raid0.html
>
> Hence the old high school math problem: if you have 2 hours to go 100 
> miles and you travel the first 40 miles at 20 miles per hour, how fast 
> do you have to go the rest to make it on time? Answer: you'd have to 
> instantaneously teleport because you spent 2 hours going 40 miles and 
> your time is up with 60 miles left to go. Optimizing the wrong part 
> DOESN'T HELP.
>
Absolutely correct.
>> As said in another mail: I do not know a valid (production) use case 
>> in which kernel logs need to be dumped to a serial console. I regard 
>> this mechanism only as useful for development purposes (in which fast 
>> boot is probably not so relevant). Please correct me if I'm wrong, 
>> would be happy to learn about such use cases.
>>
>> Based on that I think option 3) is the best options for most cases.
>
> You can adjust the loglevel so they still go into dmesg but don't go 
> out to the console, which theoretically shouldn't be THAT slow? (At 
> least cpu limited rather than wait-for-hardware.)

With quiet, logs go into dmesg as well.

But as said, I do not really see use cases for dumping these logs to a 
serial console on each production boot of a boot-time-critical system. 
Reading dmesg or systemd's journal after the time-critical things are done 
should be ok in most cases.

So my recommendation to people who are looking for fast-boot potential:

* switch off kernel-to-console logging during startup using the quiet 
cmdline option if you don't strictly need these logs over serial

* if you do need them, your (Rob's) options could be applied to reduce 
the effect as much as possible

>
>
> But keep in mind that a lot of kernel devs seem actively trying to 
> sabotage embedded development, and the dmesg metadata is now multiple 
> times the size of the actual payload.
>
> https://lkml.iu.edu/hypermail/linux/kernel/2412.0/00045.html
>
> So trimming the dmesg buffer size is probably ALSO a good idea on 
> modern embedded systems, because WHAT THE FSCK? Also, I dunno how much 
> cpu time all that metadata fapping takes. (Before systemd, this was 
> just text going into a ring buffer. But systemd couldn't cope with 
> that because 
> https://www.theregister.com/2014/04/05/torvalds_sievers_dust_up/ and 
> https://en.wikipedia.org/wiki/Systemd#History and of course 
> https://www.linux.com/news/fun-photo-greg-kroah-hartman-crowned-systemd-hackfest/ 
> thus the data structures in the kernel had to become far more complex 
> after about 30 years of NOT being like that, because systemd.)
>
> Rob
Marko

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-09 21:35             ` [boot-time] Marko Hoyer
@ 2025-01-09 22:31               ` Rob Landley
  0 siblings, 0 replies; 21+ messages in thread
From: Rob Landley @ 2025-01-09 22:31 UTC (permalink / raw)
  To: Marko Hoyer, Bird, Tim, Shankari, linux-embedded@vger.kernel.org

On 1/9/25 15:35, Marko Hoyer wrote:
> Am 09.01.25 um 22:10 schrieb Rob Landley:
>> Buffering or not in the char device is a driver choice. If your serial 
>> hardware has a small FIFO buffer and the driver doesn't do its own 
>> layer of output buffering (with a tasklet or something to copy the 
>> data to the hardware), then the write() syscall will block waiting for 
>> the data to go out. (Writes to filesystems stopped doing this back 
>> around 2.0 or something, when they rewrote the vfs to be based on the 
>> page cache and deentry cache, meaning ALL filesystem writes go through 
>> that now unless you say O_DIRECT to _ask_ for it to block, which isn't 
>> even always honored. But for some reason the TTY layer drives people 
>> insane, and char devices have been given a wide berth...)
> 
> Yeah looks like this is the case for RPi Zero W. I guess there is 
> probably no buffer at all in the RPi serial driver / hw since every log 
> line from systemd delays systemd for ~10ms (~80ms in baud9600 case).

Well there's gotta be a LITTLE fifo for input or you drop characters all 
over the place.

(That's the reason Linus started writing Linux in the first place, 
because minix's microkernel design couldn't keep up with serial input, 
the overhead of the task switch to the userspace serial receive driver 
process took too long and characters got dropped. So he wrote a terminal 
program that booted from a floppy, and then taught it to read from and 
write to the minix filesystem on his hard drive so he could download 
stuff from usenet, then taught it to run "bash" so he didn't have to 
reboot to mkdir/mv/rm, and that turned out to be 90% of the way to 
getting it to run gcc...)

And serial hardware tends to be symmetrical about that: if it's got 16 
chars of input buffer, it'll usually have 16 chars of output buffer. But 
that's less than 1/50th of a second at 9600 baud...
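
A quick back-of-the-envelope check of that figure, as a sketch only. It 
assumes 8N1 framing (10 bits on the wire per character), which the 
thread does not state, and the function name is made up for illustration:

```python
def uart_seconds(num_chars, baud, bits_per_char=10):
    """Time to shift num_chars out of a UART at the given baud rate.

    bits_per_char=10 assumes 8N1 framing: 1 start + 8 data + 1 stop bit.
    """
    return num_chars * bits_per_char / baud


# A 16-character FIFO at 9600 baud drains in about 16.7 ms (1/60 s),
# which is indeed less than 1/50th of a second.
fifo_drain = uart_seconds(16, 9600)
print(f"16 chars @ 9600 baud: {fifo_drain * 1000:.1f} ms")
```

With a different framing (parity, two stop bits) the number shifts a 
little, but the order of magnitude stays the same.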

(Fun detail: the input fifo often has a programmable watermark so you 
can say "fill up this much before generating an interrupt, or if X timer 
ticks pass by with no more input" so you don't get an interrupt every 
character (and spend all your time entering and exiting the interrupt 
code) BUT still have some leeway between the interrupt being generated 
and the buffer filling up until it drops characters. The OUTPUT fifo can 
do something similar, only from the other end (fill it all the way up, 
then generate an interrupt when it drains to the watermark so you can 
refill it before it empties and produces a gap in the output).)

Programming serial devices can get slightly complicated...

> Btw: I can confirm the same for RPi3 w/ four cores. Difference is that 
> something seems to go on in kernel in parallel to logs writing to serial 
> but at a certain point the kernel is waiting again for lot of seconds 
> probably for the serial device to finish transmission. Systemds delay is 
> pretty much similar to the single core case.

Yeah, the point of a bottleneck is that's the part you're waiting for, 
so speeding up the rest of it doesn't help so much.

Optimization is a whole thing. Spinlocks vs semaphores infuriate some 
people (you're intentionally spinning wasting time?) so sometimes you 
need to explain with analogies to get them to stop "helping".

You're standing at a train crossing, and a train is going past, it'll be 
through in 10 minutes. If you walk towards the end of the train you'll 
reach the end faster and can cross in only 7 minutes, but if you need to 
come BACK HERE to where your road is you'll wind up walking 7 minutes, 
crossing, walking 7 minutes back, and resume from here 14 minutes from 
now instead of only 10, so being busy doing the wrong thing and then 
just _undoing_ it again instead of waiting here ready to go is actually 
_slower_ than the waiting.

>>> As said in another mail: I do not know a valid (production) use case 
>>> in which kernel logs need to be dumped to a serial console. I regard 
>>> this mechanism only as useful for development purposes (in which fast 
>>> boot is probably not so relevant). Please correct me if I'm wrong, 
>>> would be happy to learn about such use cases.
>>>
>>> Based on that I think option 3) is the best options for most cases.
>>
>> You can adjust the loglevel so they still go into dmesg but don't go 
>> out to the console, which theoretically shouldn't be THAT slow? (At 
>> least cpu limited rather than wait-for-hardware.)
> 
> With quiet logs go into dmesg as well.

Which _used_ to be almost free back when it was just a ring buffer doing 
a strlen() and two memcpy() at the wrap. But these days: dunno, haven't 
benched it.
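
For readers who have not seen that pattern, here is a minimal sketch of 
a byte ring buffer where a write is one length check plus at most two 
copies. It is purely illustrative and is not the kernel's printk code; 
the class and attribute names are hypothetical:

```python
class RingLog:
    """Fixed-size byte ring buffer: a write is a length check plus two copies."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0  # offset where the next byte will be written

    def write(self, msg: bytes):
        msg = msg[-self.size:]  # keep only the tail if msg exceeds the buffer
        n = len(msg)            # the "strlen()" step
        first = min(n, self.size - self.head)
        # First copy: from head up to the end of the buffer.
        self.buf[self.head:self.head + first] = msg[:first]
        # Second copy: whatever wrapped around goes to the start.
        self.buf[0:n - first] = msg[first:]
        self.head = (self.head + n) % self.size


log = RingLog(8)
log.write(b"abcdef")
log.write(b"XYZW")  # wraps: "XY" lands at the end, "ZW" at the start
```

There is no per-record metadata here at all, which is the point being 
made about the old design.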

> But as said, i do not really see use cases to dump out these logs to a 
> serial console in a boot time critical system on each production boot. 
> Reading dmesg or systemd's journal after time critical things are done 
> should be ok in most case.

The switch from printk(blah) to pr_loglevel(blah) was IN THEORY so you 
could kconfig a minimum loglevel to retain, and all the macros below 
that level would drop out of the kernel at compile time, reducing the 
kernel image size significantly AND doing nice things with cache 
locality and so on. (String processing is expensive, you traverse a lot 
of data that goes through the memory bus and evicts cache lines from L1 
and L2.)

Last I checked the kernel devs had broken it for some reason, but it 
might be working again? (Or was a patch still out of tree...?) Anyway, 
if you run out of ideas that's a thing to look for.

Data going across the memory bus is another one of those bottleneck 
things, where it doesn't matter how fast your processor is clocked if 
you're waiting for memory. An order of magnitude down from where we're 
currently looking, but still a thing that comes up a lot once the real 
low hanging fruit is dealt with...

Of course there's all sorts of "Loop unrolling! No, smaller L1 cache 
footprint! Prefetch! No, spectre/meltdown!" pendulum nonsense I usually 
treat roughly the same way as the man trying to cross the street in The 
Pink Panther:

https://www.youtube.com/watch?v=nistdsACs3E

I once watched using a lookup table instead of calculating the value be 
an optimization, then a pessimization, then an optimization, then a 
pessimization, without even recompiling the binary (just upgrading the 
hardware). Doing the simple thing is always at least excusable. (And 
less to reverse engineer to understand WHY, and a good general argument 
against the endless "this is not helping". Basically Chesterton's fence 
in software: understanding why it's there lets you throw it out.)

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-08 21:19     ` [boot-time] Bird, Tim
  2025-01-08 23:26       ` [boot-time] Rob Landley
@ 2025-01-09 12:43       ` Marko Hoyer
  2025-01-09 13:27         ` [boot-time] Geert Uytterhoeven
  1 sibling, 1 reply; 21+ messages in thread
From: Marko Hoyer @ 2025-01-09 12:43 UTC (permalink / raw)
  To: Bird, Tim, Marko Hoyer, Shankari, linux-embedded@vger.kernel.org

Am 08.01.25 um 22:19 schrieb Bird, Tim:
>
>> -----Original Message-----
>> From: Marko Hoyer <mhoyer.oss-devel@freenet.de>
>> Am 08.01.25 um 19:33 schrieb Bird, Tim:
>>>> -----Original Message-----
>>>> From: Shankari <beingcap11@gmail.com>
>>>> Hi
>>>>
>>>> I wanted to provide an update on my recent contributions to the boot-time reduction project. I have recently started contributing
>> and
>>>> am working with the beagleplay. I have been analyzing the boot time of the init process. Below is the output from the system log:
>>>>
>>>> debian@BeaglePlay:~$ dmesg | grep "init process"
>>>> [    1.480490] Run /init as init process
>>>>
>>>> Moving forward, I plan to explore ways to modify the command line and further investigate the data used for SIG analysis. This will
>>>> help me gain a deeper understanding of the boot process and its performance characteristics.
>>>>
>>>> Please let me know if you have any suggestions or areas where I could focus my efforts.
>>> Hi Shankari,
>>>
>>> It sounds like you are off to a good start.  I have something that needs to be done, that I think
>>> you can help with, and that matches where I believe you are in your status with being able
>>> to evaluate the kernel.
>>>
>>> In general, there's a lot of information on the elinux wiki which is stale, which needs to be
>>> updated or archived, or maybe even just removed.
>>>
>>> This section of the Boot Time page has a lot of material in this category:
>>> https://elinux.org/Boot_Time#kernel_speedups
>>>
>>> Can you validate the information on these 2 pages:
>>>    * https://elinux.org/Disable_Console
>>>    * https://elinux.org/Preset_LPJ
>>>
>>> This would consist of reading through the material, and testing the
>>> described techniques on your machine.  This will involve booting the
>>> machine 2 ways, with a particular kernel command line option and without
>>> it, and then reporting back the final boot time for both.  You can use
>>> the timestamp for the "init process" string as your final boot time, for the
>>> purposes of this exercise.
>>>
>>> Helping me to update the elinux wiki material on boot time would be
>>> an immense help, and is one of my main goals for the boot time SIG in 2025.
>>>
>>> Don't hesitate to ask questions if you have any.
>>>
>>> BTW - you can just report your findings to me and linux-embedded list, but
>>> alternatively (and even better) would be if you could also update the wiki
>>> pages themselves with your information based on recent kernels and hardware.
>>> To do this, you will need an elinux wiki account, which you can make online on
>>> elinux wiki.org by going to this page: https://elinux.org/Special:CreateAccount
>>>
>>> Anyone else reading this who wants to also participate in this project to
>>> update the elinux wiki boot time information, please contact me.
>>> Thanks.
>>>     -- Tim
>>>
>> Hi Tim, all,
>>
>> first time I'm posting here so hopefully everything is fine w/ my mail
>> format / attachment and so on ... If not, please give me some feedback
>> and guidance.
> Marko,
>
> Thanks for this great data!
>
> In general, I don't see a lot of attachments on kernel mailing lists.
> They don't bother me, and we aren't CC'ing LKML (that's a separate
> issue we should discuss - developers outside of embedded might
> want to see this data).  I'll check later and see what lore does with this,
> but if no one complains, I don't see a problem with it.  If someone
> does complain, I can provide file hosting either on the elinux wiki
> or the boot-time wiki, and we can link attachments like you've
> provided on this message from one of those places (to avoid
> putting attachments on kernel mailing lists).
Ok, sounds good. I don't really have a place to publish data, so it 
would be good if you can find a way ...
>   
>> To the "disable console" topic: I have some numbers in place for an RPI
>> Zero W, find dmesg dumps and systemd-analyze plots attached.
>>
>>
>> Environment:
>>
>> - RPi Zero W, kernel 5.15.24, systemd 247.3, customized debian
>>
>> - onboard UART used
>>
>>
>> Cases:
>>
>> - #1 quiet: cmdline w/ quiet, no kernel or userspace output up to the
>> serial login console
>>
>> - #2 normal: cmdline w/o quiet, serial console @115200 baud
>>
>> - #3 normal_baud9600: cmdline w/o quiet, serial console @9600 baud
>>
>>
>> Main outcomes:
>>
>> - kernel timestamps "Run /sbin/init as init process"
>>
>> #1: "1.714458", #2: "3.011701", #3: "16.108101"
> Wow from 1.7 seconds to 16.1 seconds.  That's a pretty huge
> difference.  I guess this particular technique is still
> very relevant!
>
>> Interpretation:
>>
>> * enabled serial console has significant impact in kernel boot time
>>
>> * reducing baud to 9600 induced some side effect, not sure what it is ...
> Did you see any other weird behavior besides the huge slowdown?
> I'll take a look at the amount of characters in your dmesg output and
> see if it can be linearly correlated to the baud rate, or if it seems something
> else is going on.

Take a look at the kmesg logs. It looks like there is an 8s delay at a 
certain point:

[    5.897018] input: C-Media Electronics Inc. USB Audio Device as 
/devices/platform/soc/20980000.usb/usb1/1-1/1-1:1.3/0003:0D8C:0014.0001/input/input1
[    6.016086] hid-generic 0003:0D8C:0014.0001: input,hidraw0: USB HID 
v1.00 Device [C-Media Electronics Inc. USB Audio Device] on 
usb-20980000.usb-1/input3
[   14.012174] printk: console [ttyS0] enabled
[   14.064965] bcm2835-wdt bcm2835-wdt: Broadcom BCM2835 watchdog timer
[   14.142795] bcm2835-power bcm2835-power: Broadcom BCM2835 power 
domains driver
[   14.232013] mmc-bcm2835 20300000.mmcnr: mmc_debug:0 mmc_debug2:0

Not sure if it really makes sense to dig further into this issue. It 
might be something in the serial driver of the RPi. I don't really see a 
valid use case: 9600 baud is nothing you really need, and in any case 
pushing logs out via serial is not really needed in production use 
cases, just for development (correct me if I'm wrong).

>
>> -  systemd startup
>>
>> * systemd drops 2 log lines per started unit to the console
>>
>> * seems serial output is not implemented asynchronously (see steps of
>> units in sd plot, ~10ms per unit w/ baud 115200, ~80ms per unit w/ baud9600
> I'm not sure what you're referring to here.  Is the 'unit' you are talking about
> the graphing grid size, or are you referring to systemd units?  The grid size
> seems to be 10ms per minor grid line in each plot.

unit -> systemd unit

* grid size per minor grid line is 100ms to me (1s per main grid line -> 
see top of plot)

* take two units "user.slice" and "remote-fs.target"

* they should have basically no delay between each other -> you can see 
it in quiet case

* w/ serial on, you see delays (10ms, 80ms) which in fact are caused 
just by dropping the log lines to serial kernel log

* overall delay in user space startup sums up significantly

>
>> Side notes:
>>
>> * I remember similar behavior w/ imx.6 SoCs
>>
>> * Maybe this issues is not seen on other SoCs (maybe w/ another hw
>> implementation of the UART)
>>
>> * Maybe this issues is only seen in single core machines (I can double
>> check w/ a PI3 or orange pi zero once)
>>
>> Hope this helps.
> It helps a lot.  Thanks for this data!
>
> I think the 'Disable Console' technique will continue to stay as one of the first
> things we recommend for developers working on their board's boot time.
Yes, I agree. Let me check other boards, especially multi-core 
systems, to see if the issue happens there as well ...
>
> To others developers - I'd like to see more data like this on other systems
> as well.  So please keep submitting your data to this list.
>
> Thanks,
>   -- Tim
>
Regards,

Marko


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-09 12:43       ` [boot-time] Marko Hoyer
@ 2025-01-09 13:27         ` Geert Uytterhoeven
  0 siblings, 0 replies; 21+ messages in thread
From: Geert Uytterhoeven @ 2025-01-09 13:27 UTC (permalink / raw)
  To: Marko Hoyer
  Cc: Bird, Tim, Marko Hoyer, Shankari, linux-embedded@vger.kernel.org

Hi Marko,

On Thu, Jan 9, 2025 at 1:49 PM Marko Hoyer <Marko.Hoyer@freenet.de> wrote:
> Take a look into the kmesg logs. Looks like there is a 8s delay at a
> certain point:
>
> [    5.897018] input: C-Media Electronics Inc. USB Audio Device as
> /devices/platform/soc/20980000.usb/usb1/1-1/1-1:1.3/0003:0D8C:0014.0001/input/input1
> [    6.016086] hid-generic 0003:0D8C:0014.0001: input,hidraw0: USB HID
> v1.00 Device [C-Media Electronics Inc. USB Audio Device] on
> usb-20980000.usb-1/input3[   14.012174] printk: console [ttyS0] enabled
> [   14.064965] bcm2835-wdt bcm2835-wdt: Broadcom BCM2835 watchdog timer
> [   14.142795] bcm2835-power bcm2835-power: Broadcom BCM2835 power
> domains driver
> [   14.232013] mmc-bcm2835 20300000.mmcnr: mmc_debug:0 mmc_debug2:0

Those eight seconds are the time needed for printing all
previously-collected and time-stamped kernel log lines to the serial
console.
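
As a rough plausibility check of that explanation, the gap between the 
two timestamps in the quoted log can be turned into a character count. 
This is only a sketch: it assumes 8N1 framing (10 bits per character) 
and that the whole gap was spent transmitting, neither of which the log 
itself proves, and the function name is invented for illustration:

```python
def chars_sent(seconds, baud, bits_per_char=10):
    """Characters a UART can push out in the given time (8N1 assumed)."""
    return int(seconds * baud / bits_per_char)


# Timestamps taken from the quoted log: last line before the stall and
# the "console [ttyS0] enabled" line.
gap = 14.012174 - 6.016086
backlog = chars_sent(gap, 9600)
print(f"{gap:.3f} s at 9600 baud is roughly {backlog} characters")
```

That comes out to a bit under 7700 characters, which is in the range 
one would expect for several seconds' worth of buffered boot messages.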

BTW, only slightly related, but I have no better place to vent ;-)
On OrangeCrab running a 64 MHz VexRiscv softcore, I noticed another
big delay. With initcall_debug:

    initcall pty_init+0x0/0x3c8 returned 0 after 185427581 usecs

Apparently this is due to:

    CONFIG_LEGACY_PTYS=y
    CONFIG_LEGACY_PTY_COUNT=256

So yes, almost one second to set up one legacy pty, ugh...
Disabling CONFIG_LEGACY_PTYS fixed the issue.
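
The per-pty figure can be derived from the initcall_debug line above. 
The following is a small sketch that parses that one line and divides by 
the configured pty count; it assumes the time is spread evenly over all 
256 legacy ptys, which is a simplification, and the regular expression 
is only meant to match lines of the shape shown here:

```python
import re

line = "initcall pty_init+0x0/0x3c8 returned 0 after 185427581 usecs"
pty_count = 256  # from CONFIG_LEGACY_PTY_COUNT above

# Pull out the function name, return code and duration in microseconds.
m = re.search(r"initcall (\S+?)\+\S* returned (-?\d+) after (\d+) usecs", line)
name, ret, usecs = m.group(1), int(m.group(2)), int(m.group(3))

total_s = usecs / 1e6
per_pty_s = total_s / pty_count
print(f"{name}: {total_s:.1f} s total, {per_pty_s:.2f} s per legacy pty")
```

So the initcall took a little over three minutes in total, or roughly 
0.7 s per pty under the even-split assumption.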

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-08 18:33 ` [boot-time] Bird, Tim
  2025-01-08 20:39   ` [boot-time] Marko Hoyer
@ 2025-01-08 23:00   ` Rob Landley
  2025-01-09  2:23     ` [boot-time] Bird, Tim
  2025-01-10 22:46   ` [boot-time] Marko Hoyer
  2 siblings, 1 reply; 21+ messages in thread
From: Rob Landley @ 2025-01-08 23:00 UTC (permalink / raw)
  To: Bird, Tim, Shankari, linux-embedded@vger.kernel.org

On 1/8/25 12:33, Bird, Tim wrote:
> In general, there's a lot of information on the elinux wiki which is stale, which needs to be
> updated or archived, or maybe even just removed.
> 
> This section of the Boot Time page has a lot of material in this category:
> https://elinux.org/Boot_Time#kernel_speedups

That page says "grab-boot-data.sh - see 
http://birdcloud.org/boot-time/Boot-time_Tools" but the link is 404.

> Anyone else reading this who wants to also participate in this project to
> update the elinux wiki boot time information, please contact me.

Maybe? What do you need?

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [boot-time]
  2025-01-08 23:00   ` [boot-time] Rob Landley
@ 2025-01-09  2:23     ` Bird, Tim
  0 siblings, 0 replies; 21+ messages in thread
From: Bird, Tim @ 2025-01-09  2:23 UTC (permalink / raw)
  To: Rob Landley, Shankari, linux-embedded@vger.kernel.org

> -----Original Message-----
> From: Rob Landley <rob@landley.net>
> On 1/8/25 12:33, Bird, Tim wrote:
> > In general, there's a lot of information on the elinux wiki which is stale, which needs to be
> > updated or archived, or maybe even just removed.
> >
> > This section of the Boot Time page has a lot of material in this category:
> > https://elinux.org/Boot_Time#kernel_speedups
> 
> That page says "grab-boot-data.sh - see
> http://birdcloud.org/boot-time/Boot-time_Tools" but the link is 404.

Hmmm.  That should be 'https', not 'http'.  When I click on it
my Chrome browser takes me to the https site (silently autocorrecting it?)
I've fixed the link on the elinux site.   Sorry about that.

Here is a direct link to the tool, but it's worth getting to the page to read
what kernel command line options it needs.
https://birdcloud.org/boot-time/Boot-time_Tools

https://birdcloud.org/boot-time-files/grab-boot-data.sh

> 
> > Anyone else reading this who wants to also participate in this project to
> > update the elinux wiki boot time information, please contact me.
> 
> Maybe? What do you need?

I am always looking for more data.  If you have a system you are running
Linux on, and can edit the kernel command line, I'd be very happy to have
you run grab-boot-data.sh.  It will collect a bunch of boot data and system
data, and send it to the birdcloud.org wiki, where I'm doing data analysis.

You can run the tool in a "don't send" mode and look at what it's about
to send, if you are concerned about privacy.

Thanks!
 -- Tim

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-08 18:33 ` [boot-time] Bird, Tim
  2025-01-08 20:39   ` [boot-time] Marko Hoyer
  2025-01-08 23:00   ` [boot-time] Rob Landley
@ 2025-01-10 22:46   ` Marko Hoyer
  2025-01-10 23:15     ` [boot-time] Rob Landley
  2 siblings, 1 reply; 21+ messages in thread
From: Marko Hoyer @ 2025-01-10 22:46 UTC (permalink / raw)
  To: Bird, Tim, Shankari, linux-embedded@vger.kernel.org

Hello Tim, all,

Am 08.01.25 um 19:33 schrieb Bird, Tim:
>> -----Original Message-----
>> From: Shankari <beingcap11@gmail.com>
>> Hi
>>
>> I wanted to provide an update on my recent contributions to the boot-time reduction project. I have recently started contributing and
>> am working with the beagleplay. I have been analyzing the boot time of the init process. Below is the output from the system log:
>>
>> debian@BeaglePlay:~$ dmesg | grep "init process"
>> [    1.480490] Run /init as init process
>>
>> Moving forward, I plan to explore ways to modify the command line and further investigate the data used for SIG analysis. This will
>> help me gain a deeper understanding of the boot process and its performance characteristics.
>>
>> Please let me know if you have any suggestions or areas where I could focus my efforts.
> Hi Shankari,
>
> It sounds like you are off to a good start.  I have something that needs to be done, that I think
> you can help with, and that matches where I believe you are in your status with being able
> to evaluate the kernel.
>
> In general, there's a lot of information on the elinux wiki which is stale, which needs to be
> updated or archived, or maybe even just removed.
>
> This section of the Boot Time page has a lot of material in this category:
> https://elinux.org/Boot_Time#kernel_speedups
>
> Can you validate the information on these 2 pages:
>   * https://elinux.org/Disable_Console
>   * https://elinux.org/Preset_LPJ
>
> This would consist of reading through the material, and testing the
> described techniques on your machine.  This will involve booting the
> machine 2 ways, with a particular kernel command line option and without
> it, and then reporting back the final boot time for both.  You can use
> the timestamp for the "init process" string as your final boot time, for the
> purposes of this exercise.
>
> Helping me to update the elinux wiki material on boot time would be
> an immense help, and is one of my main goals for the boot time SIG in 2025.
>
> Don't hesitate to ask questions if you have any.
>
> BTW - you can just report your findings to me and linux-embedded list, but
> alternatively (and even better) would be if you could also update the wiki
> pages themselves with your information based on recent kernels and hardware.
> To do this, you will need an elinux wiki account, which you can make online on
> elinux wiki.org by going to this page: https://elinux.org/Special:CreateAccount
>
> Anyone else reading this who wants to also participate in this project to
> update the elinux wiki boot time information, please contact me.

I'm just going through the "userspace application speed-up" parts in the 
wiki.


To the udev stuff ("Avoid udev, it takes ..." and "If you still like udev"):

* both hints seem a bit old-fashioned to me, taking the kernel's 
devtmpfs into account

* In my experience, device nodes are created by the kernel today; I 
have not needed any kind of mknod call for a long time

* udev's job today is mostly about

     - setup of access rights and user and group ids of device nodes

     - creating symlinks (e.g. for partitions named w/ their UUIDs, ...)

     - loading kernel modules

     - annotating metadata to uevents for interested userspace applications

* In any case, udev is still very expensive when used the conventional way

     - Once udev is started, a trigger is sent (by 
systemd-udev-trigger.service or udevadm trigger, for those who don't 
like systemd ;)) through the whole device tree to make all devices send 
uevents again so that udev can work on each device. This causes massive 
CPU load for a few seconds, which you don't want to spend so early in 
the morning.

* There are options like selective triggering, or deferring the trigger 
until later and doing the setup manually ...

So I think it is worth talking a bit about udev and options to deal with 
it, but adapting things a bit to today's world. I'm currently registering 
for the wiki, maybe I can set up an initial page at some point ...

Btw: If I'm wrong about my experience with devtmpfs and mknod, please 
correct me. And feel free to add other ideas to cope w/ udev in the 
context of startup-time-critical systems ...

Regards,

Marko



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-10 22:46   ` [boot-time] Marko Hoyer
@ 2025-01-10 23:15     ` Rob Landley
  2025-01-11  8:40       ` [boot-time] Marko Hoyer
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Landley @ 2025-01-10 23:15 UTC (permalink / raw)
  To: Marko Hoyer, Bird, Tim, Shankari, linux-embedded@vger.kernel.org

On 1/10/25 16:46, Marko Hoyer wrote:
> So I think it is worth talking a bit about udev and options to deal with 
> it but adapting thinks a bit to todays world. I'm currently registering 
> for the wiki, maybe I can setup an initial page at some time ...

busybox mdev is a lot lighter weight, and can do pretty elaborate things.

https://wiki.alpinelinux.org/wiki/Mdev

https://github.com/fff7d1bc/mdev-like-a-boss

The theory these days is you mount devtmpfs and then use mdev to add 
scriptable behavior to device insertion/removal events via netlink 
notifications.

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-10 23:15     ` [boot-time] Rob Landley
@ 2025-01-11  8:40       ` Marko Hoyer
  2025-01-11 17:56         ` [boot-time] Rob Landley
  0 siblings, 1 reply; 21+ messages in thread
From: Marko Hoyer @ 2025-01-11  8:40 UTC (permalink / raw)
  To: Rob Landley, Marko Hoyer, Bird, Tim, Shankari,
	linux-embedded@vger.kernel.org


Am 11.01.25 um 00:15 schrieb Rob Landley:
> On 1/10/25 16:46, Marko Hoyer wrote:
>> So I think it is worth talking a bit about udev and options to deal 
>> with it but adapting thinks a bit to todays world. I'm currently 
>> registering for the wiki, maybe I can setup an initial page at some 
>> time ...
>
> busybox mdev is a lot lighter weight, and can do pretty elaborate things.
>
> https://wiki.alpinelinux.org/wiki/Mdev
>
> https://github.com/fff7d1bc/mdev-like-a-boss
>
> The theory these days is you mount devtmpfs and then use mdev to add 
> scriptable behavior to device insertion/removal events via netlink 
> notifications.
>
> Rob

Hey Rob,

thx for the hint. Sounds good!

How is the enumeration of cold plugged devices realized in mdev? Is it 
similar to udev triggering all devices in the complete device tree?

Marko


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-11  8:40       ` [boot-time] Marko Hoyer
@ 2025-01-11 17:56         ` Rob Landley
  2025-01-11 18:57           ` [boot-time] Bird, Tim
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Landley @ 2025-01-11 17:56 UTC (permalink / raw)
  To: Marko Hoyer, Bird, Tim, Shankari, linux-embedded@vger.kernel.org

On 1/11/25 02:40, Marko Hoyer wrote:
> 
> Am 11.01.25 um 00:15 schrieb Rob Landley:
>> On 1/10/25 16:46, Marko Hoyer wrote:
>>> So I think it is worth talking a bit about udev and options to deal 
>>> with it but adapting thinks a bit to todays world. I'm currently 
>>> registering for the wiki, maybe I can setup an initial page at some 
>>> time ...
>>
>> busybox mdev is a lot lighter weight, and can do pretty elaborate things.
>>
>> https://wiki.alpinelinux.org/wiki/Mdev
>>
>> https://github.com/fff7d1bc/mdev-like-a-boss
>>
>> The theory these days is you mount devtmpfs and then use mdev to add 
>> scriptable behavior to device insertion/removal events via netlink 
>> notifications.
>>
>> Rob
> 
> Hey Rob,
> 
> thx for the hint. Sounds good!
> 
> How is the enumeration of cold plugged devices realized in mdev? Is it 
> similar to udev triggering all devices in the complete device tree?

mdev -s will scan /sys for current devices, it can be run as a hotplug 
helper, and there's netlink support somewhere but I've never used it.

https://busybox.net/downloads/BusyBox.html#mdev

The hotplug helper doesn't require a persistent demon like netlink does 
(or udev), at boot time you "mdev -s" to scan, then when you register a 
hotplug helper the kernel spawns a new process with environment 
variables whenever there's something new to do. The downside is if a lot 
of events come in rapidly it can spawn a lot of processes in parallel 
which makes sequencing difficult, which is why the netlink API exists as 
an alternative, but that doesn't really happen in systems I've put 
together, so...
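
To make the hotplug-helper model concrete, here is a minimal sketch of 
what such a helper sees. The kernel runs the program whose path was 
written to /proc/sys/kernel/hotplug (which needs CONFIG_UEVENT_HELPER) 
and passes the event in environment variables such as ACTION, DEVPATH, 
MAJOR, MINOR and DEVNAME. This is not how mdev is implemented, Python is 
an unlikely choice on a small system, and the function name is 
hypothetical; it only shows the shape of the interface:

```python
import os


def describe_event(env):
    """Summarise one hotplug event from its environment variables."""
    action = env.get("ACTION", "?")
    devpath = env.get("DEVPATH", "?")
    major, minor = env.get("MAJOR"), env.get("MINOR")
    if major is not None and minor is not None:
        # Events for things with a device node carry MAJOR/MINOR and
        # usually DEVNAME; fall back to the last DEVPATH component.
        name = env.get("DEVNAME", os.path.basename(devpath))
        return f"{action} /dev/{name} ({major}:{minor})"
    return f"{action} {devpath} (no device node)"


if __name__ == "__main__":
    # When spawned by the kernel, the event is in our own environment.
    print(describe_event(os.environ))
```

Each event gets its own short-lived process, which is where the 
ordering problem under bursts of events comes from.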

I wrote some introductory documentation about this back in 2007. It's a 
bit stale, and never REALLY got finished, but...

https://landley.net/kdocs/local/hotplug2.html

That's the context within which sysfs happened.

/dev and /sys serve different purposes: /dev shows the device drivers' 
view of the system, full of devices that don't actually exist like
/dev/null, or five devices for one piece of hardware (partitions), 
meanwhile a device that shows up but doesn't have a driver bound to it 
yet won't be in /dev at all. This is half the reason the old 
demon-managed "devfs" failed, it was CONCEPTUALLY wrong. (The other half 
was it used crazy solaris names for everything so people looking for 
/dev/hda1 couldn't find it and had to deal with some 9 character long 
monstrosity instead. Plus Linux isn't a microkernel so expecting a 
userspace demon to be necessary for /dev to _exist_ was just silly, and 
also led to some problems booting the system because where does that 
demon get its information from, eh?)

sysfs is a hardware view of the system, where /sys/devices/pci0000:00 is 
full of what bus probing found, and  /sys/block/sda/dev contains "8:0" 
(major and minor number) when a driver binds to something and goes 
"mine", but something still had to mknod that.

devtmpfs is a synthetic filesystem that just DOES that, when a new "dev" 
node shows up under /sys/class or /sys/block it creates the appropriate 
char or block device under dev with that major/minor and the same name 
the directory uses (which is provided by the driver).
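
A rough sketch of that scan in userspace terms, in the spirit of 
"mdev -s": look for "dev" files under /sys/class and /sys/block and read 
the major:minor pair from each. It only reports what it finds and never 
calls mknod. The function names are made up, the fixed search depth is a 
simplification, and using the directory name as the node name follows 
the description above rather than any particular implementation:

```python
import os


def read_dev(path):
    """Return (major, minor) from path/dev, or None if there is none."""
    try:
        with open(os.path.join(path, "dev")) as f:
            major, minor = f.read().strip().split(":")
        return int(major), int(minor)
    except (OSError, ValueError):
        return None


def scan_sysfs(sys_root="/sys"):
    """List (name, 'c' or 'b', major, minor) for devices found in sysfs."""
    nodes = []

    def visit(parent, kind, depth):
        try:
            names = sorted(os.listdir(parent))
        except OSError:
            return
        for name in names:
            path = os.path.join(parent, name)
            if not os.path.isdir(path):
                continue
            dev = read_dev(path)
            if dev is not None:
                nodes.append((name, kind, dev[0], dev[1]))
            if depth > 1:
                visit(path, kind, depth - 1)  # e.g. partitions under a disk

    class_dir = os.path.join(sys_root, "class")
    try:
        classes = sorted(os.listdir(class_dir))
    except OSError:
        classes = []
    for cls in classes:
        if cls != "block":  # block devices are picked up from /sys/block
            visit(os.path.join(class_dir, cls), "c", 1)
    visit(os.path.join(sys_root, "block"), "b", 2)
    return nodes


if __name__ == "__main__":
    for name, kind, major, minor in scan_sysfs():
        print(f"/dev/{name} {kind} {major}:{minor}")
```

With devtmpfs mounted, the kernel has already done the equivalent of 
this, which is why the scan is cheap compared to replaying uevents for 
the whole device tree.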

Oh, "synthetic" filesystem is one of the four types of filesystem: block 
backed, char/pipe backed, ram backed, and synthetic. I wrote 
documentation about that a very long time ago...

https://landley.net/toybox/doc/mount.html

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [boot-time]
  2025-01-11 17:56         ` [boot-time] Rob Landley
@ 2025-01-11 18:57           ` Bird, Tim
  2025-01-12  1:03             ` [boot-time] Rob Landley
  0 siblings, 1 reply; 21+ messages in thread
From: Bird, Tim @ 2025-01-11 18:57 UTC (permalink / raw)
  To: Rob Landley, Marko Hoyer, Shankari,
	linux-embedded@vger.kernel.org



> -----Original Message-----
> From: Rob Landley <rob@landley.net>
> 
> On 1/11/25 02:40, Marko Hoyer wrote:
> >
> > Am 11.01.25 um 00:15 schrieb Rob Landley:
> >> On 1/10/25 16:46, Marko Hoyer wrote:
> >>> So I think it is worth talking a bit about udev and options to deal
> >>> with it but adapting thinks a bit to todays world. I'm currently
> >>> registering for the wiki, maybe I can setup an initial page at some
> >>> time ...
> >>
> >> busybox mdev is a lot lighter weight, and can do pretty elaborate things.
> >>
> >> https://wiki.alpinelinux.org/wiki/Mdev
> >>
> >> https://github.com/fff7d1bc/mdev-like-a-boss
> >>
> >> The theory these days is you mount devtmpfs and then use mdev to add
> >> scriptable behavior to device insertion/removal events via netlink
> >> notifications.
> >>
> >> Rob
> >
> > Hey Rob,
> >
> > thx for the hint. Sounds good!
> >
> > How is the enumeration of cold plugged devices realized in mdev? Is it
> > similar to udev triggering all devices in the complete device tree?
> 
> mdev -s will scan /sys for current devices, it can be run as a hotplug
> helper, and there's netlink support somewhere but I've never used it.
> 
> https://busybox.net/downloads/BusyBox.html#mdev
> 
> The hotplug helper doesn't require a persistent demon like netlink does
> (or udev), at boot time you "mdev -s" to scan, then when you register a
> hotplug helper the kernel spawns a new process with environment
> variables whenever there's something new to do. The downside is if a lot
> of events come in rapidly it can spawn a lot of processes in parallel
> which makes sequencing difficult, which is why the netlink API exists as
> an alternative, but that doesn't really happen in systems I've put
> together, so...
> 
> I wrote some introductory documentation about this back in 2007. It's a
> bit stale, and never REALLY got finished, but...
> 
> https://landley.net/kdocs/local/hotplug2.html
> 
> That's the context within which sysfs happened.
> 
> /dev and /sys serve different purposes: /dev shows the device drivers'
> view of the system, full of devices that don't actually exist like
> /dev/null, or five devices for one piece of hardware (partitions),
> meanwhile a device that shows up but doesn't have a driver bound to it
> yet won't be in /dev at all. This is half the reason the old
> demon-managed "devfs" failed, it was CONCEPTUALLY wrong. (The other half
> was it used crazy solaris names for everything so people looking for
> /dev/hda1 couldn't find it and had to deal with some 9 character long
> monstrosity instead. Plus Linux isn't a microkernel so expecting a
> userspace demon to be necessary for /dev to _exist_ was just silly, and
> also led to some problems booting the system because where does that
> demon get its information from, eh?)
> 
> sysfs is a hardware view of the system, where /sys/devices/pci0000:00 is
> full of what bus probing found, and  /sys/block/sda/dev contains "8:0"
> (major and minor number) when a driver binds to something and goes
> "mine", but something still had to mknod that.
> 
> devtmpfs is a synthetic filesystem that just DOES that, when a new "dev"
> node shows up under /sys/class or /sys/block it creates the appropriate
> char or block device under dev with that major/minor and the same name
> the directory uses (which is provided by the driver).
> 
> Oh, "synthetic" filesystem is one of the four types of filesystem: block
> backed, char/pipe backed, ram backed, and synthetic. I wrote
> documentation about that a very long time ago...
> 
> https://landley.net/toybox/doc/mount.html
Hey Rob, This is a great review of /dev, /sys and the different
ways that /dev gets populated.

For a lot of embedded Linux devices, the only bus where
new items can show up dynamically is USB.

I've always thought the best solution (in terms of boot time)
was to use static nodes in /dev during early boot (that is, just
mknod the /dev nodes in the rootfs manually, and have them be present
before the kernel even runs).  No dynamic discovery or boot-time
population of /dev needed.  Then, sometime later, use either
mdev or devtmpfs to accumulate (and remove?) other
runtime-plugged devices.  This can be done after the time-critical
phase of booting.

Does this overall approach work, or are there some in-kernel
connections that may be missing if the dynamic tools are
not used from startup?
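
For concreteness, a minimal sketch of the static part, assuming a
handful of standard nodes. The table contents, the helper name, and the
rootfs path are illustrative; the major/minor numbers are the usual ones
from the kernel's devices.txt. The helper only prints the commands,
since mknod needs root:

```shell
#!/bin/sh
# Emit the mknod commands for a small static /dev, to be run as root
# while building the rootfs image.
# static_dev_cmds is a hypothetical helper; $1 is the rootfs directory.
static_dev_cmds() {
  root=$1
  # Columns: name, type (c = char), major, minor, permission mode.
  while read -r name type major minor mode; do
    echo "mknod -m $mode $root/dev/$name $type $major $minor"
  done <<'EOF'
console c 5 1 600
null c 1 3 666
zero c 1 5 666
tty c 5 0 666
ttyS0 c 4 64 660
EOF
}
```

Something like "static_dev_cmds ./rootfs | sudo sh" would then create
the nodes at image build time.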

Thanks,
 -- Tim


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-11 18:57           ` [boot-time] Bird, Tim
@ 2025-01-12  1:03             ` Rob Landley
  2025-01-12 10:11               ` [boot-time] Marko Hoyer
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Landley @ 2025-01-12  1:03 UTC (permalink / raw)
  To: Bird, Tim, Marko Hoyer, Shankari, linux-embedded@vger.kernel.org

On 1/11/25 12:57, Bird, Tim wrote:
> Hey Rob, This is a great review of /dev, /sys and the different
> ways that /dev gets populated.

Feel free to link stuff from wikis or some such. The newest of those 
documents was written in 2007.

> For a lot of embedded Linux devices, the only bus where
> new items can show up dynamically is USB.

Yup, /sys/bus/usb/devices is in there too and when a driver binds to 
them, they wind up in /sys/block and such as well. (you USED to have to 
separately mount a usbfs under /sys but they finally acknowledged that 
was silly about 5 years ago, hence 
https://askubuntu.com/questions/1218321/if-usbfs-has-been-deprecated-then-why-is-sys-bus-usb-drivers-usbfs-directory-p)

When a driver DOESN'T automatically bind to them it gets a bit 
complicated, and one of the things mdev can be configured to do is act 
as a firmware loader! Which is just... Ahem, there are YEARS of poor 
design decisions the kernel guys made, where they ignored a mechanism 
they already had and implemented something more complicated. The 
mechanism whereby the kernel opens a firmware file and reads it directly 
out of the filesystem instead of calling a hotplug helper was... I'm 
just going to gloss over that.

Anyway, I saw patches go by ala 
https://patchwork.ozlabs.org/project/buildroot/patch/1436188175-7912-1-git-send-email-luca@lucaceresoli.net/ 
which says it's from 2015. I haven't really tried to do it because I 
often know how the plumbing works and usually just implement a ten line 
hack rather than looking up how to configure the more generic tool. 
(Even the generic tool grew out of something I wrote 10 years earlier...)

Anyway, the kernel's "request module" plumbing tends to do a "give me 
usb-vendorID-deviceID" thing, and then there's alias plumbing lookup that 
figures out what module name to insmod for that, and at various points 
I've seen said alias lookup plumbing A) in the kernel, B) in module 
headers, C) in modprobe config files under /etc or /lib or something.
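
A minimal sketch of option C, assuming the text format depmod writes to
modules.alias ("alias PATTERN MODULE" per line, with shell-style globs
in the pattern). The helper name is made up for illustration, and real
modprobe does considerably more (blacklists, install rules,
dependencies):

```shell
#!/bin/sh
# Resolve a modalias string to module names by glob-matching it against
# "alias PATTERN MODULE" lines.
# resolve_alias is a hypothetical helper, not part of any real tool.
# $1 is the modalias string, $2 is the alias file to search.
resolve_alias() {
  modalias=$1
  aliasfile=$2
  while read -r keyword pattern module; do
    [ "$keyword" = alias ] || continue       # skip comments and other lines
    case "$modalias" in
      $pattern) echo "$module" ;;            # unquoted: treated as a glob
    esac
  done < "$aliasfile"
}
```

The modalias string for a device can be read from its "modalias" file in
sysfs. With kmod's modprobe, "modprobe -R" on that string should give
the same kind of answer from the real database.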

*shrug* I build static kernels when given a choice, and don't source 
hardware that needs drivers WITHOUT built-in firmware, so I am WAY out 
of date on that stuff. I remember enough to look it up but not the 
details off the top of my head. And I say that as someone who really 
SHOULD care more:

http://lists.landley.net/pipermail/toybox-landley.net/2024-October/030549.html

It's on the todo list...

> I've always thought the best solution (in terms of boot time)
> was to use static nodes in /dev during early boot (that is, just
> mknod the /dev nodes in the rootfs manually, and have them be present
> before the kernel even runs).  No dynamic discovery or boot-time
> population of /dev needed.

The system knows what devices are available. While you can mknod a 
major:minor node the kernel doesn't have a driver for, if you open it 
you get some sort of -EWTF where the kernel goes "nope".

The kernel has a CONFIG_DEVTMPFS_MOUNT that automatically mounts 
devtmpfs on /dev, but for SOME reason it doesn't apply to initramfs. 
I've been irregularly posting a patch to MAKE it apply on and off for 
most of a decade now:

https://lkml.iu.edu/hypermail/linux/kernel/2005.1/09399.html

And it's part of my mkroot kernel patches:

https://landley.net/bin/mkroot/latest/linux-patches/0003-Wire-up-CONFIG_DEVTMPFS_MOUNT-to-initramfs.patch

But Greg KH. Oh well...

> Then, sometime later, use either
> mdev or devtmpfs to accumulate (and remove?) other
> runtime-plugged devices.  This can be done after the time-critical
> phase of booting.
> 
> Does this overall approach work, or is there some in-kernel
> connections that may be missing if the dynamic tools are
> not used from startup?

I mean it more or less works, it's just... pointless manual maintenance 
of something the kernel does for you in a very small amount of code? (In 
devtmpfs, the /dev node being there means something. In a static /dev, 
it doesn't.)

So: I blather on a lot about my mkroot project, which is a 400 line bash 
script that builds a tiny Linux system that boots to a shell prompt (mostly 
under qemu) on a dozen different architectures. And the init script in 
that starts here:

https://github.com/landley/toybox/blob/master/mkroot/mkroot.sh#L102

And you see how the script does this setup (which is only needed when 
you don't apply my patch to the kernel):

if ! mountpoint -q dev; then
   mount -t devtmpfs dev dev
   [ $$ -eq 1 ] && ! 2>/dev/null <0 && exec 0<>/dev/console 1>&0 2>&1
   for i in ,fd /0,stdin /1,stdout /2,stderr
   do ln -sf /proc/self/fd${i/,*/} dev/${i/*,/}; done
   mkdir -p dev/shm
   chmod +t /dev/shm
fi

(Don't ask me why devtmpfs doesn't automatically have a "shm" directory 
with the sticky bit, it's a subclass of tmpfs so it does the right thing 
when it's there. And the middle two lines are just making 
/dev/{stdin,stdout,stderr} because toysh doesn't special case /dev/stdin 
like bash does so needs it in the filesystem if you're gonna use that.)
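
Spelled out, the loop expands as follows. This sketch uses the POSIX
${i%%,*} and ${i#*,} forms instead of the ${i/,*/} and ${i/*,/}
substitutions, which give the same result here because each item
contains exactly one comma. The function name is hypothetical and it
only prints the commands:

```shell
#!/bin/sh
# Show what the "for i in ,fd /0,stdin ..." loop expands to.
# ${i%%,*} keeps the text before the comma (the /proc/self/fd suffix),
# ${i#*,} keeps the text after it (the name created under dev/).
# show_links is a hypothetical name; it creates nothing.
show_links() {
  for i in ,fd /0,stdin /1,stdout /2,stderr; do
    echo "ln -sf /proc/self/fd${i%%,*} dev/${i#*,}"
  done
}
```

Calling show_links prints:

  ln -sf /proc/self/fd dev/fd
  ln -sf /proc/self/fd/0 dev/stdin
  ln -sf /proc/self/fd/1 dev/stdout
  ln -sf /proc/self/fd/2 dev/stderr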

But that exec redirect line hit a bug, because when you don't have 
devtmpfs automounting but ALSO don't have /dev/console in your 
statically linked initramfs image... oh here, I explained it when I 
fixed it:

https://github.com/landley/toybox/commit/0b2d5c2bb3f1

https://landley.net/notes-2024.html#06-08-2024

tl;dr in init/main.c the kernel tries to open /dev/console and fails if 
it's not already there, so PID 1 starts with stdin, stdout, and stderr 
closed, so if something goes wrong in early boot init can't tell you why 
it failed. This ONLY happens with static initramfs (created with cpio as 
a normal user and you can't mknod without root access), because of 
COURSE the kernel has two different codepaths for static vs dynamic, and 
the dynamic one does a manual fixup for this issue. No really!

https://github.com/torvalds/linux/blob/master/init/noinitramfs.c#L18

And of course the code doing the fixup runs for ANY system that doesn't 
have a static linked initramfs, including one where the bootloader 
points it at an EXTERNAL cpio.gz image like qemu -initrd blah.cpio.gz, 
because it checks for the external one AFTER doing the fixup. (Pop quiz: 
if you have static _and_ external initrd, and they include the same 
file, does the old one prevent the new one from extracting, does the new 
one replace the old one, or does the new one APPEND to the old one? At 
various points in history, it's done ALL THREE! I forget which is 
current, replace I think?) So external initramfs and built-in initramfs 
behave DIFFERENTLY in a subtle sharp edge that I've complained at them 
about for YEARS...
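
For an init script that has to cope with this on an unpatched kernel, a
minimal sketch of the check, assuming descriptors 0-9 and that /dev/null
already exists (so after devtmpfs is mounted). fd_is_open is a
hypothetical helper name, not something from mkroot:

```shell
#!/bin/sh
# Report whether a file descriptor (0-9) is open in the current shell.
# The subshell keeps a failed redirection from aborting the caller.
# Needs /dev/null, so call it after /dev is populated.
fd_is_open() {
  ( : <&"$1" ) 2>/dev/null
}

# Possible use in an init script, reattaching stdio only when the kernel
# could not open /dev/console for us:
#   fd_is_open 0 || exec 0<>/dev/console 1>&0 2>&1
```

The commented-out line is the same reattach the mkroot snippet does,
just behind a differently spelled test.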

*shrug* I make this stuff work in as simple a way as I know how. Been 
doing it for an embarrassingly long time now. But you _reach_ simple by 
process of elimination, and proving a negative is a lot of work.

> Thanks,
>   -- Tim
> 

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-12  1:03             ` [boot-time] Rob Landley
@ 2025-01-12 10:11               ` Marko Hoyer
  2025-01-12 13:39                 ` [boot-time] Francesco Valla
  2025-01-12 18:35                 ` [boot-time] Rob Landley
  0 siblings, 2 replies; 21+ messages in thread
From: Marko Hoyer @ 2025-01-12 10:11 UTC (permalink / raw)
  To: Rob Landley, Bird, Tim, Marko Hoyer, Shankari,
	linux-embedded@vger.kernel.org


Am 12.01.25 um 02:03 schrieb Rob Landley:
> On 1/11/25 12:57, Bird, Tim wrote:
>> Hey Rob, This is a great review of /dev, /sys and the different
>> ways that /dev gets populated.
>
> Feel free to link stuff from wikis or some such. The newest of those 
> documents was written in 2007.
>
>> For a lot of embedded Linux devices, the only bus where
>> new items can show up dynamically is USB.

SDCARD readers connected via MMC are common in automotive head units as 
well ...


>
> Yup, /sys/bus/usb/devices is in there too and when a driver binds to 
> them, they wind up in /sys/block and such as well. (you USED to have 
> to separately mount a usbfs under /sys but they finally acknowledged 
> that was silly about 5 years ago, hence 
> https://askubuntu.com/questions/1218321/if-usbfs-has-been-deprecated-then-why-is-sys-bus-usb-drivers-usbfs-directory-p)
>
> When a driver DOESN'T automatically bind to them it gets a bit 
> complicated, and one of the things mdev can be configured to do is act 
> as a firmware loader! Which is just... Ahem, there are YEARS of poor 
> design decisions the kernel guys made, where they ignored a mechanism 
> they already had and implemented something more complicated. The 
> mechanism whereby the kernel opens a firmware file and reads it 
> directly out of the filesystem instead of calling a hotplug helper 
> was... I'm just going to gloss over that.

WiFi & Bluetooth devices often use this firmware mechanism. And yes, I 
agree, it looks a bit **ugly** to see the kernel load a firmware 
file from /lib/firmware, searching for it in the root file system w/o 
knowing the state of it during boot ... For WiFi and Bluetooth I do not 
see a big issue here, since by system design I'd avoid putting such 
features on the critical chain anyway: bringing them up and 
(re)connecting external devices is time consuming by nature. Nothing you 
should need to wait for ...

> Anyway, I saw patches go by ala 
> https://patchwork.ozlabs.org/project/buildroot/patch/1436188175-7912-1-git-send-email-luca@lucaceresoli.net/ 
> which says it's from 2015. I haven't really tried to do it because I 
> often know how the plumbing works and usually just implement a ten 
> line hack rather than looking up how to configure the more generic 
> tool. (Even the generic tool grew out of something I wrote 10 years 
> earlier...)
>
> Anyway, the kernel's "request module" plumbing tends to do a "give me 
> usb-vendorID-deviceID" thing, and then there's alias plumbing lookup 
> that figures out what module name to insmod for that, and at various 
> points I've seen said alias lookup plumbing A) in the kernel, B) in 
> module headers, C) in modprobe config files under /etc or /lib or 
> something.
>
> *shrug* I build static kernels when given a choice, and don't source 
> hardware that needs drivers WITHOUT built-in firmware, so I am WAY out 
> of date on that stuff.

Compiling in modules vs. loading them later from user space is a 
trade-off. The effect of putting stuff into modules is to keep the 
kernel small which helps you in the "unpacking & loading kernel" phase 
before the kernel is actually started. A 1MB unpacked kernel makes a 
significant difference compared to a 5MB one. On the other hand, my 
experience is that there is a lot of overhead (CPU time and IO) loading 
modules from user space. So it really only makes sense, if you have 
drivers to load at a point in time during startup where you have enough 
time and resources left.

> I remember enough to look it up but not the details off the top of my 
> head. And I say that as someone who really SHOULD care more:
>
> http://lists.landley.net/pipermail/toybox-landley.net/2024-October/030549.html 
>
>
> It's on the todo list...
>
>> I've always thought the best solution (in terms of boot time)
>> was to use static nodes in /dev during early boot (that is, just
>> mknod the /dev nodes in the rootfs manually, and have them be present
>> before the kernel even runs).  No dynamic discovery or boot-time
>> population of /dev needed.
>
> The system knows what devices are available. While you can mknod a 
> major:minor node the kernel doesn't have a driver for, if you open it 
> you get some sort of -EWTF where the kernel goes "nope".
>
> The kernel has a CONFIG_DEVTMPFS_MOUNT that automatically mounts 
> devtmpfs on /dev, but for SOME reason it doesn't apply to initramfs. 
> I've been irregularly posting a patch to MAKE it apply on and off for 
> most of a decade now:
>
> https://lkml.iu.edu/hypermail/linux/kernel/2005.1/09399.html
>
> And it's part of my mkroot kernel patches:
>
> https://landley.net/bin/mkroot/latest/linux-patches/0003-Wire-up-CONFIG_DEVTMPFS_MOUNT-to-initramfs.patch 
>
>
> But Greg KH. Oh well...
>
>> Then, sometime later, use either
>> mdev or devtmpfs to accumulate (and remove?) other
>> runtime-plugged devices.  This can be done after the time-critical
>> phase of booting.
>>
>> Does this overall approach work, or is there some in-kernel
>> connections that may be missing if the dynamic tools are
>> not used from startup?
>
> I mean it more or less works, it's just... pointless manual 
> maintenance of something the kernel does for you in a very small 
> amount of code? (In devtmpfs, the /dev node being there means 
> something. In a static /dev, it doesn't.)

I agree. There is a kind of dynamic device enumeration done by the kernel 
drivers anyway once loaded. Any data structures for devices are built up 
internally. Nothing you can save ...

I'm not even sure how devtmpfs can be combined w/ static devnodes 
created in any kind of persistent partition. And even if you can get 
the kernel to accept your partition as /dev, it needs to be writeable 
for the dynamic cases you might need (USB for instance), which does not 
really go well with a read-only RFS ... You could ... overlay fs ... 
well no, I think this goes in the wrong direction -> too 
complicated ;)

>
> So: I blather on a lot about my mkroot project, which is a 400 line 
> bash script that builds a tiny Linux system that boots to a shell prompt 
> (mostly under qemu) on a dozen different architectures. And the init 
> script in that starts here:
>
> https://github.com/landley/toybox/blob/master/mkroot/mkroot.sh#L102
>
> And you see how the script does this setup (which is only needed when 
> you don't apply my patch to the kernel):
>
> if ! mountpoint -q dev; then
>   mount -t devtmpfs dev dev
>   [ $$ -eq 1 ] && ! 2>/dev/null <0 && exec 0<>/dev/console 1>&0 2>&1
>   for i in ,fd /0,stdin /1,stdout /2,stderr
>   do ln -sf /proc/self/fd${i/,*/} dev/${i/*,/}; done
>   mkdir -p dev/shm
>   chmod +t /dev/shm
> fi
>
> (Don't ask me why devtmpfs doesn't automatically have a "shm" directory 
> with the sticky bit, it's a subclass of tmpfs so it does the right 
> thing when it's there. And the middle two lines are just making 
> /dev/{stdin,stdout,stderr} because toysh doesn't special case 
> /dev/stdin like bash does so needs it in the filesystem if you're 
> gonna use that.)
>
> But that exec redirect line hit a bug, because when you don't have 
> devtmpfs automounting but ALSO don't have /dev/console in your 
> statically linked initramfs image... oh here, I explained it when I 
> fixed it:
>
> https://github.com/landley/toybox/commit/0b2d5c2bb3f1
>
> https://landley.net/notes-2024.html#06-08-2024
>
> tl;dr in init/main.c the kernel tries to open /dev/console and fails if 
> it's not already there, so PID 1 starts with stdin, stdout, and stderr 
> closed, so if something goes wrong in early boot init can't tell you 
> why it failed. This ONLY happens with static initramfs (created with 
> cpio as a normal user and you can't mknod without root access), 
> because of COURSE the kernel has two different codepaths for static vs 
> dynamic, and the dynamic one does a manual fixup for this issue. No 
> really!
>
> https://github.com/torvalds/linux/blob/master/init/noinitramfs.c#L18
>
> And of course the code doing the fixup runs for ANY system that 
> doesn't have a static linked initramfs, including one where the 
> bootloader points it at an EXTERNAL cpio.gz image like qemu -initrd 
> blah.cpio.gz, because it checks for the external one AFTER doing the 
> fixup. (Pop quiz: if you have static _and_ external initrd, and they 
> include the same file, does the old one prevent the new one from 
> extracting, does the new one replace the old one, or does the new one 
> APPEND to the old one? At various points in history, it's done ALL 
> THREE! I forget which is current, replace I think?) So external 
> initramfs and built-in initramfs behave DIFFERENTLY in a subtle sharp 
> edge that I've complained at them about for YEARS...
>
> *shrug* I make this stuff work in as simple a way as I know how. Been 
> doing it for an embarrassingly long time now. But you _reach_ simple 
> by process of elimination, and proving a negative is a lot of work.
>
>> Thanks,
>>   -- Tim
>>
>
> Rob
>
To summarize from my point of view:

* It's worth talking a bit about the effect of udev and about alternatives

* "mdev" is surely worth naming as a potential option besides 
"selective triggering" and "static setup and moving triggers back in time"

* I wouldn't regard mknod as a real alternative in today's systems

* In addition, I can imagine "module loading" vs. "compiling in 
drivers" is something worth mentioning

* Once I've access to the wiki, I can try to put these ideas into an 
initial structure filled up w/ info we discussed in this thread

Marko


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-12 10:11               ` [boot-time] Marko Hoyer
@ 2025-01-12 13:39                 ` Francesco Valla
  2025-01-12 18:35                 ` [boot-time] Rob Landley
  1 sibling, 0 replies; 21+ messages in thread
From: Francesco Valla @ 2025-01-12 13:39 UTC (permalink / raw)
  To: Rob Landley, Bird, Tim, Marko Hoyer, Shankari,
	linux-embedded@vger.kernel.org

On Sunday, 12 January 2025 at 11:11:44 Marko Hoyer <mhoyer.oss-devel@freenet.de> wrote:
> 
> Am 12.01.25 um 02:03 schrieb Rob Landley:
> > On 1/11/25 12:57, Bird, Tim wrote:
> >> Hey Rob, This is a great review of /dev, /sys and the different
> >> ways that /dev gets populated.
> >
> > Feel free to link stuff from wikis or some such. The newest of those 
> > documents was written in 2007.
> >
> >> For a lot of embedded Linux devices, the only bus where
> >> new items can show up dynamically is USB.
> 
> SDCARD readers connected via MMC are common in automotive head units as 
> well ...
> 
> 
> >
> > Yup, /sys/bus/usb/devices is in there too and when a driver binds to 
> > them, they wind up in /sys/block and such as well. (you USED to have 
> > to separately mount a usbfs under /sys but they finally acknowledged 
> > that was silly about 5 years ago, hence 
> > https://askubuntu.com/questions/1218321/if-usbfs-has-been-deprecated-then-why-is-sys-bus-usb-drivers-usbfs-directory-p)
> >
> > When a driver DOESN'T automatically bind to them it gets a bit 
> > complicated, and one of the things mdev can be configured to do is act 
> > as a firmware loader! Which is just... Ahem, there are YEARS of poor 
> > design decisions the kernel guys made, where they ignored a mechanism 
> > they already had and implemented something more complicated. The 
> > mechanism whereby the kernel opens a firmware file and reads it 
> > directly out of the filesystem instead of calling a hotplug helper 
> > was... I'm just going to gloss over that.
> 
> WiFi & Bluetooth devices often use this firmware mechanism. And yes, I 
> agree, it looks a bit **ugly** to see the kernel load a firmware 
> file from /lib/firmware, searching for it in the root file system w/o 
> knowing the state of it during boot ... For WiFi and Bluetooth I do not 
> see a big issue here, since by system design I'd avoid putting such 
> features on the critical chain anyway: bringing them up and 
> (re)connecting external devices is time consuming by nature. Nothing you 
> should need to wait for ...
> 

The whole "try to access the rootfs during boot" domain is an area worth
investigating, as it *should* be simple to track the actual init state
and directly skip the accesses that aren't going to succeed. I recently
stumbled for example on the Ethernet PHY core trying to load modules
during init [1], but the firmware loading is another of such examples.


> To summarize from my point of view:
> 
> * It's worth talking a bit about the effect of udev and about alternatives
> 
> * "mdev" is surely worth naming as a potential option besides 
> "selective triggering" and "static setup and moving triggers back in time"
> 
> * I wouldn't regard mknod as a real alternative in today's systems
> 

Another approach that in my opinion is worth mentioning is: no udev/mdev at all.
In a couple of embedded products with a very limited scope I simply decided to
use devtmpfs + manual insmod + a simple bash script for USB automounting
registered as hotplug handler. Very few dependencies, no boot time parsing of
configuration files. It took a bit to configure the init sequence, but the
result was/is very satisfying.
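
A minimal sketch of what such a handler can look like, assuming the
kernel's uevent helper interface (event details arrive as environment
variables such as ACTION, SUBSYSTEM, DEVTYPE and DEVNAME). The /media
mount point, the DRY_RUN switch and the script path mentioned below are
assumptions for illustration, not the script from those products, and
this version mounts any new partition without checking that it sits on
USB:

```shell
#!/bin/sh
# Sketch of an automount handler meant to be registered as the kernel's
# hotplug helper. The kernel runs it once per event with the details in
# the environment. /media and DRY_RUN are assumptions of this example.
handle_event() {
  run=${DRY_RUN:+echo}                       # with DRY_RUN set, only print
  # Only block-device partitions are interesting here.
  [ "$SUBSYSTEM" = block ] && [ "$DEVTYPE" = partition ] || return 0
  case "$ACTION" in
    add)
      $run mkdir -p "/media/$DEVNAME" &&
      $run mount "/dev/$DEVNAME" "/media/$DEVNAME" ;;
    remove)
      $run umount -l "/media/$DEVNAME" ;;
  esac
}
handle_event
```

It would be registered with something like
"echo /sbin/usb-automount > /proc/sys/kernel/hotplug" (path
hypothetical), which needs CONFIG_UEVENT_HELPER in the kernel.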

> * In addition, I can imagine "module loading" vs. "compiling in 
> drivers" is something worth mentioning
> 
> * Once I've access to the wiki, I can try to put these ideas into an 
> initial structure filled up w/ info we discussed in this thread
> 
> Marko
> 
> 
> 

[1] https://lore.kernel.org/netdev/SJ0PR18MB5216A8D227B2B3651DB9AC0DDB152@SJ0PR18MB5216.namprd18.prod.outlook.com/T/

---

Regards,
Francesco





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [boot-time]
  2025-01-12 10:11               ` [boot-time] Marko Hoyer
  2025-01-12 13:39                 ` [boot-time] Francesco Valla
@ 2025-01-12 18:35                 ` Rob Landley
  1 sibling, 0 replies; 21+ messages in thread
From: Rob Landley @ 2025-01-12 18:35 UTC (permalink / raw)
  To: Marko Hoyer, Bird, Tim, Shankari, linux-embedded@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 13643 bytes --]

On 1/12/25 04:11, Marko Hoyer wrote:
> Am 12.01.25 um 02:03 schrieb Rob Landley:
>> On 1/11/25 12:57, Bird, Tim wrote:
>>> Hey Rob, This is a great review of /dev, /sys and the different
>>> ways that /dev gets populated.
>>
>> Feel free to link stuff from wikis or some such. The newest of those 
>> documents was written in 2007.
>>
>>> For a lot of embedded Linux devices, the only bus where
>>> new items can show up dynamically is USB.
> 
> SDCARD readers connected via MMC are common in automotive head units as 
> well ...

But do they give an insertion/removal notification that can generate an 
interrupt rather than needing to be polled? (Last couple of boards I 
poked at didn't, but it was cheap hardware...)

>> When a driver DOESN'T automatically bind to them it gets a bit 
>> complicated, and one of the things mdev can be configured to do is act 
>> as a firmware loader! Which is just... Ahem, there are YEARS of poor 
>> design decisions the kernel guys made, where they ignored a mechanism 
>> they already had and implemented something more complicated. The 
>> mechanism whereby the kernel opens a firmware file and reads it 
>> directly out of the filesystem instead of calling a hotplug helper 
>> was... I'm just going to gloss over that.
> 
> WIFI & Bluetooth devices often use this firmware mechanism.

The wifi and bluetooth _hardware_ is always there though. Transceiver 
link toggle is more or less a media insertion/removal event, which is a 
slightly different hotplug mechanism.

Ogres, onions... layers.

> And yes, I 
> agree, it looks a bit **ugly** to see the kernel load a firmware 
> file from /lib/firmware, searching for it in the root file system w/o 
> knowing the state of it during boot ...

They already HAD the hotplug helper mechanism and initramfs! You could 
already CALL A LOADER and some of us had that working and DEPLOYED 
before they built a whole new mechanism for "the kernel reaches out and 
reads a file out of the userspace view of the filesystem from kernel 
space without a process context to do it in like the ELF loader has, 
don't ask me what this means for containers and namespaces..."

(Ok, they wanted to load firmware before PID 1 launched, but they were 
already breaking the drivers into separate probe/init sections so you 
could probe before interrupts were started and init after interrupts were 
started, and launching PID 1 is the first thing that happens after 
interrupts are enabled (we have a scheduler now, the idle task can fork 
off PID 1 and PID 0 can run pause() in a loop). Except between those two 
the kernel launches a zillion "kernel threads" including the tasklets and 
deferred device initialization and so on...)

It wasn't just awkward, it was unnecessary. (And it DOES NOT SOLVE the 
underlying licensing issue of "this firmware is not GPL, I am bundling 
it into a statically linked initramfs, is this 'mere aggregation'? Let's 
see what a judge has to say!")

Meanwhile Bradley is in court ACTIVELY ARGUING that there's no 
difference between GPLv2 and GPLv3 and that the complete lack of any 
copyright holders willing to sign on to his increasingly extreme 
enforcement views isn't a problem because GPLv2 is a contract despite 
the complete absence of things like "privity of contract"... No really:

https://blog.tidelift.com/will-the-new-judicial-ruling-in-the-vizio-lawsuit-strengthen-the-gpl

I got dragged into this recently to spend a day telling a camera "no, 
Bradley's full of it", and yes he flew in to sit at the other end of the 
table for some reason:

https://landley.net/notes-2024.html#24-06-2024

Sigh. There's a reason I do 0BSD these days:

https://landley.net/toybox/license.html

> For WiFi and Bluetooth I do not 
> see a big issue here, since by system design I'd avoid putting such 
> features on the critical chain anyway: bringing them up and 
> (re)connecting external devices is time consuming by nature. Nothing you 
> should need to wait for ...

Except that reconnection mostly happens in software. The _hardware_ 
you're talking to stays connected. It's a resource 
acquisition/allocation problem sure, but closer to partition re-scanning.

*shrug* The asynchronous notifications that something happened behind 
your back come in through similar mechanisms, but if that's ALL we were 
dealing with we wouldn't have needed most of this plumbing.

(Although that was ANOTHER fun failure of the old devfs: /dev/eth0 isn't 
common, thanks to Bill Joy somehow not really understanding unix in 
1979. And of course renaming /dev/hda to /dev/sda is a big deal from a 
compatibility perspective, but the <strike>devfsd2</strike> systemd guys 
deciding that eth0: is now potato03x1: or some such? That's just fine, 
who cares about compatibility with that...)

> Compiling in modules vs. loading them later from user space is a trade- 
> off. The effect of putting stuff into modules is to keep the kernel 
> small which helps you in the "unpacking & loading kernel" phase before 
> the kernel is actually started. A 1MB unpacked kernel makes a 
> significant difference compared to a 5MB one.

If you can avoid ever loading the module, you may come out ahead. 
(Modulo why are you shipping it then, still needs storage.) Last I 
checked the actual module unloading was still a NOP half the time (the 
memory stays pinned) and marks your kernel "tainted" if you ever 
actually do it, which is not a vote of confidence in the codepath if you 
ask me.
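
(A minimal sketch for checking this on a running system. It assumes the bit layout in the kernel's tainted-kernels documentation, where a forced module unload sets bit 3, value 8, shown as 'R'. The function name is made up for illustration.)

```shell
# Report whether the "module was force unloaded" taint flag is set.
# Assumption: bit 3 (value 8) is that flag, per the tainted-kernels docs.
taint_forced_rmmod() {
  # $1 = taint value; defaults to the running kernel's current value
  value="${1:-$(cat /proc/sys/kernel/tainted)}"
  [ $(( value & 8 )) -ne 0 ]
}
```

Usage would be something like: taint_forced_rmmod && echo "a module was force unloaded"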

I had toybox insmod working years ago, but toybox _modprobe_ is still 
in pending because modprobe pulls fairly extensive shenanigans I am not 
personally familiar with and have to learn how to use before I can 
implement them, and they just seem like TERRIBLE IDEAS:

https://github.com/landley/toybox/issues/522

> On the other hand, my 
> experience is that there is lot of overhead (CPU time and IO) loading 
> modules from user space. So it really only makes sense, if you have 
> drivers to load at a point in time during startup where you have enough 
> time and resources left.

The kernel boot process is already fairly heavily asynchronous, which is 
why your shell prompt gets buried with "link up" notifications spamming 
the console after it prints the $ and so on. That's why mkroot's init 
script does echo 3 > /proc/sys/kernel/printk before the exec handoff to 
whatever inherits PID 1 from the setup script:

https://github.com/landley/toybox/blob/0.8.11/mkroot/mkroot.sh#L133

Because if it's a shell, and we don't do that, you won't see the prompt 
under the noise.
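
(Roughly what that looks like as a sketch, not the actual mkroot code. Writing a single number to /proc/sys/kernel/printk sets the console loglevel, so with 3 only messages more severe than KERN_ERR should still reach the console. The PRINTK_FILE override and the function name are assumptions added so the snippet can be tried without root.)

```shell
# Lower the console loglevel before handing PID 1 to the next program.
# PRINTK_FILE is a hypothetical override for trying this without root.
quiet_console() {
  printk_file="${PRINTK_FILE:-/proc/sys/kernel/printk}"
  # $1 = new console loglevel, defaulting to 3
  echo "${1:-3}" > "$printk_file"
}

# In an init script this would be followed by the handoff, e.g.:
#   quiet_console 3
#   exec /bin/sh
```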

>> I mean it more or less works, it's just... pointless manual 
>> maintenance of something the kernel does for you in a very small 
>> amount of code? (In devtmpfs, the /dev node being there means 
>> something. In a static /dev, it doesn't.)
> 
> I agree. There is a kind of dynamic device enumeration done by the kernel 
> drivers anyway once loaded. Any data structures for devices are built up 
> internally. Nothing you can save ...

I spent YEARS convincing the android guys to look at devtmpfs, 
initramfs, container plumbing... (Keep in mind Google bought Android 
Inc. in 2005 and shipped the first phone at the end of 2008, meaning 
their main development effort predated most of this plumbing and they 
had to retrofit it in much later.)  No idea how much impact I had and 
how much they would have eventually done anyway, but the main guy I was 
having those conversations with WAS the android base OS maintainer, 
so... Most recent was probably:

http://lists.landley.net/pipermail/toybox-landley.net/2022-August/029139.html

You'd think the early boot stuff was fairly straightforward, but I keep 
winding up being the one to manually fix crap like:

https://lkml.iu.edu/hypermail/linux/kernel/1306.3/04204.html

And then YEARS LATER, it's me who has to:

https://lore.kernel.org/lkml/8244c75f-445e-b15b-9dbf-266e7ca666e2@landley.net/

And then it had to be rewritten to remove my taint:

https://lkml.iu.edu/hypermail/linux/kernel/2311.1/01821.html
https://lkml.iu.edu/hypermail/linux/kernel/2311.2/02938.html

Let alone obvious polishing nonsense like:

https://lkml.iu.edu/hypermail/linux/kernel/1705.0/02640.html

(Which only went in because Andrew Morton picked it up despite Greg KH 
doing his usual stonewalling of literally anything from me. Oh well.)

Anyway, there's a reason I'm not really a kernel developer. When I try 
to engage with them myself, "crickets chirp" is pretty much the GOOD 
outcome...

https://lkml.iu.edu/hypermail/linux/kernel/1707.2/01797.html

Ahem. I'll stop now.

> I'm even not sure how devtmpfs can be combined w/ your static devnodes 
> you created in any kind of persistent partition.

You could mount your own /tmp and do mdev -s into it. That's what we 
used to do back around 2005:

https://lkml.iu.edu/hypermail/linux/kernel/0512.0/1326.html

(Also, when devtmpfs first went in, if you modified a node (touch, 
chattr, etc) then it wouldn't delete it and your management tool would 
have to delete it via hotplug removal event handling. So you could PIN 
nodes, I was just never clear on why you'd want to. It probably still 
does that?)

> And if you even can get 
> the kernel accepting your partition to use as /dev,

Kernel doesn't care.

> you need to have it 
> writeable for the case of dynamics you might need (usb for instance) 
> which does not really go well with a read only RFS ... You could ... 
> overlay fs ... well no, I think this goes into a wrong direction -> too 
> complicated ;)

If you just have a /tmp dir in initramfs with some starting nodes 
initialized via the cpio extractor, and then have something like mdev 
add things on top of that as they're hotplugged, initramfs is inherently 
writeable thus the /tmp dir would be.

There's a race condition where "I booted a device with USB already 
plugged into it before powerup, when is the hotplug event delivered and 
is it before the hotplug handler is registered", which I cared deeply 
about in 2005 and no longer remember the details of. I could try to dig 
them up out of my blog and the busybox/kernel mailing lists if you care?
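
(For reference, a sketch of the usual ordering from the busybox mdev documentation: register the hotplug helper first, then do the coldplug scan, so a device showing up in between gets handled twice instead of not at all. This assumes a kernel with the uevent helper enabled so /proc/sys/kernel/hotplug exists. HOTPLUG_FILE, MDEV and the function name are hypothetical, there so it can be tried without root.)

```shell
# Register the hotplug helper, then scan for devices that already exist.
# HOTPLUG_FILE and MDEV are hypothetical overrides for testing.
start_mdev() {
  hotplug_file="${HOTPLUG_FILE:-/proc/sys/kernel/hotplug}"
  mdev="${MDEV:-/sbin/mdev}"
  # Step 1: the kernel runs this helper for each later hotplug event.
  echo "$mdev" > "$hotplug_file" &&
  # Step 2: create nodes for devices that were present before step 1.
  "$mdev" -s
}
```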

> To summarize from my point of view:
> 
> * It's worth talking a bit about the effect of udev and about alternatives

I am not a fan of udev, for reasons that are part technical and part "oh 
those assholes" rant path I'm trying to avoid going down.

> * "mdev" is surely worth being named as a potential option besides 
> "selective triggering" and "static setup and moving triggers back in time"
> 
> * I wouldn't regard mknod as a real alternative in today's systems

It still comes up from time to time, usually when initializing 
containers. (Because devtmpfs in containers does NOT give a proper 
container-local view of its namespace.)

Once upon a time, you could use the linux kernel's built in initramfs 
generation plumbing to create a cpio with arbitrary contents by 
providing simple text snippets to supplement their scanner, including a 
/dev/console entry created as a normal user (without running as root!).

But of COURSE the kernel developers removed the ability, and I patched 
it back in (attached), and then went "no, not fighting that fight"...
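
(For anyone who hasn't seen the list format, a minimal sketch. The field order is taken from the usage text of usr/gen_init_cpio.c, and 5,1 is the usual char device number for /dev/console. The function name and the init path are made up for illustration.)

```shell
# Emit a minimal gen_init_cpio list on stdout. Field order:
#   dir  <name> <mode> <uid> <gid>
#   nod  <name> <mode> <uid> <gid> <dev_type> <major> <minor>
#   file <name> <location on build host> <mode> <uid> <gid>
minimal_cpio_list() {
  # $1 = path to an init binary on the build host (hypothetical)
  cat << EOF
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
file /init $1 0755 0 0
EOF
}
```

Something like "minimal_cpio_list ./init > list.txt" followed by "usr/gen_init_cpio list.txt > initramfs.cpio" should then produce the archive as a normal user, since the device node only exists as a line of text until the kernel extracts it.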

> * In addition, I can imagine "module loading" vs. "compiling in 
> drivers" is something worth mentioning

There's buckets of domain expertise there and I have like 1/3 of what 
I'd need to be confident there. (I know where to look it up, but have 
never considered it a good thing. Half the point of modules was to 
load/unload drivers for testing without reboots, and I just boot cycle a 
system under qemu or KVM when I can, and boot cycle a physical board 
when I can't because fiddling with modules really doesn't HELP my 
workflow. YMMV...)

The main other reason modules persist is out-of-tree drivers, usually 
not under GPL, which have been under systematic attack for well over a 
decade and the people still doing it have large teams writing shim code.

Most "let's use modules" decisions _since_ then boil down to either

1) "this is a generic PC hardware distro and I have no idea what 
hardware will be on there, and building every possible module into the 
kernel wastes a couple dozen megabytes of RAM on a system"

2) This mechanism exists, there must be a reason, therefore I should 
definitely use it because it's there.

(They built _mechanisms_ to prevent you from upgrading modules without 
upgrading the kernel they plug into. Note that the description of 
CONFIG_MODVERSIONS says that WITHOUT it you can't have even slight 
version skew. That's without MODULE_SIG and MODULE_SRCVERSION_ALL and so 
on.)

By the way, you can provide "module arguments" on the kernel command 
line, write to things like /sys/module/psmouse/parameters/rate after the 
driver's up...
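
(A sketch of both halves, assuming the documented module.param=value command line syntax, e.g. psmouse.rate=60. The helper names and the SYSFS_ROOT override are hypothetical, and the parser does not handle quoted values containing spaces.)

```shell
# Print the value given as <module>.<param>=<value> on a kernel command line.
cmdline_module_param() {
  # $1 = module, $2 = parameter, $3 = command line (default: /proc/cmdline)
  line="${3:-$(cat /proc/cmdline)}"
  for word in $line; do
    case "$word" in
      "$1.$2="*) echo "${word#*=}"; return 0 ;;
    esac
  done
  return 1
}

# Read a parameter's current value from sysfs once the driver is up.
# SYSFS_ROOT is a hypothetical override for trying this outside a real /sys.
module_param() {
  cat "${SYSFS_ROOT:-/sys}/module/$1/parameters/$2"
}
```

Changing a value at runtime is the same path in the other direction (echo 60 > /sys/module/psmouse/parameters/rate), for parameters the driver exposes as writable.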

> * Once I've access to the wiki, I can try to put these ideas into an 
> initial structure filled up w/ info we discussed in this thread
> 
> Marko

Good luck.

You know what we REALLY need a new version of? A rewrite of:

https://landley.net/kdocs/mirror/lki-single.html

With sections for each architecture. (And if you tried to write one, 
you'd hate Raspberry Pi as much as I do! That said, 
https://forums.raspberrypi.com/viewtopic.php?t=357536 is extremely 
promising, and a far sight better than 
https://github.com/christinaa/rpi-open-firmware ever got to. I 
haven't really dug into the details of what's still proprietary black 
box spyware subtly bugging your board with "system management mode" 
hijacks, and what they actually managed to work around despite not 
having hardware documentation for broadcom chips...)

Rob

[-- Attachment #2: 0011-gen_init_cpio-regression.patch --]
[-- Type: text/x-patch, Size: 1824 bytes --]

From: Rob Landley <rob@landley.net>
Date: Fri, 06 Oct 2023 02:56:19 -0500
Subject: [PATCH] Add gen_initramfs.sh -O

Add a -O option to output the list instead of the archive. (You can
specify -o after -O to produce both.)

For 15 years gen_initramfs_list.sh produced a text output format that
other things consumed and modified and fed back to the kernel, then
the script changed to consume the list internally and produce the cpio
archive directly. (Why they didn't just change gen_init_cpio.c to traverse
directories itself if they were going to take away the ability to filter
the list is an open question. Maybe it could handle filenames with spaces
in them if they'd done that? And why "squash" in-band signalling instead of
the -1 I submitted, which doesn't conflict with existing users because
integers aren't valid usernames...)

Signed-off-by: Rob Landley <rob@landley.net>
---

 usr/gen_initramfs.sh |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/usr/gen_initramfs.sh b/usr/gen_initramfs.sh
index 14b5782f961a..8f75988a5799 100755
--- a/usr/gen_initramfs.sh
+++ b/usr/gen_initramfs.sh
@@ -15,6 +15,7 @@ usage() {
 cat << EOF
 Usage:
 $0 [-o <file>] [-l <dep_list>] [-u <uid>] [-g <gid>] {-d | <cpio_source>} ...
+	-O <file>      Output annotated file list instead of archive
 	-o <file>      Create initramfs file named <file> by using gen_init_cpio
 	-l <dep_list>  Create dependency list named <dep_list>
 	-u <uid>       User ID to map to user ID 0 (root).
@@ -206,6 +207,15 @@ while [ $# -gt 0 ]; do
 			echo "deps_initramfs := \\" > $dep_list
 			shift
 			;;
+		"-O")	# Output annotated file list
+			unset output
+			trap - EXIT
+			[ "$1" = "-" ] &&
+				cpio_list="/dev/stdout" ||
+				cpio_list="$1"
+			shift
+			;;
+
 		"-o")	# generate cpio image named $1
 			output="$1"
 			shift


end of thread, other threads:[~2025-01-12 18:36 UTC | newest]

Thread overview: 21+ messages
     [not found] <CAORPcfVRobA+u5q7aPboC=3iY8dibDUB0920Z=Z0VgpQEupKJw@mail.gmail.com>
2025-01-08 18:33 ` [boot-time] Bird, Tim
2025-01-08 20:39   ` [boot-time] Marko Hoyer
2025-01-08 21:19     ` [boot-time] Bird, Tim
2025-01-08 23:26       ` [boot-time] Rob Landley
2025-01-09 13:02         ` [boot-time] Marko Hoyer
2025-01-09 21:10           ` [boot-time] Rob Landley
2025-01-09 21:35             ` [boot-time] Marko Hoyer
2025-01-09 22:31               ` [boot-time] Rob Landley
2025-01-09 12:43       ` [boot-time] Marko Hoyer
2025-01-09 13:27         ` [boot-time] Geert Uytterhoeven
2025-01-08 23:00   ` [boot-time] Rob Landley
2025-01-09  2:23     ` [boot-time] Bird, Tim
2025-01-10 22:46   ` [boot-time] Marko Hoyer
2025-01-10 23:15     ` [boot-time] Rob Landley
2025-01-11  8:40       ` [boot-time] Marko Hoyer
2025-01-11 17:56         ` [boot-time] Rob Landley
2025-01-11 18:57           ` [boot-time] Bird, Tim
2025-01-12  1:03             ` [boot-time] Rob Landley
2025-01-12 10:11               ` [boot-time] Marko Hoyer
2025-01-12 13:39                 ` [boot-time] Francesco Valla
2025-01-12 18:35                 ` [boot-time] Rob Landley
