public inbox for linux-fsdevel@vger.kernel.org
* Re: [RFC PATCH] initramfs: correctly handle space in path on cpio list generation
       [not found] <20260209153800.28228-1-ansuelsmth@gmail.com>
@ 2026-02-10 11:34 ` David Disseldorp
  2026-02-10 17:37   ` Christian Marangi
  0 siblings, 1 reply; 6+ messages in thread
From: David Disseldorp @ 2026-02-10 11:34 UTC (permalink / raw)
  To: Christian Marangi
  Cc: Nathan Chancellor, Nicolas Schier, Dmitry Safonov, linux-kbuild,
	linux-kernel, linux-fsdevel@vger.kernel.org

[cc'ing fsdevel]

On Mon,  9 Feb 2026 16:37:58 +0100, Christian Marangi wrote:

> The current gen_initramfs.sh and gen_init_cpio.c tools don't correctly
> handle paths or filenames containing spaces. Although highly discouraged,

"highly discouraged" isn't really appropriate here; the kernel generally
doesn't care whether or not a filename carries whitespace.
The limitation here is specifically the gen_init_cpio manifest format,
which is strictly space-separated.
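
To illustrate the limitation, here is a minimal sketch of how a space-separated entry is parsed with sscanf(). The function name sketch_parse_dir and the 255-byte bound are assumptions made for this example; the real tool uses PATH_MAX and its own helpers.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse the arguments of a "dir" entry,
 * <name> <mode> <uid> <gid>.
 * Returns the number of fields sscanf() converted; 4 means success.
 * name must have room for 256 bytes (255 stands in for PATH_MAX here).
 */
static int sketch_parse_dir(const char *args, char *name,
			    unsigned int *mode, int *uid, int *gid)
{
	/* %s stops at the first whitespace, so a space ends the name early. */
	return sscanf(args, "%255s %o %d %d", name, mode, uid, gid);
}
```

With "/dev 0755 0 0" all four fields convert. With "/my dir 0755 0 0" only the name converts (as "/my"), because "dir" is then read where the octal mode is expected.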

> Linux also supports filenames and paths with whitespace, and currently this
> produces errors when generating and parsing the cpio_list file, as the
> pattern no longer matches the expected variable order (with gid or mode
> parsed as a string).
> 
> This was noticed when creating an initramfs that includes the ALSA test
> files and configuration, which have whitespace in both some .conf files and
> even some symbolic links.
> 
> Example error:

The error messages don't really add any value here.
<snip>

> To correctly handle this problem, rework gen_initramfs.sh and
> gen_init_cpio.c to guard all paths with "" so that any kind of
> whitespace in a filename/path is handled.
> 
> The default_cpio_list is also updated to follow this new pattern.
> 
> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
> ---
>  usr/default_cpio_list |  6 +++---
>  usr/gen_init_cpio.c   | 10 +++++-----
>  usr/gen_initramfs.sh  | 27 +++++++++++++++++++--------
>  3 files changed, 27 insertions(+), 16 deletions(-)
> 
> diff --git a/usr/default_cpio_list b/usr/default_cpio_list
> index 37b3864066e8..d4a66b4aa7f7 100644
> --- a/usr/default_cpio_list
> +++ b/usr/default_cpio_list
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  # This is a very simple, default initramfs
>  
> -dir /dev 0755 0 0
> -nod /dev/console 0600 0 0 c 5 1
> -dir /root 0700 0 0
> +dir "/dev" 0755 0 0
> +nod "/dev/console" 0600 0 0 c 5 1
> +dir "/root" 0700 0 0
> diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
> index b7296edc6626..ca5950998841 100644
> --- a/usr/gen_init_cpio.c
> +++ b/usr/gen_init_cpio.c
> @@ -166,7 +166,7 @@ static int cpio_mkslink_line(const char *line)
>  	int gid;
>  	int rc = -1;
>  
> -	if (5 != sscanf(line, "%" str(PATH_MAX) "s %" str(PATH_MAX) "s %o %d %d", name, target, &mode, &uid, &gid)) {
> +	if (5 != sscanf(line, "\"%" str(PATH_MAX) "[^\"]\" \"%" str(PATH_MAX) "[^\"]\" %o %d %d", name, target, &mode, &uid, &gid)) {

This breaks parsing of existing manifest files, so is unacceptable
IMO. If we really want to go down the route of having gen_init_cpio
support space-separated paths, then perhaps a new --field-separator
parameter might make sense. For your specific workload it seems that
simply using an external cpio archiver with space support (e.g. GNU
cpio --null) would make sense. Did you consider going down that
path?
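
The compatibility concern can be sketched as follows. The format string is the one from the quoted hunk, with PATH_MAX replaced by 255 to keep the example self-contained. The function name sketch_parse_slink_quoted is made up for this illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse slink arguments with the quoted format,
 * "<name>" "<target>" <mode> <uid> <gid>.
 * Returns the number of converted fields; 5 means success.
 * name and target each need room for 256 bytes.
 */
static int sketch_parse_slink_quoted(const char *args, char *name, char *target,
				     unsigned int *mode, int *uid, int *gid)
{
	/* The leading literal quote must match before anything is converted. */
	return sscanf(args, "\"%255[^\"]\" \"%255[^\"]\" %o %d %d",
		      name, target, mode, uid, gid);
}
```

A quoted line converts all five fields, spaces included. An existing unquoted line such as "/bin/sh /bin/busybox 0777 0 0" converts nothing, since its first character is not a double quote.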

Thanks, David


* Re: [RFC PATCH] initramfs: correctly handle space in path on cpio list generation
  2026-02-10 11:34 ` [RFC PATCH] initramfs: correctly handle space in path on cpio list generation David Disseldorp
@ 2026-02-10 17:37   ` Christian Marangi
  2026-02-11  0:43     ` David Disseldorp
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Marangi @ 2026-02-10 17:37 UTC (permalink / raw)
  To: David Disseldorp
  Cc: Nathan Chancellor, Nicolas Schier, Dmitry Safonov, linux-kbuild,
	linux-kernel, linux-fsdevel@vger.kernel.org

On Tue, Feb 10, 2026 at 10:34:31PM +1100, David Disseldorp wrote:
> [cc'ing fsdevel]
> 
> On Mon,  9 Feb 2026 16:37:58 +0100, Christian Marangi wrote:
> 
> > The current gen_initramfs.sh and gen_init_cpio.c tools don't correctly
> > handle paths or filenames containing spaces. Although highly discouraged,
> 
> "highly discouraged" isn't really appropriate here; the kernel generally
> doesn't care whether or not a filename carries whitespace.
> The limitation here is specifically the gen_init_cpio manifest format,
> which is strictly space-separated.
>

Yes, but the space-separated format was chosen only for simplicity of parsing
in the .c tool; it is not a requirement of the actual cpio blob that is then
generated. The problem is in the intermediate file, and I feel it should be
fixed or handled there.
 
> > Linux also supports filenames and paths with whitespace, and currently this
> > produces errors when generating and parsing the cpio_list file, as the
> > pattern no longer matches the expected variable order (with gid or mode
> > parsed as a string).
> > 
> > This was noticed when creating an initramfs that includes the ALSA test
> > files and configuration, which have whitespace in both some .conf files and
> > even some symbolic links.
> > 
> > Example error:
> 
> The error messages don't really add any value here.
> <snip>
> 

They were really just there to show what happens when files with whitespace are
used. The shell is not very chatty about this, so these errors are really just
the mode, gid and other values being mis-parsed because of the whitespace in the
filename.

> > To correctly handle this problem, rework gen_initramfs.sh and
> > gen_init_cpio.c to guard all paths with "" so that any kind of
> > whitespace in a filename/path is handled.
> > 
> > The default_cpio_list is also updated to follow this new pattern.
> > 
> > Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
> > ---
> >  usr/default_cpio_list |  6 +++---
> >  usr/gen_init_cpio.c   | 10 +++++-----
> >  usr/gen_initramfs.sh  | 27 +++++++++++++++++++--------
> >  3 files changed, 27 insertions(+), 16 deletions(-)
> > 
> > diff --git a/usr/default_cpio_list b/usr/default_cpio_list
> > index 37b3864066e8..d4a66b4aa7f7 100644
> > --- a/usr/default_cpio_list
> > +++ b/usr/default_cpio_list
> > @@ -1,6 +1,6 @@
> >  # SPDX-License-Identifier: GPL-2.0-only
> >  # This is a very simple, default initramfs
> >  
> > -dir /dev 0755 0 0
> > -nod /dev/console 0600 0 0 c 5 1
> > -dir /root 0700 0 0
> > +dir "/dev" 0755 0 0
> > +nod "/dev/console" 0600 0 0 c 5 1
> > +dir "/root" 0700 0 0
> > diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
> > index b7296edc6626..ca5950998841 100644
> > --- a/usr/gen_init_cpio.c
> > +++ b/usr/gen_init_cpio.c
> > @@ -166,7 +166,7 @@ static int cpio_mkslink_line(const char *line)
> >  	int gid;
> >  	int rc = -1;
> >  
> > -	if (5 != sscanf(line, "%" str(PATH_MAX) "s %" str(PATH_MAX) "s %o %d %d", name, target, &mode, &uid, &gid)) {
> > +	if (5 != sscanf(line, "\"%" str(PATH_MAX) "[^\"]\" \"%" str(PATH_MAX) "[^\"]\" %o %d %d", name, target, &mode, &uid, &gid)) {
> 
> This breaks parsing of existing manifest files, so is unacceptable
> IMO. If we really want to go down the route of having gen_init_cpio
> support space-separated paths, then perhaps a new --field-separator
> parameter might make sense. For your specific workload it seems that
> simply using an external cpio archiver with space support (e.g. GNU
> cpio --null) would make sense. Did you consider going down that
> path?
> 

This is mostly why this is posted as an RFC. I honestly want to fix this in the
Linux tool instead of using external tools.

So is there an actual use of manually passing the cpio list instead of
generating one with the script? (just asking not saying that there isn't one)

One case I have (the scenario here is OpenWrt) is when a base cpio_list is
provided and then stuff is appended to it.

In such a case, yes, there is a problem since the format changed.

My solution to this would be to introduce new types that use the new pattern.
This way we can keep support for the old list and still handle files with
whitespace.

An idea might be to write the file type in capital letters to differentiate it
from the old one.

Something like 

FILE "path" "location" ...
SLINK "name" "target" ...
NODE ...

What do you think?

The --field-separator option might also work, but it might complicate the
.c tool, as a more "manual" tokenizer would be needed than the simple
implementation currently present.
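
As a rough idea of what such a manual tokenizer could look like, here is a sketch that splits one line in place on a caller-chosen separator byte. Both the function name sketch_split_fields and the idea that the separator arrives as a single byte are assumptions; nothing like this exists in gen_init_cpio.c today.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical tokenizer: split line in place on the single-byte
 * separator sep (which must not be '\0').
 * Stores up to max field pointers in fields and returns the number of
 * fields found, or 0 if sep is '\0' or there are more than max fields.
 * Empty fields are preserved, unlike with strtok().
 */
static size_t sketch_split_fields(char *line, char sep, char **fields, size_t max)
{
	size_t n = 0;
	char *p = line;

	if (sep == '\0')
		return 0;
	for (;;) {
		char *end;

		if (n == max)
			return 0;	/* too many fields for the caller */
		fields[n++] = p;
		end = strchr(p, sep);
		if (!end)
			return n;	/* last field reached */
		*end = '\0';		/* terminate this field */
		p = end + 1;
	}
}
```

With '|' as the separator, a line like "slink|/bin/my sh|/bin/busybox|0777|0|0" yields six fields and the space in the name survives. The numeric fields would still need converting, for example with strtoul().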

I'm open to both solutions. Let's just agree on one of the two.

-- 
	Ansuel


* Re: [RFC PATCH] initramfs: correctly handle space in path on cpio list generation
  2026-02-10 17:37   ` Christian Marangi
@ 2026-02-11  0:43     ` David Disseldorp
  2026-02-11  0:58       ` Christian Marangi
  0 siblings, 1 reply; 6+ messages in thread
From: David Disseldorp @ 2026-02-11  0:43 UTC (permalink / raw)
  To: Christian Marangi
  Cc: Nathan Chancellor, Nicolas Schier, Dmitry Safonov, linux-kbuild,
	linux-kernel, linux-fsdevel@vger.kernel.org

On Tue, 10 Feb 2026 18:37:44 +0100, Christian Marangi wrote:
...
> > > diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
> > > index b7296edc6626..ca5950998841 100644
> > > --- a/usr/gen_init_cpio.c
> > > +++ b/usr/gen_init_cpio.c
> > > @@ -166,7 +166,7 @@ static int cpio_mkslink_line(const char *line)
> > >  	int gid;
> > >  	int rc = -1;
> > >  
> > > -	if (5 != sscanf(line, "%" str(PATH_MAX) "s %" str(PATH_MAX) "s %o %d %d", name, target, &mode, &uid, &gid)) {
> > > +	if (5 != sscanf(line, "\"%" str(PATH_MAX) "[^\"]\" \"%" str(PATH_MAX) "[^\"]\" %o %d %d", name, target, &mode, &uid, &gid)) {  
> > 
> > This breaks parsing of existing manifest files, so is unacceptable
> > IMO. If we really want to go down the route of having gen_init_cpio
> > support space-separated paths, then perhaps a new --field-separator
> > parameter might make sense. For your specific workload it seems that
> > simply using an external cpio archiver with space support (e.g. GNU
> > cpio --null) would make sense. Did you consider going down that
> > path?
> >   
> 
> This is mostly why this is posted as an RFC. I honestly want to fix this in the
> Linux tool instead of using external tools.
> 
> So is there an actual use of manually passing the cpio list instead of
> generating one with the script? (just asking not saying that there isn't one)

Absolutely. As a simple example, consider an unprivileged user wishing
to add a device node to their initramfs image. A manifest entry (as
opposed to staging area mknod=EPERM) is ideal for this.

> One case I have (the scenario here is OpenWrt) is when a base cpio_list is
> provided and then stuff is appended to it.
> 
> In such a case, yes, there is a problem since the format changed.
> 
> My solution to this would be to introduce new types that use the new pattern.
> This way we can keep support for the old list and still handle files with
> whitespace.
> 
> An idea might be to write the file type in capital letters to differentiate it
> from the old one.
> 
> Something like 
> 
> FILE "path" "location" ...
> SLINK "name" "target" ...
> NODE ...
> 
> What do you think?

Introducing a new type to handle space-containing filenames isn't a bad
idea, but using capital letters to signify the API change is confusing.

> The --field-separator option might also work, but it might complicate the
> .c tool, as a more "manual" tokenizer would be needed than the simple
> implementation currently present.

What happens when someone wants support for filenames containing spaces
and quotes?
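
The quoting problem can be shown with the same scanset approach as in the patch. This is only a sketch: sketch_parse_quoted_file is a hypothetical name and the 255-byte bound stands in for PATH_MAX.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse "<name>" <mode> using a [^"] scanset.
 * Returns the number of converted fields; 2 means success.
 * name needs room for 256 bytes.
 */
static int sketch_parse_quoted_file(const char *args, char *name,
				    unsigned int *mode)
{
	/* The scanset ends at the first double quote, embedded or not. */
	return sscanf(args, "\"%255[^\"]\" %o", name, mode);
}
```

For a name that itself contains a double quote, the scanset stops at the embedded quote, so the name is truncated and the mode conversion fails. Handling that would need an escaping rule, which sscanf() alone cannot express.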

> I'm open to both solutions. Let's just agree on one of the two.

I don't think any of the options will be particularly simple, but
nul-byte delimited field support might be the most straightforward.
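
A minimal sketch of NUL-delimited field handling, assuming the caller already holds exactly one record in memory together with its length. How records themselves would be framed is left open here, and sketch_split_nul is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: collect pointers to the NUL-terminated fields
 * stored in buf[0..len).
 * Returns the number of fields, or 0 if the last field is not
 * NUL-terminated or there are more than max fields.
 */
static size_t sketch_split_nul(const char *buf, size_t len,
			       const char **fields, size_t max)
{
	size_t n = 0, off = 0;

	while (off < len) {
		/* Find the terminator of the field starting at off. */
		const char *end = memchr(buf + off, '\0', len - off);

		if (!end || n == max)
			return 0;
		fields[n++] = buf + off;
		off = (size_t)(end - buf) + 1;
	}
	return n;
}
```

Because NUL cannot appear in a path, no quoting or escaping is needed and each field can be used as a C string directly. The trade-off is that the manifest is no longer convenient to edit as plain text.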

Thanks, David


* Re: [RFC PATCH] initramfs: correctly handle space in path on cpio list generation
  2026-02-11  0:43     ` David Disseldorp
@ 2026-02-11  0:58       ` Christian Marangi
  2026-02-11  2:40         ` David Disseldorp
  0 siblings, 1 reply; 6+ messages in thread
From: Christian Marangi @ 2026-02-11  0:58 UTC (permalink / raw)
  To: David Disseldorp
  Cc: Nathan Chancellor, Nicolas Schier, Dmitry Safonov, linux-kbuild,
	linux-kernel, linux-fsdevel@vger.kernel.org

On Wed, Feb 11, 2026 at 11:43:10AM +1100, David Disseldorp wrote:
> On Tue, 10 Feb 2026 18:37:44 +0100, Christian Marangi wrote:
> ...
> > > > diff --git a/usr/gen_init_cpio.c b/usr/gen_init_cpio.c
> > > > index b7296edc6626..ca5950998841 100644
> > > > --- a/usr/gen_init_cpio.c
> > > > +++ b/usr/gen_init_cpio.c
> > > > @@ -166,7 +166,7 @@ static int cpio_mkslink_line(const char *line)
> > > >  	int gid;
> > > >  	int rc = -1;
> > > >  
> > > > -	if (5 != sscanf(line, "%" str(PATH_MAX) "s %" str(PATH_MAX) "s %o %d %d", name, target, &mode, &uid, &gid)) {
> > > > +	if (5 != sscanf(line, "\"%" str(PATH_MAX) "[^\"]\" \"%" str(PATH_MAX) "[^\"]\" %o %d %d", name, target, &mode, &uid, &gid)) {  
> > > 
> > > This breaks parsing of existing manifest files, so is unacceptable
> > > IMO. If we really want to go down the route of having gen_init_cpio
> > > support space-separated paths, then perhaps a new --field-separator
> > > parameter might make sense. For your specific workload it seems that
> > > simply using an external cpio archiver with space support (e.g. GNU
> > > cpio --null) would make sense. Did you consider going down that
> > > path?
> > >   
> > 
> > This is mostly why this is posted as an RFC. I honestly want to fix this in the
> > Linux tool instead of using external tools.
> > 
> > So is there an actual use of manually passing the cpio list instead of
> > generating one with the script? (just asking not saying that there isn't one)
> 
> Absolutely. As a simple example, consider an unprivileged user wishing
> to add a device node to their initramfs image. A manifest entry (as
> opposed to staging area mknod=EPERM) is ideal for this.
> 
> > One case I have (the scenario here is OpenWrt) is when a base cpio_list is
> > provided and then stuff is appended to it.
> > 
> > In such a case, yes, there is a problem since the format changed.
> > 
> > My solution to this would be to introduce new types that use the new pattern.
> > This way we can keep support for the old list and still handle files with
> > whitespace.
> > 
> > An idea might be to write the file type in capital letters to differentiate it
> > from the old one.
> > 
> > Something like 
> > 
> > FILE "path" "location" ...
> > SLINK "name" "target" ...
> > NODE ...
> > 
> > What do you think?
> 
> Introducing a new type to handle space-containing filenames isn't a bad
> idea, but using capital letters to signify the API change is confusing.
>

The problem with a new type is that other tools might not support it, but I have
no idea whether that would really be relevant. After all, it's just an
intermediate file used to generate the final cpio.
 
> > The --field-separator option might also work, but it might complicate the
> > .c tool, as a more "manual" tokenizer would be needed than the simple
> > implementation currently present.
> 
> What happens when someone wants support for filenames containing spaces
> and quotes?
> 

I mean... it's a less common case, where filenames start to contain almost
invalid characters, but yes, it's a valid point.

> > I'm open to both solutions. Let's just agree on one of the two.
> 
> I don't think any of the options will be particularly simple, but
> nul-byte delimited field support might be the most straightforward.
> 

Yes, that was the initial idea, but it was quickly scrapped as major work is
needed in the .c tool to handle NUL-separated entries.

Can you by chance point me to how the GNU tool works with --null?

Does it also create a cpio_list file with NUL-separated entries?


-- 
	Ansuel


* Re: [RFC PATCH] initramfs: correctly handle space in path on cpio list generation
  2026-02-11  0:58       ` Christian Marangi
@ 2026-02-11  2:40         ` David Disseldorp
  2026-02-15  2:54           ` Christian Marangi
  0 siblings, 1 reply; 6+ messages in thread
From: David Disseldorp @ 2026-02-11  2:40 UTC (permalink / raw)
  To: Christian Marangi
  Cc: Nathan Chancellor, Nicolas Schier, Dmitry Safonov, linux-kbuild,
	linux-kernel, linux-fsdevel@vger.kernel.org

On Wed, 11 Feb 2026 01:58:27 +0100, Christian Marangi wrote:

> > What happens when someone wants support for filenames containing spaces
> > and quotes?
> >   
> 
> I mean... it's a less common case, where filenames start to contain almost
> invalid characters, but yes, it's a valid point.
> 
> > > I'm open to both solutions. Let's just agree on one of the two.
> > 
> > I don't think any of the options will be particularly simple, but
> > nul-byte delimited field support might be the most straightforward.
> >   
> 
> Yes, that was the initial idea, but it was quickly scrapped as major work is
> needed in the .c tool to handle NUL-separated entries.
> 
> Can you by chance point me to how the GNU tool works with --null?
> 
> Does it also create a cpio_list file with NUL-separated entries?

E.g. dracut uses the GNU cpio --null alongside find -print0:

  cd "$initdir"
  find . -print0 | sort -z \
      | cpio ${CPIO_REPRODUCIBLE:+--reproducible} --null ${cpio_owner:+-R "$cpio_owner"} -H newc -o --quiet \
      | $compress >> "${DRACUT_TMPDIR}/initramfs.img"
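
On the consuming side of such a pipeline, the input is a stream of paths, each terminated by a NUL byte. A sketch of reading one path at a time in C, with sketch_read_nul_path as a hypothetical helper name (this is not how GNU cpio is implemented, only an illustration of the input format):

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one NUL-terminated path from fp into buf
 * (capacity cap bytes, terminator included).
 * Returns the path length, or -1 on end of input, on an unterminated
 * trailing path, or if the path does not fit in buf.
 */
static long sketch_read_nul_path(FILE *fp, char *buf, size_t cap)
{
	size_t n = 0;
	int c;

	while ((c = fgetc(fp)) != EOF) {
		if (n == cap)
			return -1;	/* path too long for buf */
		buf[n++] = (char)c;
		if (c == '\0')
			return (long)(n - 1);
	}
	return -1;
}
```

Since only NUL ends a path, names containing spaces or even newlines pass through unchanged, which is why find -print0 is paired with cpio --null.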


* Re: [RFC PATCH] initramfs: correctly handle space in path on cpio list generation
  2026-02-11  2:40         ` David Disseldorp
@ 2026-02-15  2:54           ` Christian Marangi
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Marangi @ 2026-02-15  2:54 UTC (permalink / raw)
  To: David Disseldorp
  Cc: Nathan Chancellor, Nicolas Schier, Dmitry Safonov, linux-kbuild,
	linux-kernel, linux-fsdevel@vger.kernel.org

On Wed, Feb 11, 2026 at 01:40:25PM +1100, David Disseldorp wrote:
> On Wed, 11 Feb 2026 01:58:27 +0100, Christian Marangi wrote:
> 
> > > What happens when someone wants support for filenames containing spaces
> > > and quotes?
> > >   
> > 
> > I mean... it's a less common case, where filenames start to contain almost
> > invalid characters, but yes, it's a valid point.
> > 
> > > > I'm open to both solutions. Let's just agree on one of the two.
> > > 
> > > I don't think any of the options will be particularly simple, but
> > > nul-byte delimited field support might be the most straightforward.
> > >   
> > 
> > Yes, that was the initial idea, but it was quickly scrapped as major work is
> > needed in the .c tool to handle NUL-separated entries.
> > 
> > Can you by chance point me to how the GNU tool works with --null?
> > 
> > Does it also create a cpio_list file with NUL-separated entries?
> 
> E.g. dracut uses the GNU cpio --null alongside find -print0:
> 
>   cd "$initdir"
>   find . -print0 | sort -z \
>       | cpio ${CPIO_REPRODUCIBLE:+--reproducible} --null ${cpio_owner:+-R "$cpio_owner"} -H newc -o --quiet \
>       | $compress >> "${DRACUT_TMPDIR}/initramfs.img"

OK, I finished developing this, and while testing it I had an interesting idea...
What if the delimiter is auto-detected by checking the very next char after the
file type?

This way we can support a number of different formats without having to update
any file...

The .c file had to be reworked for the tokenizer conversion anyway, so this
autodetection feature literally just disables the format validation of the
string and makes the delimiter dynamic, based on the next char.

For example, in one file we can have this kind of thing without having to
support any additional arg.

nod /dev/tty0 660 0 0 c 4 0
nod /dev/tty1 660 0 0 c 4 1
nod /dev/random 666 0 0 c 1 8
nod /dev/urandom 666 0 0 c 1 9
# dir /dev/pts 755 0 0

nod|/dev/pts|755|0|0|c|0|9

dir\0/bin\0755\01000\01000

(the \0 are NUL chars; they are written out here to show that in the actual
file they are zero bytes)

I wonder if this might be interesting, or whether I should just stick to the
current idea of adding a -0 option and enforcing the NUL delimiter.
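
The detection step described above could be sketched like this. It assumes the entry type is a run of lowercase ASCII letters and that the caller knows the record length (needed once NUL is a possible delimiter). The name sketch_detect_delim is hypothetical and this is not the code from the reworked tool.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the byte that follows the leading run of
 * lowercase letters (the entry type) in line[0..len), as a value in
 * 0..255. Returns -1 if there is no type or nothing follows it.
 */
static int sketch_detect_delim(const char *line, size_t len)
{
	size_t i = 0;

	/* Skip over the entry type, e.g. "nod" or "dir". */
	while (i < len && line[i] >= 'a' && line[i] <= 'z')
		i++;
	if (i == 0 || i == len)
		return -1;
	return (unsigned char)line[i];
}
```

The returned byte would then be handed to the tokenizer as the separator for the rest of that record. One consequence of this scheme is that a misspelled type is no longer caught by format validation, since any following byte is accepted as a delimiter.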



-- 
	Ansuel

